Metasearch information fusion using linear programming

Gholam R. Amin; Ali Emrouznejad; Hamid Sadeghi

RAIRO - Operations Research (2012)

  • Volume: 46, Issue: 4, pages 289-303
  • ISSN: 0399-0559

Abstract

For a specific query, merging the results returned by multiple search engines, in the form of a metasearch aggregation, can significantly improve the quality of the relevant documents retrieved. This paper suggests a minimax linear programming (LP) formulation for fusing the results of multiple search engines. The paper proposes a weighting method to include the importance weights of the underlying search engines. The approach has two phases: in the first phase, a new method for computing the importance weights of the search engines is introduced; in the second phase, a minimax LP model for finding relevant search engine results is formulated. To evaluate the retrieval effectiveness of the suggested method, the 50 queries of the 2002 TREC Web track were submitted to three popular Web search engines: Ask, Bing and Google. The returned results were aggregated using two existing approaches, three high-performance commercial Web metasearch engines and the proposed technique. The effectiveness of the generated lists was measured using TREC-Style Average Precision (TSAP). The findings demonstrate that the suggested model improved the quality of merging considerably.
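
The abstract names TSAP as the evaluation measure but does not define it. As a minimal sketch, assuming the definition commonly used in the metasearch literature (TSAP@N is the sum of r_i over the top N ranks divided by N, where r_i = 1/i if the result at rank i is relevant and 0 otherwise), the measure could be computed as follows. The function name `tsap_at_n` and the sample judgments are hypothetical and are not taken from the paper.

```python
def tsap_at_n(relevance, n):
    """Assumed TSAP@N: (sum of r_i for i = 1..N) / N,
    with r_i = 1/i if the result at rank i is relevant, else 0."""
    if n <= 0:
        raise ValueError("n must be positive")
    top = relevance[:n]  # only the first N ranked results count
    # Ranks are 1-based; irrelevant results contribute 0.
    gained = sum(1.0 / rank for rank, is_rel in enumerate(top, start=1) if is_rel)
    return gained / n


# Hypothetical relevance judgments for a merged list of five results.
judgments = [True, False, True, False, False]
print(round(tsap_at_n(judgments, 5), 4))  # 0.2667
```

Under this assumed definition, a relevant result near the top of the merged list contributes more than one further down, so the measure rewards fusion methods that rank relevant documents early.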

How to cite

Amin, Gholam R., Emrouznejad, Ali, and Sadeghi, Hamid. "Metasearch information fusion using linear programming." RAIRO - Operations Research 46.4 (2012): 289-303. <http://eudml.org/doc/222492>.

@article{Amin2012,
abstract = {For a specific query, merging the results returned by multiple search engines, in the form of a metasearch aggregation, can significantly improve the quality of the relevant documents retrieved. This paper suggests a minimax linear programming (LP) formulation for fusing the results of multiple search engines. The paper proposes a weighting method to include the importance weights of the underlying search engines. The approach has two phases: in the first phase, a new method for computing the importance weights of the search engines is introduced; in the second phase, a minimax LP model for finding relevant search engine results is formulated. To evaluate the retrieval effectiveness of the suggested method, the 50 queries of the 2002 TREC Web track were submitted to three popular Web search engines: Ask, Bing and Google. The returned results were aggregated using two existing approaches, three high-performance commercial Web metasearch engines and the proposed technique. The effectiveness of the generated lists was measured using TREC-Style Average Precision (TSAP). The findings demonstrate that the suggested model improved the quality of merging considerably.},
author = {Amin, Gholam R. and Emrouznejad, Ali and Sadeghi, Hamid},
journal = {RAIRO - Operations Research},
keywords = {linear programming; search engine; metasearch; information fusion; information retrieval},
language = {eng},
month = {11},
number = {4},
pages = {289--303},
publisher = {EDP Sciences},
title = {Metasearch information fusion using linear programming},
url = {http://eudml.org/doc/222492},
volume = {46},
year = {2012},
}

TY - JOUR
AU - Amin, Gholam R.
AU - Emrouznejad, Ali
AU - Sadeghi, Hamid
TI - Metasearch information fusion using linear programming
JO - RAIRO - Operations Research
DA - 2012/11//
PB - EDP Sciences
VL - 46
IS - 4
SP - 289
EP - 303
AB - For a specific query, merging the results returned by multiple search engines, in the form of a metasearch aggregation, can significantly improve the quality of the relevant documents retrieved. This paper suggests a minimax linear programming (LP) formulation for fusing the results of multiple search engines. The paper proposes a weighting method to include the importance weights of the underlying search engines. The approach has two phases: in the first phase, a new method for computing the importance weights of the search engines is introduced; in the second phase, a minimax LP model for finding relevant search engine results is formulated. To evaluate the retrieval effectiveness of the suggested method, the 50 queries of the 2002 TREC Web track were submitted to three popular Web search engines: Ask, Bing and Google. The returned results were aggregated using two existing approaches, three high-performance commercial Web metasearch engines and the proposed technique. The effectiveness of the generated lists was measured using TREC-Style Average Precision (TSAP). The findings demonstrate that the suggested model improved the quality of merging considerably.
LA - eng
KW - linear programming; search engine; metasearch; information fusion; information retrieval
UR - http://eudml.org/doc/222492
ER -
