Analysis and Evaluation of Measures of Retrieval Performance
Award: NSF IIS-0534482
PI: Javed A. Aslam
Institution: Northeastern University
Summary
Search engines and other information retrieval technologies are
critical in the digital age. The goal of this project is to develop
novel paradigms for analyzing and efficiently evaluating retrieval
performance. Two novel frameworks are proposed: (1) an
information-theoretic framework within which one can quantifiably
assess what various measures of retrieval performance are actually
measuring, and (2) a statistical framework within which one can
efficiently estimate these measures. The former provides a
theoretical underpinning for retrieval evaluation and analysis; the
latter provides a practical methodology for efficiently evaluating
search engines on a large scale. Each is intended to foster and
enable research leading to better search engines and other retrieval
technologies.
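
To make the statistical estimation framework concrete, here is a
minimal Python sketch of estimating average precision from a uniform
random sample of relevance judgments. It is an illustrative estimator
only, not the statAP or infAP methods developed in this project, and
the function name and document ids are hypothetical.

    # Illustrative sketch: estimate average precision (AP) when only a
    # uniform random sample of documents has been judged.  This is NOT the
    # project's statAP/infAP estimators; names and ids are hypothetical.
    def estimate_average_precision(ranking, judgments):
        """ranking: document ids in ranked order (best first).
        judgments: dict mapping *judged* document ids to 0/1 relevance;
        unjudged documents are simply absent."""
        contributions = []
        judged_above = 0    # judged documents at earlier ranks
        relevant_above = 0  # judged relevant documents at earlier ranks
        for k, doc in enumerate(ranking, start=1):
            rel = judgments.get(doc)
            if rel == 1:
                # Precision at rank k: this document plus the estimated
                # number of relevant documents among the k-1 ranked above
                # it, extrapolated from the judged documents seen so far.
                if judged_above > 0:
                    est_prec = (1 + (k - 1) * relevant_above / judged_above) / k
                else:
                    est_prec = 1.0 / k
                contributions.append(est_prec)
            if rel is not None:
                judged_above += 1
                relevant_above += rel
        # AP is the mean of precision-at-rank over the relevant documents;
        # under uniform sampling, the mean over the judged relevant
        # documents estimates that quantity.
        return sum(contributions) / len(contributions) if contributions else 0.0

    # Example: a six-document ranking with only three documents judged.
    ranking = ["d3", "d7", "d1", "d9", "d2", "d5"]
    judged = {"d3": 1, "d1": 0, "d2": 1}
    print(estimate_average_precision(ranking, judged))  # 0.8

When every document in the list is judged (and all relevant documents
appear in it), the estimate reduces to the usual average precision;
sampling-based estimators of this kind trade judging effort for a
quantifiable amount of estimation error.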
Current Personnel
Former Personnel
- Evangelos Kanoulas (now a post-doc at the University of Sheffield, UK)
- Emine Yilmaz (now at Microsoft Research, Cambridge, UK)
- Alan Feuer (founded Blossom Software)
- Olen Zubaryeva (pursuing a PhD in Switzerland)
- Carlos Rei
Publications
- Implementing and Evaluating Phrasal Query Suggestions for Proximity Search. Information Systems, 34(8):711-723, December 2009.
- Empirical Justification of the Gain and Discount Function for nDCG. In Proceedings of the Eighteenth ACM Conference on Information and Knowledge Management (CIKM), pages 611-620. ACM Press, November 2009.
- Modeling the Score Distributions of Relevant and Non-relevant Documents. In Proceedings of the 3rd International Conference on Theory in Information Retrieval (ICTIR), pages 152-163. Lecture Notes in Computer Science, Vol. 5766. Springer, September 2009.
- Document Selection Methodologies for Efficient and Effective Learning-to-rank. In Proceedings of the 32nd Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, pages 468-475. ACM Press, July 2009.
- If I Had a Million Queries. In Advances in Information Retrieval: 31st European Conference on IR Research (ECIR), pages 288-300. Lecture Notes in Computer Science, Vol. 5478. Springer-Verlag, April 2009.
- Million Query Track 2007 Overview. In The Sixteenth Text REtrieval Conference Proceedings (TREC 2007), pages 85-104. National Institute of Standards and Technology, December 2008. NIST Special Publication SP 500-274.
- The Hedge Algorithm for Metasearch at TREC 2007. In The Sixteenth Text REtrieval Conference Proceedings (TREC 2007). National Institute of Standards and Technology, December 2008. NIST Special Publication SP 500-274.
- Estimating Average Precision When Judgments are Incomplete. Knowledge and Information Systems, 16(2):173-211, August 2008.
- Evaluation Over Thousands of Queries. In Proceedings of the 31st Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, pages 651-658. ACM Press, July 2008.
- A New Rank Correlation Coefficient for Information Retrieval. In Proceedings of the 31st Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, pages 587-594. ACM Press, July 2008.
- A Simple and Efficient Sampling Method for Estimating AP and NDCG. In Proceedings of the 31st Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, pages 603-610. ACM Press, July 2008.
- Empirical Justification of the Discount Function for nDCG [abstract]. In Proceedings of the SIGIR 2008 Workshop: Beyond Binary Relevance: Preferences, Diversity and Set-Level Judgments, page 6. July 2008.
- Inferring Document Relevance from Incomplete Information. In Proceedings of the Sixteenth ACM Conference on Information and Knowledge Management (CIKM), pages 633-642. November 2007.
- Evaluation of Phrasal Query Suggestions. In Proceedings of the Sixteenth ACM Conference on Information and Knowledge Management (CIKM), pages 841-848. November 2007.
- The Hedge Algorithm for Metasearch at TREC 2006. In The Fifteenth Text REtrieval Conference Proceedings (TREC 2006). National Institute of Standards and Technology, September 2007. NIST Special Publication SP 500-272.
- Query Hardness Estimation Using Jensen-Shannon Divergence Among Multiple Scoring Functions. In Advances in Information Retrieval: 29th European Conference on IR Research (ECIR 2007), pages 198-209. Lecture Notes in Computer Science, Vol. 4425. Springer-Verlag, 2007.
- Estimating Average Precision with Incomplete and Imperfect Judgments. In Proceedings of the Fifteenth ACM International Conference on Information and Knowledge Management (CIKM), pages 102-111. ACM Press, November 2006.
- A Statistical Method for System Evaluation Using Incomplete Judgments. In Proceedings of the 29th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, pages 541-548. ACM Press, August 2006.
- Inferring Document Relevance via Average Precision. In Proceedings of the 29th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, pages 601-602. ACM Press, August 2006.
statAP at TREC
infAP at TREC
Acknowledgment and Disclaimer
This material is based upon work supported by the National Science
Foundation under Grant No. IIS-0534482. Any opinions, findings, and
conclusions or recommendations expressed in this material are those of
the author(s) and do not necessarily reflect the views of the National
Science Foundation (NSF).