There will be five questions on the Final. They are described below.
Question 1. Google: Be sure you understand inverted (word) indexes, the basics of PageRank, and how the two work together.
Question 2. Recall and Precision: Be sure you understand Section 3.2.1 on Recall and Precision. In particular, practice drawing sets consisting of a number of items such as 1, 2, 3, etc. and then drawing the appropriate subset boundaries around them as in Figure 3.1.
Practice computing Recall and Precision (page 75) until you can do it correctly every time. Be able to explain what each is and how they differ in what they represent.
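For practice, the two measures can be checked with a few lines of code. This is a minimal sketch using the standard set-based definitions (precision = relevant retrieved / all retrieved; recall = relevant retrieved / all relevant). The function name and the numbered items are hypothetical and are not taken from the textbook.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for two collections of item IDs."""
    retrieved = set(retrieved)
    relevant = set(relevant)
    hits = retrieved & relevant  # items that are both retrieved and relevant
    # Guard against empty sets to avoid division by zero.
    precision = len(hits) / len(retrieved) if retrieved else 0.0
    recall = len(hits) / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical collection: items 1-4 are relevant, the query returned 3-8.
relevant = {1, 2, 3, 4}
retrieved = {3, 4, 5, 6, 7, 8}
p, r = precision_recall(retrieved, relevant)
print(f"precision = {p:.3f}, recall = {r:.3f}")
```

Here two of the six retrieved items are relevant (precision 2/6), and two of the four relevant items were retrieved (recall 2/4). Precision describes how clean the result list is; recall describes how complete it is.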
Questions 3 and 4: The course email archives contain the URLs for the private locations of the two papers that I posted in connection with the Final Exam. See the note in the archive (your password is required).
Question 3. The XTRACTOR summarization system (JCDL 2002 paper): Be familiar with the five components of the XTRACTOR system in Section 3.1 of the paper. You need to know the gist of the authors' approach, with only a modest level of detail about the components (since the authors don't explain them as clearly as they should have!).
Question 4. Digital Photo Library Browsing (JCDL 2002 paper): Be familiar with the timestamp clustering of photos used in the Stanford hierarchical photo browser system, and with the Hierarchical Browser shown in Figure 5 of the paper.
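To build intuition for clustering photos by timestamp, the sketch below groups photos whenever the gap between consecutive shots stays under a fixed threshold. This is a generic illustration only, not the algorithm from the paper: the fixed one-hour threshold, the function name, and the sample timestamps are all assumptions made for the example.

```python
from datetime import datetime, timedelta


def cluster_by_time_gap(timestamps, max_gap):
    """Group timestamps into clusters; a gap larger than max_gap starts a new one.

    Assumption: a single fixed threshold. The paper's method may differ.
    """
    clusters = []
    for ts in sorted(timestamps):
        if clusters and ts - clusters[-1][-1] <= max_gap:
            clusters[-1].append(ts)  # close enough to the previous photo
        else:
            clusters.append([ts])    # large gap: begin a new cluster
    return clusters


# Hypothetical photo timestamps.
photos = [
    datetime(2002, 7, 4, 10, 0),
    datetime(2002, 7, 4, 10, 5),
    datetime(2002, 7, 4, 18, 30),
    datetime(2002, 7, 6, 9, 0),
]
clusters = cluster_by_time_gap(photos, timedelta(hours=1))
for i, cluster in enumerate(clusters, start=1):
    print(f"cluster {i}: {len(cluster)} photo(s)")
```

With these sample values the two morning photos fall into one cluster and the remaining two photos each form their own. Running the same idea with a larger threshold (for example, one day) would give coarser groups, which is one simple way to think about a hierarchy of clusters.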
Question 5. Evaluation of IR systems: Be able to discuss how and why human evaluations of query results are used in designing information retrieval systems.