IS1320 Sp03 Midterm review #1 - Prof. Futrelle
Midterm to be given on Thursday 8 May - Closed book/notes
This review document may be updated or augmented by another before the
Midterm is given. But it is a good start, if not the final word.
- Google: There will be a general question concerning Google,
which won't be too hard if you've read through Brin and Page's
"The Anatomy of a Large-Scale Hypertextual Web Search Engine".
This paper is available in hundreds, if not thousands, of places on the web,
so you will have no problem finding a copy to print
and read. Google for: brin page anatomy.
- The Page Rank algorithm: The notes here follow
the web page, "Link, Links Analysis and Page Rank Algorithm" at
http://www.actonvision.com/GooglePageRank.html.
You should print out that page and study it carefully in preparation for the Midterm.
Given a set of pages and pointers in the form "U points to X and Y", etc.,
you will be asked to draw the corresponding diagram. Given the general formula
and the values of E and F in the equation:
I(Q) = E + F * (I(P1)/O(P1) + ... + I(Pm)/O(Pm))
where P1 ... Pm are the pages that point to Q and O(Pi) is the number of links out of page Pi,
you should be able to estimate by inspection which page will be ranked highest
and which lowest, and demonstrate this by iterating the equation once.
That is, you will need to do the numerical computation. It's so simple that
you shouldn't need a calculator, but you can use one if you like. (Points will
be deducted for reporting results to too many decimal places of precision.)
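As a rough illustration of what "iterating the equation once" involves, here is a minimal Python sketch. The three-page graph, the starting value of 1.0 for every page, and the values E = 0.15 and F = 0.85 are assumptions made up for this example; on the Midterm you will be given the actual pages, pointers, E and F. The function name iterate_once is likewise hypothetical.

```python
def iterate_once(links, rank, E, F):
    """Apply I(Q) = E + F * (I(P1)/O(P1) + ... + I(Pm)/O(Pm)) once to every page.

    links: dict mapping each page to the list of pages it points to
           (every page must appear as a key, even if it points nowhere).
    rank:  dict mapping each page to its current value I(page).
    """
    out_count = {page: len(targets) for page, targets in links.items()}
    new_rank = {}
    for q in links:
        # Sum I(P)/O(P) over every page P that points to Q.
        total = sum(rank[p] / out_count[p]
                    for p, targets in links.items() if q in targets)
        new_rank[q] = E + F * total
    return new_rank


# Assumed example: U points to X and Y, X points to Y, Y points to U.
links = {"U": ["X", "Y"], "X": ["Y"], "Y": ["U"]}
start = {page: 1.0 for page in links}       # assumed starting values
after_one = iterate_once(links, start, E=0.15, F=0.85)

for page in sorted(after_one, key=after_one.get, reverse=True):
    print(page, round(after_one[page], 2))
```

In this assumed graph Y has two incoming pointers and X has only half of U's vote, so by inspection Y should come out highest and X lowest; one iteration confirms that ordering.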
- XML: You should understand the basics of XML: that there is a schema that
describes the structure of a class of XML documents, and that OO classes can be
generated that are in one-to-one correspondence with the XML Schema, e.g., using
JAXB. In a typical question you will be given an example
XML source document and should be able to explain that OO
classes corresponding to the XML elements can be created with the corresponding
set() and get() methods. For example, a document snippet such as:
<PersonInfo gender="male">
  <name>Bob</name>
  <age>66</age>
</PersonInfo>
could lead to operations such as info.getName(), info.setAge(39)
and info.getGender(),
where info is an instance of a PersonInfo class created to correspond
to the XML Schema. You should also be familiar with the concepts of
marshalling, unmarshalling and validation. A basic synopsis of these
terms and concepts is at
http://java.sun.com/xml/jaxb/users-guide/jaxb-java.html, which you
should also read, though you might find some equivalent information
somewhere that you prefer.
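To make the get/set, marshalling and unmarshalling ideas concrete, here is a small hand-written Python sketch. It is not JAXB and not what JAXB generates; it only imitates the idea using Python's standard xml.etree.ElementTree module. The functions unmarshal and marshal are hypothetical names chosen for this example, the accessor names copy the Java-style ones used above, and "validation" is reduced to a crude check of the root tag and of the age being an integer, which is far weaker than validating against a real schema.

```python
import xml.etree.ElementTree as ET


class PersonInfo:
    """Hand-written stand-in for a class a binding tool might generate."""

    def __init__(self, name, age, gender):
        self._name = name
        self._age = age
        self._gender = gender

    # Java-style accessor names, kept to match the operations in the text.
    def getName(self):
        return self._name

    def setName(self, name):
        self._name = name

    def getAge(self):
        return self._age

    def setAge(self, age):
        self._age = age

    def getGender(self):
        return self._gender


def unmarshal(xml_text):
    """XML text -> object. Raises ValueError if the crude checks fail."""
    root = ET.fromstring(xml_text)
    if root.tag != "PersonInfo":
        raise ValueError("expected a PersonInfo element")
    age_text = root.findtext("age")
    if age_text is None or not age_text.strip().isdigit():
        raise ValueError("age must be a non-negative integer")
    return PersonInfo(root.findtext("name"), int(age_text), root.get("gender"))


def marshal(info):
    """Object -> XML text."""
    root = ET.Element("PersonInfo", {"gender": info.getGender()})
    ET.SubElement(root, "name").text = info.getName()
    ET.SubElement(root, "age").text = str(info.getAge())
    return ET.tostring(root, encoding="unicode")


source = '<PersonInfo gender="male"><name>Bob</name><age>66</age></PersonInfo>'
info = unmarshal(source)      # unmarshalling: document to object
info.setAge(39)               # work with the object, not the markup
print(info.getName(), info.getGender(), info.getAge())
print(marshal(info))          # marshalling: object back to a document
```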
- Inverted files/indexes and their use in searching:
This is covered in Sec. 8.2 of the text.
I will give you a few tiny documents and ask you to construct
the inverted index and explain how it would be used in processing
a query. [more on this topic shortly]
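As a sketch of the kind of answer expected, the following Python builds an inverted index for three tiny documents and uses it to answer a query. The documents, the function names build_inverted_index and search_all, and the choice to treat a query as "all words must appear" are assumptions made for this example; the treatment in the text may differ in details such as storing word positions or counts.

```python
from collections import defaultdict


def build_inverted_index(docs):
    """Map each word to the set of ids of the documents that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for word in text.lower().split():
            index[word].add(doc_id)
    return index


def search_all(index, query):
    """Return ids of documents containing every word in the query."""
    terms = query.lower().split()
    if not terms:
        return set()
    # Start with the posting set of the first term, then intersect the rest.
    result = set(index.get(terms[0], set()))
    for term in terms[1:]:
        result &= index.get(term, set())
    return result


# Assumed tiny documents, keyed by document id.
docs = {1: "the cat sat", 2: "the dog sat", 3: "a cat ran"}
index = build_inverted_index(docs)

for word in sorted(index):
    print(word, sorted(index[word]))

print(sorted(search_all(index, "cat sat")))
```

The point of the structure is that a query never scans the documents themselves: each query word is looked up in the index, and the resulting sets of document ids are intersected.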