The Skinny - IS1320 Information Retrieval - Spring 2003
Professor Futrelle
A constantly updated list of
important items and some odds and ends
Version of 24 March 2003
If you don't know what "the skinny" means, check out
Evan Morris' discussion of the phrase.
Contents
Making your web page private
"Logging in" to a web server
Simple Java code to download web pages
- #1 3/12/03 Making your web page private
- Your web page for your course assignments should be set up
so that it is accessible only to us, Professor Futrelle and the teaching assistant.
This is easy to do on our Solaris systems and
is described in a
separate document.
- #2 3/15/03 "Logging in" to a web server
- Working in a terminal session, you can run the command
telnet www.ccs.neu.edu 80
This connects you to the HTTP (web) server port on that system.
You then type the following request line (with GET and HTTP in upper case, as shown):
GET /home/futrelle/tiny.html HTTP/1.0
and press Return twice; the empty line tells the server that the request is complete.
What you will then see is
the reply from the web server shown below. This is basically
what a browser does when you type in a URL or click on
a link on a page. You can view the same page in your browser by
clicking this URL:
http://www.ccs.neu.edu/home/futrelle/tiny.html
HTTP/1.1 200 OK
Date: Sat, 15 Mar 2003 18:48:18 GMT
Server: Apache/1.3.14 (Unix) mod_perl/1.24_01 mod_ssl/2.7.1 OpenSSL/0.9.6
Last-Modified: Tue, 31 Dec 2002 18:19:19 GMT
ETag: "15319e-28-3e11dfa7"
Accept-Ranges: bytes
Content-Length: 40
Connection: close
Content-Type: text/html

Hi there.
Connection closed by foreign host.
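The same exchange can be carried out from a program instead of telnet. What follows is only a minimal sketch, not the course's own code: the class name RawHttpGet and its methods buildRequest, statusCode and fetch are made up for illustration. It opens a TCP connection to port 80, sends the request line followed by an empty line, and reads until the server closes the connection.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.Socket;
import java.nio.charset.StandardCharsets;

// Hypothetical illustration class; the names are not from the course materials.
public class RawHttpGet {

    // Build the request as typed in the telnet session: the request line,
    // then an empty line (the second Return) that ends the request.
    static String buildRequest(String path) {
        return "GET " + path + " HTTP/1.0\r\n\r\n";
    }

    // Pull the numeric status code out of a reply whose first line
    // looks like "HTTP/1.1 200 OK".
    static int statusCode(String reply) {
        String statusLine = reply.split("\r?\n", 2)[0];
        return Integer.parseInt(statusLine.split(" ")[1]);
    }

    // Connect, send the request, and collect everything the server sends
    // back (headers and body) until it closes the connection.
    static String fetch(String host, int port, String path) throws IOException {
        try (Socket socket = new Socket(host, port)) {
            OutputStream out = socket.getOutputStream();
            out.write(buildRequest(path).getBytes(StandardCharsets.US_ASCII));
            out.flush();

            InputStream in = socket.getInputStream();
            ByteArrayOutputStream reply = new ByteArrayOutputStream();
            byte[] buffer = new byte[4096];
            int n;
            while ((n = in.read(buffer)) != -1) {
                reply.write(buffer, 0, n);
            }
            return new String(reply.toByteArray(), StandardCharsets.ISO_8859_1);
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length < 2) {
            // No host given: just show the request that would be sent.
            System.out.print(buildRequest("/home/futrelle/tiny.html"));
            return;
        }
        String reply = fetch(args[0], 80, args[1]);
        System.out.println("Status code: " + statusCode(reply));
        System.out.println(reply);
    }
}
```

Run with a host and a path as arguments, for example java RawHttpGet www.ccs.neu.edu /home/futrelle/tiny.html, it should print a reply much like the one above, assuming the server is reachable. Run with no arguments, it only prints the request text and does not touch the network.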
- #3 3/15/03 Simple Java code to download web pages
- Here is a Java program that you can study, compile,
and use. It downloads certain information about a web
page and prints it out. The web page that is hardwired
into the code happens to be
http://www.oreilly.com.
Click this link
for the Java source. You can then paste it
into a file, compile it, and run it. (Your browser may show you
the code or may download it as a file. If you have problems,
download
this copy
with the extension .txt, which should download without
problems.) With slight variations, this code can be used
to download the entire contents of a web page, which can then
be analyzed for content or for additional links to other pages,
images, etc.
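To give a feel for the "slight variations" mentioned above, here is a rough sketch of downloading a whole page and pulling out its links. It is not the linked course program; the class name PageDownloader, the methods download and extractLinks, and the simple href pattern are all assumptions made for illustration. A regular expression like this will miss unquoted or unusual links, so treat it as a starting point only.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.net.URI;
import java.net.URLConnection;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration class; not the program linked from this page.
public class PageDownloader {

    // Rough pattern for href="..." or href='...'; it ignores unquoted values.
    private static final Pattern HREF =
        Pattern.compile("href\\s*=\\s*[\"']([^\"']+)[\"']", Pattern.CASE_INSENSITIVE);

    // Download the full text of the page at the given address.
    static String download(String address) throws IOException {
        URLConnection connection = URI.create(address).toURL().openConnection();
        StringBuilder page = new StringBuilder();
        try (BufferedReader reader = new BufferedReader(new InputStreamReader(
                connection.getInputStream(), StandardCharsets.ISO_8859_1))) {
            String line;
            while ((line = reader.readLine()) != null) {
                page.append(line).append('\n');
            }
        }
        return page.toString();
    }

    // Collect the values of all quoted href attributes, in document order.
    static List<String> extractLinks(String html) {
        List<String> links = new ArrayList<>();
        Matcher matcher = HREF.matcher(html);
        while (matcher.find()) {
            links.add(matcher.group(1));
        }
        return links;
    }

    public static void main(String[] args) throws IOException {
        if (args.length < 1) {
            // No address given: run on a small built-in sample instead.
            System.out.println(extractLinks("<a href=\"tiny.html\">tiny</a>"));
            return;
        }
        for (String link : extractLinks(download(args[0]))) {
            System.out.println(link);
        }
    }
}
```

Given an address such as http://www.oreilly.com on the command line, the program prints one link per line. The links come out exactly as written in the page, so relative links would still need to be resolved against the page's address before they could be followed.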