©2005 Felleisen, Proulx, et al.
Goals
This assignment consists of a small program that uses interfaces and classes either from Java's standard libraries, or from our earlier labs and assignments. The goal is to give you a bit of design freedom: You get to decide which parts of the standard libraries, or which classes and interfaces we already designed are the most suitable to use. If you design well, this assignment should be fairly straightforward.
The second goal is to complete the introduction to Java program design
standards. The program you produce will eventually use the JUnit test
tools and will include documentation in the style that allows you to
produce Javadoc documentation for your program.
Hints
Some or all of the following interfaces and classes are likely to prove useful. In the java.lang package: Comparable; in the java.util package: Iterator, List, Map, Set, Collections.
The Application
Have you ever wondered about the size of Shakespeare's vocabulary? For
this assignment you will write a program that reads its input from a
text file and lists the words that occur most frequently, together
with a count of how many different words occur in the file. If this
program were to run on a file that contains all of Shakespeare's
works, it would tell you the approximate size of his vocabulary, and
how often he uses the most common words.
Hamlet, for example, contains about 4542 distinct words, and the word "king" occurs 202 times.
Start by downloading the file HW11.zip and making an Eclipse project HW11 that contains these files. Add jpt.jar as a Variable to your project. Run the project to make sure you have all the pieces in place. The main method is in the class Examples.
You are given the file test.txt, which contains the entire text of Hamlet, and the file Week11.java, which contains the code that generates the words from test.txt one at a time via an iterator.
The classes Tester and Examples contain a test harness similar to the SimpleTestHarness used in the previous two assignments, but improved to catch exceptions raised while running the tests. More about this later...
Your tasks are the following:
Design the class Word to represent one word of Shakespeare's vocabulary, together with its frequency counter. The constructor takes only one String (for example the word "king") and starts the counter at one. We consider one Word instance to be equal to another if they represent the same word, regardless of the value of the frequency counter. That means that you have to override the method equals() as well as the method hashCode().
Include in the class Word an inner class that implements the Comparator interface, so that the words can be sorted by frequency. (Be careful!)
Include in the class Word a method that increments the counter (using mutation), and a method toString that produces one line with the word and its frequency.
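The three requirements above can be sketched together as follows. This is only one possible design, not the official solution: the field names, the method name increment, and the nested class name ByFrequency are assumptions made for illustration, and the code uses generics for brevity. The toString format follows the JUnit example later in this assignment ("Hello: 1\n"). One plausible reading of the "Be careful!" warning is that a comparator based on frequency alone reports two different words with the same count as equal, which disagrees with equals().

```java
import java.util.Comparator;

// A sketch of the Word class; names other than Word, equals, hashCode
// and toString are assumptions, not part of the assignment text.
class Word {
    final String text;   // the word itself, e.g. "king"
    int counter;         // how many times the word has been seen

    // the constructor takes one String and starts the counter at one
    Word(String text) {
        this.text = text;
        this.counter = 1;
    }

    // EFFECT: adds one to the frequency counter (mutation)
    void increment() {
        this.counter = this.counter + 1;
    }

    // two Words are equal if they represent the same word,
    // regardless of the value of the counter
    @Override
    public boolean equals(Object obj) {
        if (!(obj instanceof Word)) {
            return false;
        }
        return this.text.equals(((Word) obj).text);
    }

    // must agree with equals: equal Words produce equal hash codes
    @Override
    public int hashCode() {
        return this.text.hashCode();
    }

    // one line with the word and its frequency
    @Override
    public String toString() {
        return this.text + ": " + this.counter + "\n";
    }

    // orders Words from the highest frequency to the lowest;
    // note that it ignores the text, so it is not consistent with equals
    static class ByFrequency implements Comparator<Word> {
        public int compare(Word w1, Word w2) {
            return Integer.compare(w2.counter, w1.counter);
        }
    }
}
```

Because ByFrequency treats words with equal counts as ties, it is safe for sorting a List with Collections.sort, but a sorted set or map built with it would silently drop words that share a frequency.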
Design the class WordCounter that keeps track of all the words we have seen so far. It should include the following methods:
// records the Word objects generated by the given Iterator.
void countWords (Iterator it) { ... }

// How many different Words has this WordCounter recorded?
int words() { ... }

// Prints the n most common words and their frequencies.
void printWords (int n) { ... }
Here are additional details:
countWords consumes an iterator that generates the words and builds the collection of the appropriate Word instances, with the correct frequencies.
words produces the total count of different words that have been consumed.
printWords consumes an integer n and prints the top n words with the highest frequencies (using the toString method defined in the class Word).
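A minimal sketch of these three methods follows, under stated assumptions. It assumes the iterator produces Strings (the comment in the method header speaks of Word objects, so check what Week11.java actually generates), it stores the words in a HashMap keyed by their text, which is only one of several suitable collections, and it adds a hypothetical helper topWords so that the sorting step can be tested without capturing printed output. A compact stand-in for Word is included only so that the block compiles on its own.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.HashMap;
import java.util.Iterator;
import java.util.List;
import java.util.Map;

// Compact stand-in for the Word class, so this sketch compiles on its own.
class Word {
    final String text;
    int counter = 1;
    Word(String text) { this.text = text; }
    void increment() { this.counter = this.counter + 1; }
    public String toString() { return this.text + ": " + this.counter + "\n"; }
}

class WordCounter {
    // maps the text of each word to its Word instance (assumed representation)
    private final Map<String, Word> seen = new HashMap<String, Word>();

    // records the words generated by the given Iterator
    // (assumption: the iterator produces Strings)
    void countWords(Iterator<String> it) {
        while (it.hasNext()) {
            String s = it.next();
            Word w = this.seen.get(s);
            if (w == null) {
                this.seen.put(s, new Word(s)); // first occurrence: counter is one
            } else {
                w.increment();                 // seen before: bump the counter
            }
        }
    }

    // How many different Words has this WordCounter recorded?
    int words() {
        return this.seen.size();
    }

    // hypothetical helper: the n most frequent Words, highest count first
    List<Word> topWords(int n) {
        List<Word> sorted = new ArrayList<Word>(this.seen.values());
        Collections.sort(sorted, new Comparator<Word>() {
            public int compare(Word w1, Word w2) {
                return Integer.compare(w2.counter, w1.counter);
            }
        });
        int limit = Math.min(Math.max(n, 0), sorted.size());
        return sorted.subList(0, limit);
    }

    // Prints the n most common words and their frequencies.
    void printWords(int n) {
        for (Word w : topWords(n)) {
            System.out.print(w); // Word.toString already ends the line
        }
    }
}
```

Sorting a copy of the values each time printWords is called keeps countWords simple. If printWords were called very often, keeping the words in sorted order would be worth considering instead.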
Of course, you need to test all methods as you are designing them. Design the tests in three stages:
For the class Word use a technique similar to what was done in the past two assignments, i.e. design a class SimpleTests that instantiates the class Tester as well as the necessary sample data and collects all tests in a method void run(). At the end of this method it invokes either the testReport or the fullTestReport method to report on the results.
When designing the class WordCounter, upgrade to the next level of the test harness. The class Tester contains the following driver for the tests:
// run the tests, accept the class to be tested as a visitor
void runTests(Testable f) {
  this.n = 0;
  try {
    f.tests(this);
  }
  catch (Throwable e) { // catch all exceptions
    this.errors = this.errors + 1;
    console.out.println("Threw exception during test " + this.n);
    console.out.println(e);
  }
  finally {
    done();
  }
}

// to be run after all tests have been performed
public void done() {
  if (this.errors > 0)
    console.out.print("Failed " + this.errors + " out of ");
  else
    console.out.print("Passed all ");
  console.out.println(this.n + " tests.");
}
The class Examples implements the Testable interface that contains just one method:
void tests(Tester t);
Inside of this method the class Examples invokes the appropriate test methods on the instance t of the Tester.
So we have a chicken-and-egg problem here. The class Tester needs to know which Examples instance is running the tests, so that it can invoke the method tests(Tester t) defined in the Examples class inside of the Tester's try clause.
The class Examples in turn needs an instance of the class Tester so that it can invoke each test method inside of the method tests(Tester t).
The main gain is that every invocation of a test method is wrapped inside of the try clause, and if an exception is thrown, the error report indicates which one of the tests failed.
The only thing you need to do is to include all your tests and the needed sample data inside of the tests(Tester t) method in the class Examples.
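The interplay between the two classes can be illustrated with a cut-down sketch. The Tester below is a stand-in, not the class from HW11.zip: only runTests and done follow the driver shown above (with System.out in place of console.out), and the signature test(String, Object, Object) is a hypothetical check invented for this example. The third test deliberately throws an exception to show what the driver reports.

```java
// The interface named in the text: one method that receives the Tester.
interface Testable {
    void tests(Tester t);
}

// Stand-in for the course's Tester; the test(...) signature is hypothetical.
class Tester {
    int n = 0;       // number of tests run so far
    int errors = 0;  // number of failed tests

    // hypothetical check: counts the test and compares with equals
    void test(String name, Object expected, Object actual) {
        this.n = this.n + 1;
        if (!expected.equals(actual)) {
            this.errors = this.errors + 1;
            System.out.println("Failed test " + name);
        }
    }

    // run the tests, accept the class to be tested as a visitor
    void runTests(Testable f) {
        this.n = 0;
        try {
            f.tests(this);          // the Tester hands itself to the Examples
        } catch (Throwable e) {     // catch all exceptions
            this.errors = this.errors + 1;
            System.out.println("Threw exception during test " + this.n);
            System.out.println(e);
        } finally {
            done();
        }
    }

    // to be run after all tests have been performed
    void done() {
        if (this.errors > 0) {
            System.out.print("Failed " + this.errors + " out of ");
        } else {
            System.out.print("Passed all ");
        }
        System.out.println(this.n + " tests.");
    }
}

class Examples implements Testable {
    // all tests and sample data live here, so they run inside the try clause
    public void tests(Tester t) {
        t.test("length", Integer.valueOf(4), Integer.valueOf("king".length()));
        t.test("upper case", "KING", "king".toUpperCase());
        String missing = null;
        // evaluating the argument throws a NullPointerException
        t.test("null", Integer.valueOf(0), Integer.valueOf(missing.length()));
    }

    public static void main(String[] args) {
        // the Examples instance visits the Tester, which calls back into tests
        new Tester().runTests(new Examples());
    }
}
```

In this sketch the counter n is advanced at the start of each check, so when an exception is raised while the arguments of a test are being evaluated, the number in the report is the count of tests completed before the failure.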
This prepares us for the third way of running tests, namely using JUnit - Java's standard test framework.
Introducing JUnit. You will now rewrite all your tests using JUnit. In the File menu select New then JUnitTestCase. When the wizard comes up, choose to include the main method, the constructor, and the setUp method. The tests for each of the methods will then become one test case similar to this one:
/**
 * Testing the method toString
 */
public void testToString(){
  assertEquals("Hello: 1\n", this.hello1.toString());
  assertEquals("Hello: 3\n", this.hello3.toString());
}
We see that assertEquals is basically the same as the test methods for our test harnesses, except that it does not include the name of the test. Try to see what happens when some of the tests fail and when a test throws an exception. Finally, make sure that at the end all tests succeed.
You may have noticed that the style in which we write documentation for this assignment has changed. When written in the well-formatted Javadoc style, the comments can be used to generate web pages of documentation with cross-references and browsing capabilities. There are a few basic rules; the rest you should learn on your own, gradually, as you become more and more skilled Java programmers.
Here are comments to specify the name of the file, and the class definition:
/*
 * @(#)Word.java 28 March 2005
 *
 */

/**
 *
 * <P><CODE>Word</CODE> represents one word and its
 * number of occurrences counted in the
 * <CODE>{@link WordCounter WordCounter}</CODE> class.</P>
 *
 * @see Comparable
 *
 * @author Viera K. Proulx
 */
public class Word implements Comparable {
The @author and @see tags identify the author and provide a cross-reference to other classes as specified.
Each field in the class has its own comment:
/**
 * the frequency counter
 */
public int counter;
Each method has a comment that includes a separate line for each parameter as well as for the return value:
/**
 * Compare two <CODE>Object</CODE>s for equality
 *
 * @param obj the object to compare to
 * @return true if the two objects have the same contents
 */
public boolean equals(Object obj){
The @param tag has to be followed by the identifier used for that parameter. The <CODE> and </CODE> tags tell the documentation generator to set the enclosed text in the teletype font used for code.
Eclipse helps you to write the documentation. If you start the comment line with /** and hit the return key, the beginnings of the remaining comment lines are generated automatically, and you only need to add the relevant information.
When you have finished all the documentation, select the item Generate Javadoc... in the Project menu. To see your web pages, open the doc folder under your project in the Package Explorer window and double-click index.html.