Assignment 10
This assignment consists of two parts. In the first part you will solve the assigned problem. In the second part you will convert all tests to tests that use the JUnit testing framework, and will also make sure the entire program is documented using the Javadoc documentation style. The WebCAT submission server will verify the completeness of the documentation as well as the tests that use the JUnit framework.
Work with the Java Collections Framework classes HashMap and TreeMap.
Learn to define tests using the JUnit framework. Learn to provide documentation using Javadoc documentation language.
Get a first taste of working with the Visitor design pattern.
The names of the classes must be exactly the same as specified. The same is the case for the names and types of the fields within the class, as well as the order in which they are defined and listed as the constructor arguments. This allows us to design our own Examples class that tests your program.
Make sure you follow the style guidelines that WebCAT enforces. For now the most important ones are: using spaces instead of tabs, indenting by 4 characters, following the naming conventions (data type names start with a capital letter, names of fields and methods start with a lower case letter), and having spaces before curly braces.
For the second part, make sure you include comments for all fields, all classes and interfaces, and all methods.
You will submit this assignment by the deadline using the WebCAT submission system. You may submit as many times as you wish. Be aware that close to the deadline the WebCAT system may slow down as it handles many submissions, so try to finish early.
With each homework you will also submit your log file named pair-user1-user2.txt where you replace user1 and user2 with the usernames of the two partners.
At the top of both files you will have five lines of comments as follows:
// assignment 10
// partner1-last-name partner1-first-name
// partner1-username
// partner2-last-name partner2-first-name
// partner2-username
(In the text file you do not need the two slashes)
There will be a separate submission for each problem. This makes it easier to grade each problem and to provide you with feedback on each problem you work on.
The three submissions will be organized as follows:
Submission HW10P1: The log file and the solution to the Shakespeare problem using the tester library for the tests, in one .zip file
Submission HW10P2: The solution to the Shakespeare problem with the tests defined using JUnit, and with complete documentation using Javadoc documentation language in one .zip file
Submission HW10P3: The solution to the Visitor problem in one .zip file
Due Date: Thursday, March 27th, 11:59 pm.
Problem 1: Shakespeare
Introduction
Have you ever wondered about the size of Shakespeare’s vocabulary? For this assignment you will write a program that reads its input from a text file and lists the words that occur most frequently, together with a count of how many different words occur in the file.
If this program were to run on a file that contains all of Shakespeare’s works, it would tell you the approximate size of his vocabulary, and how often he uses the most common words.
Hamlet, for example, contains about 4542 distinct words, and the word king occurs 202 times, while the play Macbeth contains about 3201 distinct words, and the word macbeth occurs 288 times.
Researchers use this kind of technique to verify the authenticity of disputed texts.
The Problem
Create a project with the name Macbeth. Download the file HW10-Words.zip, unzip it, and add the files to the project. Add the file Macbeth.txt to the project. Run the project to make sure you have all the pieces in place. The ExamplesWords class uses the tester package as we have done before.
The text file Macbeth.txt contains the entire text of the play Macbeth, and the file StringIterator.java contains code that generates instances of the class Word from a file (e.g., Macbeth.txt) one at a time.
Save the file Macbeth.txt in the Eclipse project directory (where you find the subdirectories src and bin). The Examples class includes code that invokes the processing of the complete text of the play Macbeth.
Note: Here you will use the imperative Iterator interface that is a part of Java Standard Library. Make sure to look up the documentation for this interface and understand how it works.
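As a reminder of how the imperative Iterator interface is typically used, here is a minimal sketch. The class name IteratorSketch and the method joinAll are made up for this illustration and are not part of the provided files. It iterates over a small list of Strings rather than over the words of the play.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Hypothetical helper class, used only to illustrate the Iterator protocol
class IteratorSketch {

    // Collects every remaining element of the iterator into one
    // space-separated String, in the order the iterator produces them
    static String joinAll(Iterator<String> it) {
        StringBuilder sb = new StringBuilder();
        while (it.hasNext()) {          // is there another element?
            String next = it.next();    // fetch it and advance the iterator
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(next);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        List<String> words = Arrays.asList("double", "double", "toil");
        System.out.println(joinAll(words.iterator()));
    }
}
```

Note that an iterator is consumed as it is traversed: once hasNext() produces false, you need a fresh iterator to go through the data again.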
The files that are provided form a skeleton of your project - you only have to add the missing parts.
Note: You may use any Java Collections Framework classes that may help you solve this problem – and we encourage you to do so.
So, here is what you need to do:
Design the class Word that represents one word of Shakespeare’s vocabulary together with its frequency counter. The constructor takes only the String (for example the word "king") and starts the counter at 1 (one).
Two Word instances are equal to each other if they represent the same String, regardless of their frequency counters. That means that you have to override the equals() and hashCode() methods.
Implement a toString method for Word that returns the word String and its frequency.
Implement the method increment() that increments the Word’s frequency.
Design a class WordsByFreq that implements the Comparator interface, so that the words can be sorted by frequencies. (Be careful!) When you are done, place this class definition as the last part of the class definition of the class Word. This is called an inner class.
Note: In this program there will be two ways of comparing the instances of the Word class - by the String that it represents and by the frequency counter for the word that this instance represents.
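The following is one possible shape for such a class, offered as a sketch rather than as the expected solution. The field names text and count are assumptions (use the names required by the provided skeleton), the toString format is arbitrary, and the comparator is assumed to put the most frequent word first. The nested class is declared static here so that it can be created without a Word instance; that is a choice made for this sketch.

```java
import java.util.Comparator;

// A sketch only: the field names "text" and "count" are assumptions
class Word {
    String text;
    int count;

    Word(String text) {
        this.text = text;
        this.count = 1;
    }

    // Add one to this word's frequency counter
    void increment() {
        this.count = this.count + 1;
    }

    // Two words are equal when they represent the same String,
    // regardless of their counters
    @Override
    public boolean equals(Object other) {
        if (!(other instanceof Word)) {
            return false;
        }
        return this.text.equals(((Word) other).text);
    }

    // Must agree with equals, so it depends only on the String
    @Override
    public int hashCode() {
        return this.text.hashCode();
    }

    // The format "text: count" is an arbitrary choice for this sketch
    @Override
    public String toString() {
        return this.text + ": " + this.count;
    }

    // Orders words from the highest frequency to the lowest
    // (the direction is an assumption of this sketch)
    static class WordsByFreq implements Comparator<Word> {
        public int compare(Word w1, Word w2) {
            return Integer.compare(w2.count, w1.count);
        }
    }
}
```

Using Integer.compare instead of subtracting the two counters avoids integer overflow. Also note that this comparator reports 0 for two different words with the same frequency, which matters if you use it as the ordering of a TreeMap or TreeSet.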
Design the class WordCounter that keeps track of all the words we have seen so far. It should include the following methods:
// Record the Word objects generated by the given Iterator
// and update the number of occurrences
void countWords(Iterator it) { ... }

// How many different Words has this WordCounter recorded?
int words() { ... }

// Prints the n most common words and their frequencies.
void printWords(int n) { ... }
Here are additional details:
countWords consumes a Word iterator that generates the words and builds the collection of the appropriate Word instances, with the correct frequencies. This collection is then used by the next two methods to show the results of our text analysis.
words produces the number of different words that have been counted.
printWords consumes an integer n and prints the top n words with the highest frequencies (using the toString method defined in the class Word).
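The counting idea behind these three methods can be sketched in a simplified setting. The example below counts plain Strings in a HashMap instead of Word objects, and the names CountingSketch, count, and top are hypothetical. It illustrates the technique under those assumptions and is not the required WordCounter design.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Comparator;
import java.util.HashMap;
import java.util.Iterator;
import java.util.List;
import java.util.Map;

// Hypothetical class: counts Strings, not Word objects
class CountingSketch {

    // Counts how many times each String is produced by the iterator
    static Map<String, Integer> count(Iterator<String> it) {
        Map<String, Integer> counts = new HashMap<String, Integer>();
        while (it.hasNext()) {
            String s = it.next();
            Integer seen = counts.get(s);
            counts.put(s, seen == null ? 1 : seen + 1);
        }
        return counts;
    }

    // Produces the n most frequent Strings, most frequent first;
    // ties are broken alphabetically so the result is predictable
    static List<String> top(final Map<String, Integer> counts, int n) {
        List<String> keys = new ArrayList<String>(counts.keySet());
        Collections.sort(keys, new Comparator<String>() {
            public int compare(String a, String b) {
                int byFreq = Integer.compare(counts.get(b), counts.get(a));
                return byFreq != 0 ? byFreq : a.compareTo(b);
            }
        });
        return new ArrayList<String>(keys.subList(0, Math.min(n, keys.size())));
    }

    public static void main(String[] args) {
        List<String> text =
            Arrays.asList("the", "king", "the", "thane", "king", "the");
        Map<String, Integer> counts = count(text.iterator());
        System.out.println(counts.size());   // number of distinct Strings
        System.out.println(top(counts, 2));  // the two most frequent Strings
    }
}
```

In this sketch the number of distinct words is simply the size of the map, and the top n entries are found by sorting the keys with a frequency comparator.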
Note: The given code expects that you implement the classes as given, with the same names and methods. It will then check whether your program works correctly. That does not mean you do not need to design tests.
Testing of the Shakespeare Project
Of course, you need to test all methods as you are designing them.
Design the tests in two stages:
For the class Word and the class WordCounter use a technique similar to what was done in the past assignments, i.e. design a class ExamplesWords with the necessary sample data and all tests. We’ve started you off... just keep going.
Problem 2: JUnit and Javadocs
Now that your program works, convert all tests into JUnit tests.
Documentation
The projects should contain Javadoc documentation that should produce the documentation pages without any warnings. You do not need to submit the documentation pages with your assignment but the WebCAT submission system will check for completeness of your documentation as follows:
It checks that the comments describe the purpose for every interface and every class.
It checks that you describe the meaning of every field defined in your program.
For every method it verifies not only that you have included the purpose statement, but also that you have described all arguments the method consumes and, if the method is not void, the meaning of the return value.
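The three checks above correspond to Javadoc comments on the class, on each field, and on each method (with @param and @return tags). The sketch below shows the comment style on a hypothetical class named Tally, which is not part of this assignment. What exactly WebCAT accepts may differ in detail.

```java
/**
 * Represents a counter that can only be increased.
 * (A hypothetical class used only to illustrate the comment style.)
 */
class Tally {
    /** The sum of all amounts added to this tally so far. */
    int total;

    /**
     * Creates a tally that starts at the given value.
     *
     * @param start the initial value of the total
     */
    Tally(int start) {
        this.total = start;
    }

    /**
     * Adds the given amount to this tally.
     *
     * @param amount how much to add to the total
     * @return the total after the amount has been added
     */
    int add(int amount) {
        this.total = this.total + amount;
        return this.total;
    }
}
```

Each comment starts with a one-sentence purpose statement, and every @param tag names an argument exactly as it appears in the method header.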
Problem 3: Visitors
Finish the work on the problem in Lab 10a on Visitors.