Result string ordering

exercise No. 201

Words and corresponding frequencies

Q:

So far we have extracted the set of words a given text consists of. We would now also like to see how often each word appears. This frequency shall be the primary sort criterion for the report output. Consider the following example text:

One day, Einstein, Newton, and Pascal meet up
and decide to play a game of hide and seek.
Einstein volunteered to be "It". As Einstein
counted, eyes closed, to 100, Pascal ran away
and hid, but Newton stood right in front of
Einstein and drew a one meter by one meter
square on the floor around himself. When
Einstein opened his eyes, he immediately saw
Newton and said "I found you Newton", but Newton
replied, "No, you found one Newton per square meter.
You found Pascal!"

Ignoring special characters, the following result shall be created:

  6: Newton
  6: and
  5: Einstein
  3: Pascal
  3: found
  3: meter
  3: one
  3: to
  2: a
...

The first line tells us that the word Newton appears six times in the analyzed document.

Hints:

  1. Define a class WordFrequency containing a String attribute along with an integer representing its frequency of appearance:

    /**
     * A helper class to account for frequencies of words found in textual input.
     *
     */
    public class WordFrequency {
      /**
       * The frequency of this word will be counted.
       */
      public final String word;
      private int frequency;
     ...
    }

    Two instances of WordFrequency shall be equal if and only if their word attribute values are equal, regardless of their frequency values. In other words, equality of WordFrequency instances is determined solely by the contained word and never by the frequency.

    Override equals(...) and hashCode() accordingly.

  2. Create a List<WordFrequency> (not a Set<WordFrequency>!) holding the words found in your input text along with their frequencies of appearance.

    Whenever the next input word is processed, follow this procedure:

    1. Create a corresponding instance of WordFrequency from it having initial frequency 1.

    2. Test whether an equal instance has already been added to your List<WordFrequency>. This leaves you with two cases:

      The current word already exists:

      Look up the entry and increment its frequency by one.

      The current word is new:

      Add the previously created WordFrequency instance to your List<WordFrequency>.

    3. After processing the input text file, sort your List<WordFrequency> using a suitable Comparator<WordFrequency> by means of Collections.sort(...).
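
The hints above can be sketched as follows. This is only one possible outline under stated assumptions, not the reference solution: the constructor, getFrequency(), increment(), toString(), the class WordFrequencyDemo with its count(...) method and the splitting regular expression are all hypothetical names and choices that do not appear in the exercise text. Equal frequencies are ordered by the natural String order of the words, which appears consistent with the sample output above.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;

/** Hypothetical driver class: counts words and sorts them by frequency. */
class WordFrequencyDemo {

  static List<WordFrequency> count(final String text) {
    final List<WordFrequency> frequencies = new ArrayList<>();

    // Assumption: every run of non-letter, non-digit characters separates words.
    for (final String token : text.split("[^\\p{L}\\p{N}]+")) {
      if (token.isEmpty()) {
        continue; // A leading separator yields one empty token.
      }
      final WordFrequency candidate = new WordFrequency(token);
      final int index = frequencies.indexOf(candidate); // Relies on equals(...)
      if (index < 0) {
        frequencies.add(candidate);            // The current word is new.
      } else {
        frequencies.get(index).increment();    // The current word already exists.
      }
    }

    // Primary criterion: frequency, descending. Secondary: the word itself.
    final Comparator<WordFrequency> byFrequencyThenWord =
        Comparator.comparingInt(WordFrequency::getFrequency).reversed()
                  .thenComparing(wf -> wf.word);
    Collections.sort(frequencies, byFrequencyThenWord);
    return frequencies;
  }

  public static void main(final String[] args) {
    for (final WordFrequency wf : count("one Newton, one meter: Newton and Newton")) {
      System.out.println(wf);
    }
  }
}

/** Possible completion of the helper class from hint 1. */
class WordFrequency {
  public final String word;
  private int frequency = 1; // Initial frequency 1, see step 1 of the procedure.

  WordFrequency(final String word) {
    this.word = word;
  }

  public int getFrequency() {
    return frequency;
  }

  public void increment() {
    frequency++;
  }

  @Override
  public boolean equals(final Object other) {
    // Equality depends on the word only, never on the frequency.
    return other instanceof WordFrequency
        && word.equals(((WordFrequency) other).word);
  }

  @Override
  public int hashCode() {
    return word.hashCode(); // Consistent with equals(...): word only.
  }

  @Override
  public String toString() {
    return String.format("%3d: %s", frequency, word);
  }
}
```

Because equals(...) ignores the frequency, frequencies.indexOf(candidate) finds an existing entry even though the freshly created candidate always carries the frequency 1. The lookup is linear in the list size, which is acceptable for small texts like the one above.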

A: