Exercises

exercise No. 192

Adding line numbers to text files

Q:

We want to add line numbers to arbitrary text files, which need not be related to programming. Consider the following HTML example input:

<html>
    <head>
        <title>A simple HTML example</title>
    </head>
    <body>
        <p>Some text ... </p>
    </body>
</html>

Your application shall add line numbers:

1: <html>
2:     <head>
3:         <title>A simple HTML example</title>
4:     </head>
5:     <body>
6:         <p>Some text ... </p>
7:     </body>
8: </html>

Hints:

  1. Given the name of an existing file you may create an instance of BufferedReader:

    final FileReader fileReader = new FileReader(inputFileName);
    final BufferedReader inputBufferedReader = new BufferedReader(fileReader);

  2. You will have to handle possible FileNotFoundException problems by providing meaningful error messages.

  3. The BufferedReader class provides a method readLine() for reading a given file's content line by line. It returns null when the end of input has been reached.

    Caution

    Even if a file exists you may encounter IOException problems caused by e.g. missing permissions.

A:

This solution reacts both to nonexistent files and to general IO problems:

File not found: Testdata/input.java

Two test cases cover a readable file and a non-existing file, the latter with its expected exception:

@Test
public void testReadFileOk() throws FileNotFoundException, IOException {
  ReadFile.openStream("Testdata/input.txt"); // Existing file
}
@Test (expected=FileNotFoundException.class) // We expect this exception to be
                                             // thrown.
public void testReadMissingFile() throws FileNotFoundException, IOException {
  ReadFile.openStream("Testdata/input.java"); // Does not exist
}

Note the second test, which only succeeds if a FileNotFoundException is thrown.
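
The tests above only exercise opening a file. The numbering step itself could be sketched as follows. This is a minimal sketch, not the reference solution: the class name LineNumberSketch and the method number() are hypothetical, and writing to an Appendable is an assumption made so the logic can be tried without touching the file system.

```java
import java.io.BufferedReader;
import java.io.FileNotFoundException;
import java.io.FileReader;
import java.io.IOException;
import java.io.Reader;

// Hypothetical helper, not part of the original solution.
class LineNumberSketch {

  /** Copies all lines from in to out, prefixing each with "n: ". Returns the line count. */
  static int number(final Reader in, final Appendable out) throws IOException {
    final BufferedReader reader = new BufferedReader(in);
    int lineNumber = 0;
    String line;
    while (null != (line = reader.readLine())) { // null signals end of input
      lineNumber++;
      out.append(Integer.toString(lineNumber)).append(": ").append(line).append('\n');
    }
    return lineNumber;
  }

  public static void main(final String[] args) {
    if (1 != args.length) {
      System.err.println("Usage: LineNumberSketch <filename>");
      return;
    }
    // try-with-resources closes the file even if reading fails
    try (FileReader fileReader = new FileReader(args[0])) {
      number(fileReader, System.out);
      System.out.flush();
    } catch (final FileNotFoundException e) {
      System.err.println("File not found: " + args[0]);
    } catch (final IOException e) {
      System.err.println("IO error reading " + args[0] + ": " + e.getMessage());
    }
  }
}
```

FileNotFoundException is a subclass of IOException, so its catch clause has to come first.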

exercise No. 193

A partial implementation of GNU UNIX wc

Q:

In this exercise we will partly implement the (GNU) UNIX command line tool wc (word count). Prior to starting this exercise you may want to:

  • Execute wc on sample text files, e.g. a Java source file or similar:

    goik >wc BoundedIntegerStore.java
      58  198 1341 BoundedIntegerStore.java
    

    What do these three numbers 58, 198 and 1341 mean? Execute wc --help or man wc or read the HTML documentation.

  • wc may process several files in a single invocation, producing an extra final line summing up all values:

    goik >wc bibliography.xml swd1.xml
        69     83   2087 bibliography.xml
      6809  18252 248894 swd1.xml
      6878  18335 250981 total 

  • wc can be used in pipes like:

    goik >grep int BoundedIntegerStore.java | wc
         12      76     516

    The above output 12 76 516 tells us that our file BoundedIntegerStore.java has 12 lines containing the string int.

A partial implementation shall offer all features mentioned in the introduction. The following steps are a proposal for your implementation:

  1. Write a method counting the number of words within a given string. We assume words to be separated by at least one white space character (space or \t). Write some tests to ensure correct behaviour.

  2. Read input either from a list of files or from standard input depending on the number of arguments to main(String[] args):

    • If args.length == 0, read from standard input.

    • If 0 < args.length, try to interpret the arguments as filenames.

  3. Write a class TextFileStatistics able to count characters, words and lines of a single input file. Instances of this class may be initialized from a BufferedReader.

    Write corresponding tests.

  4. You may create an instance of BufferedReader from System.in via:

    new BufferedReader(new InputStreamReader(System.in))

  5. Create an executable Jar archive and execute some examples. The UNIX command cat writes a file's content to standard output. This output may be piped as input to your application as in cat filename.txt | java -jar .../wc-1.0.jar.
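
The choice between standard input and file arguments described in step 2 might look roughly like this. It is only a sketch: the class InputDispatchSketch is hypothetical, and countLines() is a stand-in for the full statistics your TextFileStatistics class is meant to compute.

```java
import java.io.BufferedReader;
import java.io.FileNotFoundException;
import java.io.FileReader;
import java.io.IOException;
import java.io.InputStreamReader;

// Hypothetical class illustrating the stdin-versus-files decision only.
class InputDispatchSketch {

  /** Placeholder for the real statistics: counts lines only. */
  static int countLines(final BufferedReader reader) throws IOException {
    int lines = 0;
    while (null != reader.readLine()) {
      lines++;
    }
    return lines;
  }

  public static void main(final String[] args) throws IOException {
    if (0 == args.length) {
      // No arguments: read whatever is piped in on standard input
      final BufferedReader stdin = new BufferedReader(new InputStreamReader(System.in));
      System.out.println(countLines(stdin));
    } else {
      // One or more arguments: treat each one as a filename
      for (final String fileName : args) {
        try (BufferedReader reader = new BufferedReader(new FileReader(fileName))) {
          System.out.println(countLines(reader) + " " + fileName);
        } catch (final FileNotFoundException e) {
          // Report the problem and carry on with the remaining files
          System.err.println("File not found: " + fileName);
        }
      }
    }
  }
}
```

Continuing with the remaining files after a missing one is a design choice of this sketch; stopping at the first error would be just as defensible.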

A:

Executing mvn package creates an executable Jar file ../target/wc-1.0.jar. We test both ways of operation:

Reading from standard input
goik >cat Testdata/input.html | java -jar target/wc-1.0.jar
  9    14    137
Passing file names as parameters
goik >java -jar target/wc-1.0.jar Testdata/*
  9    14    137  Testdata/input.html
  4     5     41  Testdata/model.css
 13    19    178  total

JUnit tests of internal functionality:

Counting words in a given string:
@Test
public void testNoWord() {
  Assert.assertEquals("Just white space", 0,
       TextFileStatistics.findNoOfWords(" \t"));
}

@Test
public void testSingleWord() {
  final String s = "We're";
  Assert.assertEquals("text='" + s + "'", 1,
       TextFileStatistics.findNoOfWords(s));
}

@Test
public void testTwoWords() {
  final String s = "We are";
  Assert.assertEquals("text='" + s + "'", 2,
       TextFileStatistics.findNoOfWords(s));
}

@Test
public void testWordsWhiteHead() {
  final String s = "\t \tBegin_space";
  Assert.assertEquals("text='" + s + "'", 1,
       TextFileStatistics.findNoOfWords(s));
}

@Test
public void testWordsWhiteTail() {
  final String s = "End_space \t ";
  Assert.assertEquals("text='" + s + "'", 1,
       TextFileStatistics.findNoOfWords(s));
}

@Test
public void testWhiteMulti() {
  final String s = "    some\t\tinterspersed   \t  spaces \t\t ";
  Assert.assertEquals("text='" + s + "'", 3,
        TextFileStatistics.findNoOfWords(s));
}
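
One way to satisfy the word counting tests above is to count transitions from white space into a word. The following is a sketch under that assumption, using a hypothetical class WordCountSketch; the actual TextFileStatistics.findNoOfWords() implementation is not shown in this text and may work differently.

```java
// Hypothetical stand-in for TextFileStatistics.findNoOfWords().
class WordCountSketch {

  /** Counts words separated by one or more spaces or tabs. */
  static int findNoOfWords(final String s) {
    int words = 0;
    boolean insideWord = false;
    for (int i = 0; i < s.length(); i++) {
      final char c = s.charAt(i);
      if (' ' == c || '\t' == c) {
        insideWord = false;   // a separator ends the current word, if any
      } else if (!insideWord) {
        insideWord = true;    // first character of a new word
        words++;
      }
    }
    return words;
  }
}
```

Leading, trailing and repeated white space need no special handling here because only the start of each word is counted.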
Analyzing test file data:
@Test
public void testTwoInputFiles() throws FileNotFoundException, IOException {

  final String model_css_filename =
    "Testdata/model.css",      //  4 lines   5  words  41 characters
      input_html_filename =
    "Testdata/input.html";     //  9 lines  14  words 137 characters
                               //_________________________________________
                               // total 13 lines  19  words 178 characters

  final TextFileStatistics
    model_css = new TextFileStatistics(
      new BufferedReader(new FileReader(model_css_filename)),
           model_css_filename),

    input_html = new TextFileStatistics(new BufferedReader(
        new FileReader(input_html_filename)), input_html_filename);

  // File Testdata/model.css
  Assert.assertEquals( 4, model_css.numLines);
  Assert.assertEquals( 5, model_css.numWords);
  Assert.assertEquals(41, model_css.numCharacters);

  // File Testdata/input.html
  Assert.assertEquals(  9, input_html.numLines);
  Assert.assertEquals( 14, input_html.numWords);
  Assert.assertEquals(137, input_html.numCharacters);

  // Grand total
  Assert.assertEquals( 13, TextFileStatistics.getTotalNumLines());
  Assert.assertEquals( 19, TextFileStatistics.getTotalNumWords());
  Assert.assertEquals(178, TextFileStatistics.getTotalNumCharacters());
}
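
The per-file counting checked by this test could be sketched as below. The class TextStatsSketch is hypothetical and omits the grand totals. Since readLine() strips line terminators, the sketch assumes every line ends with exactly one newline character, and it counts Java chars rather than bytes, so results may differ from wc for files with other line endings or non-ASCII content.

```java
import java.io.BufferedReader;
import java.io.IOException;

// Hypothetical per-input statistics; not the original TextFileStatistics.
class TextStatsSketch {

  final int numLines, numWords, numCharacters;

  TextStatsSketch(final BufferedReader reader) throws IOException {
    int lines = 0, words = 0, characters = 0;
    String line;
    while (null != (line = reader.readLine())) {
      lines++;
      characters += line.length() + 1; // assumption: one '\n' per line
      for (final String token : line.split("[ \t]+")) {
        if (!token.isEmpty()) {        // leading white space yields an empty first token
          words++;
        }
      }
    }
    numLines = lines;
    numWords = words;
    numCharacters = characters;
  }
}
```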