## Collections III, Exercises

No. 200

Q:

We want to transform address data to different destination formats. Consider the following text file data source:

"firstName","lastName","companyName","address","city", ...
"Aleshia","Tomkiewicz","Alan D Rosenburg Cpa Pc","14 Tay ...
...

This excerpt is taken from the file addresses.txt in the following Maven project:

• Maven module source code available at P/Sd1/HtmlFormatting/Simple/Exercise.

Import the above project into Eclipse. Executing de.hdm_stuttgart.mi.sd1.htmlformat.Address2text yields the following output:

"firstName","lastName","companyName","address","city", ...
++++++++++++++++++++++
Name:Tim Dummy
Company:Dummy Company
Phone:1234567, 7654321
E-Mail:@dummy@dummy.com
--------------------

...

--------------------
End of records

This result neither uses the input data supplied by addresses.txt nor does it produce HTML output yet. You have to complete the implementation by following the subsequent steps:

1. Try to understand the current project. Its classes have the following general purposes:

Address

Holding components like first name, last name, telephone numbers, email and so on of an individual address.

Address2text

The main application. This class assembles other classes, opens the address data source and starts the formatting process.

Address2textFormatter

This class formats individual address records.

AddressDataHandler

Opens the data source and creates an in-memory Java representation of the whole data set. This is necessary since the output sorting order may be altered.

AddressFormatter

This interface specifies three methods that are called during output formatting.

AddressParseError

Instances of this exception will be thrown whenever the input file contains errors. Consider the following example:

"Laquita","Hisaw,"In Communications Inc","20 Gloucester Pl #96",

In this example we have no quote after Hisaw. This should yield a parsing error.

2. The constructor Address(...) does not yet parse address records but creates constant dummy data instead. Use the parameter csvRecord to actually initialize the desired address fields firstName, lastName, ..., web. Hint: You may use the split(...) method:

... = s.split("\",\"");

This splits an input record into its address components. The first component will however start with a quotation mark like "Aleshia and the last component will have a trailing " like in http://www.lbt.co.uk". You may use the substring(...) method to get rid of them.
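The hint above can be sketched as follows. This is only a minimal illustration, not the project's actual constructor: the class name CsvRecordSplitDemo, the method splitRecord and the shortened three-field record are assumptions made for this example. Unlike the hint, the sketch removes the outer quotation marks of the whole record before splitting, which saves treating the first and last component separately.

```java
import java.util.Arrays;

// Hypothetical demo class, not part of the exercise project.
class CsvRecordSplitDemo {

    /** Splits one quoted CSV record into its fields without the surrounding quotes. */
    static String[] splitRecord(final String csvRecord) {
        // Drop the leading and the trailing quotation mark of the whole record ...
        final String inner = csvRecord.substring(1, csvRecord.length() - 1);
        // ... then split at each "," separator. The limit -1 keeps trailing empty fields.
        return inner.split("\",\"", -1);
    }

    public static void main(String[] args) {
        // Shortened record (assumption): real records contain eleven fields.
        final String record = "\"Aleshia\",\"Tomkiewicz\",\"http://www.lbt.co.uk\"";
        System.out.println(Arrays.toString(splitRecord(record)));
        // prints [Aleshia, Tomkiewicz, http://www.lbt.co.uk]
    }
}
```

Note that this sketch assumes a syntactically correct record. Input validation is a separate concern.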

3. You must exclude the header line (= first line of addresses.txt) of your data source from result generation:

"firstName","lastName","companyName","address","city","county","postal","phone1","phone2","email","web"

4. Think of all syntax rules of your input data source addresses.txt and throw appropriate AddressParseError exceptions whenever one of them is violated. Write test cases checking that parsing errors in input files are detected correctly.
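One possible set of checks is sketched below. Everything here is an assumption for illustration: the nested class ParseError is a hypothetical stand-in for the project's AddressParseError (whose constructor is not shown in this exercise), and the three rules (record starts and ends with a quote, exactly eleven fields as in the header line, no stray quote inside a field) are just one plausible reading of the file format. The method parseAll also shows one way of skipping the header line.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class, not part of the exercise project.
class RecordCheckDemo {

    /** Hypothetical stand-in for the project's AddressParseError. */
    static class ParseError extends Exception {
        ParseError(final String message) {
            super(message);
        }
    }

    /** Number of columns in the header line of addresses.txt. */
    static final int EXPECTED_FIELDS = 11;

    /** Parses a single record or throws ParseError if a syntax rule is violated. */
    static String[] parse(final String line, final int lineNumber) throws ParseError {
        if (line.length() < 2 || !line.startsWith("\"") || !line.endsWith("\"")) {
            throw new ParseError("Line " + lineNumber + ": record must start and end with a quote");
        }
        final String[] fields = line.substring(1, line.length() - 1).split("\",\"", -1);
        if (fields.length != EXPECTED_FIELDS) {
            throw new ParseError("Line " + lineNumber + ": expected " + EXPECTED_FIELDS
                    + " fields but found " + fields.length);
        }
        for (final String field : fields) {
            if (field.contains("\"")) { // a leftover quote hints at a missing separator quote
                throw new ParseError("Line " + lineNumber + ": stray quote in field '" + field + "'");
            }
        }
        return fields;
    }

    /** Parses all lines except the header line at index 0. */
    static List<String[]> parseAll(final List<String> lines) throws ParseError {
        final List<String[]> records = new ArrayList<>();
        for (int i = 1; i < lines.size(); i++) {
            records.add(parse(lines.get(i), i + 1)); // report 1-based line numbers
        }
        return records;
    }
}
```

A record with a missing quote, such as the Hisaw example above, no longer splits into eleven clean fields and is therefore rejected by one of these checks. Unit tests can feed such broken lines to parse(...) and expect the exception.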

5. The current project produces text output. In order to generate HTML you have to replace Address2textFormatter by a new class Address2htmlFormatter which implements the interface AddressFormatter as well.

You may then exchange your formatter in Address2text.main():

final AddressFormatter htmlFormatter = new Address2htmlFormatter();

6. Address fields may contain the characters <, > and &. These will interfere with generated HTML markup. There are two possible solutions:

CDATA sections (preferred):

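Wrap your output in <![CDATA[ ... ]]> sections if required. Strictly speaking CDATA sections are only specified for XML and thus for XHTML variants. Browsers typically honour them only when the document is parsed as XHTML, e.g. when opening a local file with the extension .xhtml.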

Using HTML replacement entities:

&  ->  &amp;
<  ->  &lt;
>  ->  &gt;

This requires textually replacing special characters by the above entities e.g. by means of String.replace(...).
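A minimal sketch of the entity replacement, assuming a hypothetical helper class HtmlEscapeDemo that is not part of the project. The order of the calls matters: the ampersand has to be replaced first, otherwise the ampersands of freshly inserted entities would be escaped a second time.

```java
// Hypothetical demo class, not part of the exercise project.
class HtmlEscapeDemo {

    /** Replaces the characters &, < and > by their HTML entities. */
    static String escapeHtml(final String text) {
        return text
                .replace("&", "&amp;") // must come first, see above
                .replace("<", "&lt;")
                .replace(">", "&gt;");
    }

    public static void main(String[] args) {
        System.out.println(escapeHtml("Renfrew South & Gallowhill War <b>"));
        // prints Renfrew South &amp; Gallowhill War &lt;b&gt;
    }
}
```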

7. Since you now generate HTML output, renaming your class Address2text to Address2html is a good idea. Your output might look like:

<html xmlns='http://www.w3.org/1999/xhtml'>
<body>
<table border='1'>
<colgroup style='width: 20%'/>
<colgroup style='width: 30%'/>
<colgroup style='width: 25%'/>
<colgroup style='width: 25%'/>
<tr>
<th>Name</th>
<th>Address</th>
<th>Phone</th>
<th>E-Mail</th>
</tr>
...
<tr>
<td>Graham <b>Stanwick</b></td>
<td><![CDATA[73 Hawkstone St, Renfrew South & Gallowhill War]]>,
<b>G52 4YG</b></td>
<td>01860-191930</td>
<td>gstanwick@gmail.com</td>
</tr>
<tr>
...

</tr>
</table>
</body>
</html>

As you can see, CDATA sections are only used if the embedded data contains <, > or & characters.
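Wrapping only on demand might be sketched like this. The class CdataDemo and the method cdataIfNeeded are hypothetical names chosen for this example. As an additional precaution not required by the exercise, the sketch splits any "]]>" sequence occurring in the data, since it would otherwise terminate the CDATA section prematurely.

```java
// Hypothetical demo class, not part of the exercise project.
class CdataDemo {

    /** Wraps text in a CDATA section only if it contains <, > or &. */
    static String cdataIfNeeded(final String text) {
        final boolean needed =
                text.contains("<") || text.contains(">") || text.contains("&");
        if (!needed) {
            return text; // harmless text is passed through unchanged
        }
        // Split "]]>" across two adjacent CDATA sections so it cannot end the section early.
        final String safe = text.replace("]]>", "]]]]><![CDATA[>");
        return "<![CDATA[" + safe + "]]>";
    }

    public static void main(String[] args) {
        System.out.println(cdataIfNeeded("01860-191930"));
        // prints 01860-191930
        System.out.println(cdataIfNeeded("Renfrew South & Gallowhill War"));
        // prints <![CDATA[Renfrew South & Gallowhill War]]>
    }
}
```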

8. You may direct your generated HTML output to a file rather than to the standard output System.out. This can be achieved by creating a PrintStream that writes to a file, using the PrintStream constructor that accepts a file name. Your application may thus transform the file addresses.txt into addresses.txt.xhtml, which a browser should render as a table.
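Regarding the file output of step 8, a minimal sketch could look as follows. The class FileOutputDemo, the method writeHtml and the placeholder document body are assumptions for illustration. The constructor PrintStream(String fileName) is part of the Java standard library and throws a FileNotFoundException if the file cannot be created or opened for writing.

```java
import java.io.FileNotFoundException;
import java.io.PrintStream;

// Hypothetical demo class, not part of the exercise project.
class FileOutputDemo {

    /** Writes a tiny placeholder XHTML document to the given file. */
    static void writeHtml(final String fileName) throws FileNotFoundException {
        // try-with-resources closes (and thereby flushes) the stream automatically
        try (PrintStream out = new PrintStream(fileName)) {
            out.println("<html xmlns='http://www.w3.org/1999/xhtml'>");
            out.println("<body><p>placeholder</p></body>");
            out.println("</html>");
        }
    }

    public static void main(String[] args) throws FileNotFoundException {
        writeHtml("addresses.txt.xhtml");
    }
}
```

Passing the PrintStream to the formatter instead of using System.out directly keeps the formatting code independent of the output destination.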

A:

• Maven module source code available at P/Sd1/HtmlFormatting/Simple/Solution.