Namespaces

To make a SAX parser application namespace-aware we have to activate two SAX parsing features:

xmlReader = saxParser.getXMLReader();
xmlReader.setFeature("http://xml.org/sax/features/namespaces", true);
xmlReader.setFeature("http://xml.org/sax/features/namespace-prefixes", true);

This instructs the parser to report the namespace name (URI) of each element. Namespace prefixes like xsl in <xsl:for-each> are reported as well, as part of the raw element name, and may be used by an application:

package sax;
...
public class NamespaceEventHandler extends DefaultHandler {
...
  @Override
  public void startElement(String namespaceUri, String localName,
                           String rawName, Attributes attrs) {
    // Report the raw (prefixed) name, the namespace URI and the local name.
    System.out.println("Opening Element rawName='" + rawName + "'\n"
        + "namespaceUri='" + namespaceUri + "'\n"
        + "localName='" + localName
        + "'\n--------------------------------------------");
  }
}
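
The feature settings and the handler can be combined into a small self-contained program. The following is only a sketch under stated assumptions: the class names NamespaceDemo and CollectingHandler are hypothetical, the XML is read from an in-memory string rather than from a file, and the handler collects its output in a buffer instead of printing it directly.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical demo class, not part of the original example project.
public class NamespaceDemo {

    /** Collects one line per opening element. */
    static class CollectingHandler extends DefaultHandler {
        final StringBuilder log = new StringBuilder();

        @Override
        public void startElement(String namespaceUri, String localName,
                                 String rawName, Attributes attrs) {
            log.append(rawName).append(" -> {").append(namespaceUri)
               .append("}").append(localName).append('\n');
        }
    }

    /** Parses the given XML text with both namespace features switched on. */
    static String describe(String xml) throws Exception {
        SAXParser saxParser = SAXParserFactory.newInstance().newSAXParser();
        XMLReader xmlReader = saxParser.getXMLReader();
        xmlReader.setFeature("http://xml.org/sax/features/namespaces", true);
        xmlReader.setFeature("http://xml.org/sax/features/namespace-prefixes", true);

        CollectingHandler handler = new CollectingHandler();
        xmlReader.setContentHandler(handler);
        xmlReader.parse(new InputSource(new StringReader(xml)));
        return handler.log.toString();
    }

    public static void main(String[] args) throws Exception {
        // Assumed sample input: two prefixed elements and one without a namespace.
        String xml = "<a:root xmlns:a='urn:example:a'><a:child/><plain/></a:root>";
        System.out.print(describe(xml));
    }
}
```

Each reported line pairs the raw name with the namespace URI and the local name, so an unprefixed element without a default namespace shows up with an empty URI.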

As an example we take an XSLT script:

<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet version="1.0"
  xmlns:xsl='http://www.w3.org/1999/XSL/Transform'
  xmlns:fo='http://www.w3.org/1999/XSL/Format'>

  <xsl:template match="/">
    <fo:block>A block</fo:block>
    <HTML/>
  </xsl:template>

</xsl:stylesheet>

Viewed as an XML document instance, this XSLT script contains elements belonging to two different namespaces, namely http://www.w3.org/1999/XSL/Transform and http://www.w3.org/1999/XSL/Format. The script also contains an unprefixed <HTML/> element introduced for demonstration purposes only. Since the script declares no default namespace, this element belongs to no namespace at all. The result reads:

Opening Element rawName='xsl:stylesheet'
namespaceUri='http://www.w3.org/1999/XSL/Transform'
localName='stylesheet'
--------------------------------------------
Opening Element rawName='xsl:template'
namespaceUri='http://www.w3.org/1999/XSL/Transform'
localName='template'
--------------------------------------------
Opening Element rawName='fo:block'
namespaceUri='http://www.w3.org/1999/XSL/Format'
localName='block'
--------------------------------------------
Opening Element rawName='HTML'
namespaceUri=''
localName='HTML'
--------------------------------------------

Now the parser tells us to which namespace a given element node belongs. An XSLT engine for example uses this information to distinguish two classes of elements:

  • Elements belonging to the namespace http://www.w3.org/1999/XSL/Transform like <xsl:value-of select="..."/> have to be interpreted as instructions by the processor.

  • Elements not belonging to the namespace http://www.w3.org/1999/XSL/Transform like <html/> or <fo:block> are copied as is to the output.
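
The distinction between these two classes can be sketched with a content handler that compares each element's namespace URI to the XSLT namespace. This is a minimal illustration, not how a real XSLT engine is implemented: the class name ElementClassifier and its two lists are hypothetical, and the stylesheet is parsed from an in-memory string.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper class illustrating the two classes of elements.
public class ElementClassifier extends DefaultHandler {

    static final String XSLT_NS = "http://www.w3.org/1999/XSL/Transform";

    final List<String> xsltElements = new ArrayList<>();   // to be interpreted
    final List<String> copiedElements = new ArrayList<>(); // to be copied as is

    @Override
    public void startElement(String namespaceUri, String localName,
                             String rawName, Attributes attrs) {
        // The decision depends on the namespace URI only, never on the prefix.
        if (XSLT_NS.equals(namespaceUri)) {
            xsltElements.add(localName);
        } else {
            copiedElements.add(rawName);
        }
    }

    /** Parses the given XML text namespace-aware and classifies its elements. */
    static ElementClassifier classify(String xml) throws Exception {
        XMLReader xmlReader =
            SAXParserFactory.newInstance().newSAXParser().getXMLReader();
        xmlReader.setFeature("http://xml.org/sax/features/namespaces", true);
        xmlReader.setFeature("http://xml.org/sax/features/namespace-prefixes", true);

        ElementClassifier handler = new ElementClassifier();
        xmlReader.setContentHandler(handler);
        xmlReader.parse(new InputSource(new StringReader(xml)));
        return handler;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<xsl:stylesheet version='1.0'"
            + " xmlns:xsl='http://www.w3.org/1999/XSL/Transform'"
            + " xmlns:fo='http://www.w3.org/1999/XSL/Format'>"
            + "<xsl:template match='/'><fo:block>A block</fo:block><HTML/>"
            + "</xsl:template></xsl:stylesheet>";
        ElementClassifier result = classify(xml);
        System.out.println("XSLT elements:   " + result.xsltElements);
        System.out.println("Copied elements: " + result.copiedElements);
    }
}
```

Comparing the URI rather than the prefix matters because a stylesheet author is free to bind any prefix to the XSLT namespace.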

exercise No. 65

Generating SQL INSERT statements from XML data

Q:

Consider the following schema and document instance example:

Figure 904. A sample catalog containing products and corresponding descriptions.
<xs:element name="catalog">
  <xs:complexType>
    <xs:sequence>
      <xs:element ref="product" minOccurs="0" maxOccurs="unbounded"/>
    </xs:sequence>
  </xs:complexType>
</xs:element>

<xs:element name="product">
  <xs:complexType>
    <xs:sequence>
      <xs:element name="name" type="xs:string"/>
      <xs:element name="description" type="xs:string" minOccurs="0"
                         maxOccurs="unbounded"/>
      <xs:element name="age" type="xs:int" minOccurs="0" maxOccurs="1"/>
    </xs:sequence>
    <xs:attribute name="id" type="xs:ID" use="required"/>
  </xs:complexType>
</xs:element>

<catalog ... xsi:noNamespaceSchemaLocation="catalog.xsd">
    <product id="mpt">
        <name>Monkey Picked Tea</name>
        <description>Rare wild Chinese tea</description>
        <description>Picked only by specially trained monkeys</description>
    </product>
    <product id="instantTent">
        <name>4-Person Instant Tent</name>
        <description>4-person, 1-room tent</description>
        <description>Pre-attached tent poles</description>
        <description>Exclusive WeatherTec system.</description>
        <age>15</age>
    </product>
</catalog>

Data contained in catalog instances shall be transferred to a relational database system. Implement and test a SAX application by following the steps described below:

Database schema

Create a database schema matching a product of your choice (MySQL, Oracle, ...). Your schema should map the type and integrity constraints of the given XML schema. In particular:

  • The element <age> is optional.

  • <description> elements are children of <product> elements and should thus be modeled by a 1:n relation.

  • In a catalog the order of descriptions of a given product matters. Thus your schema should allow for descriptions being ordered.

SAX Application

The order of appearance of the XML elements <product>, <name> and <age> does not permit a linear generation of suitable SQL INSERT statements by a SAX content handler, since the <age> value needed for a product's INSERT statement appears only after all of its <description> elements. Instead you will have to keep copies of local element values when implementing org.xml.sax.ContentHandler.startElement(String,String,String,org.xml.sax.Attributes) and related callback methods. The following sequence of INSERT statements corresponds to the XML data contained in Figure 904, “A sample catalog containing products and corresponding descriptions.” You may use these statements as a blueprint for the output to be generated by your SAX application:

INSERT INTO Product VALUES ('mpt', 'Monkey Picked Tea', NULL);
INSERT INTO Description VALUES('mpt', 0, 'Rare wild Chinese tea');
INSERT INTO Description VALUES('mpt', 1,
                                'Picked only by specially trained monkeys');

INSERT INTO Product VALUES ('instantTent', '4-Person Instant Tent', 15);
INSERT INTO Description VALUES('instantTent', 0, '4-person, 1-room tent');
INSERT INTO Description VALUES('instantTent', 1, 'Pre-attached tent poles');
INSERT INTO Description VALUES('instantTent', 2, 'Exclusive WeatherTec system.');

Provide a suitable JUnit test.

A:

Running this project and executing its tests requires the following Maven project dependency to be installed (e.g. locally via mvn install):

Some remarks are in order here:

  1. The SQL database schema might read:

    CREATE TABLE Product (
       id CHAR(20) NOT NULL PRIMARY KEY 
      ,name VARCHAR(255) NOT NULL
      ,age SMALLINT 
    );
    
    CREATE TABLE Description (
       product CHAR(20) NOT NULL REFERENCES Product 
      ,orderIndex int NOT NULL   -- preserving the order of descriptions
                                   -- belonging to a given product
      ,text VARCHAR(255) NOT NULL
      ,UNIQUE(product, orderIndex) 
    );

    The primary key constraint enforces the uniqueness of <product id='xyz'> values.

    The nullable age column reflects the <age> element being optional.

    <description> elements being children of <product> are implemented by a foreign key referencing their identifying owner, thus forming weak entities.

    The attribute orderIndex allows descriptions to be sorted, thus preserving the original order of appearance of the <description> elements.

    The UNIQUE constraint guarantees that orderIndex values are unique within the set of descriptions belonging to the same product.

  2. The output generated from the given sample XML input file should be similar to the content of the supplied reference file products.reference.xml:

    INSERT INTO Product (id, name) VALUES ('mpt', 'Monkey Picked Tea');
    INSERT INTO Description VALUES('mpt', 0, 'Rare wild Chinese tea');
    INSERT INTO Description VALUES('mpt', 1,
                                   'Picked only by specially trained monkeys');
    -- end of current product entry --
    
    INSERT INTO Product VALUES ('instantTent', '4-Person Instant Tent', 15);
    INSERT INTO Description VALUES('instantTent', 0, '4-person, 1-room tent');
    INSERT INTO Description VALUES('instantTent', 1, 'Pre-attached tent poles');
    INSERT INTO Description VALUES('instantTent', 2, 'Exclusive WeatherTec system.');
    -- end of current product entry --

    So a JUnit test may just execute the XML to SQL converter and then compare the actual output to the above reference file.
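
The buffering strategy asked for in the question might be sketched as follows. This is not the project's actual solution: the class name CatalogToSql, the helper methods quote and convert, and the choice to emit an explicit NULL for a missing age (as in the question's blueprint, rather than naming the columns as the reference file does) are all assumptions made for illustration.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical converter: buffers the values of one product and emits
// its INSERT statements when the closing </product> tag is reached.
public class CatalogToSql extends DefaultHandler {

    private final StringBuilder sql = new StringBuilder();
    private final StringBuilder text = new StringBuilder(); // current character data
    private final List<String> descriptions = new ArrayList<>();
    private String id;
    private String name;
    private Integer age; // null when <age> is absent

    @Override
    public void startElement(String uri, String localName, String qName,
                             Attributes attrs) {
        text.setLength(0); // discard whitespace between elements
        if ("product".equals(qName)) {
            id = attrs.getValue("id");
            name = null;
            age = null;
            descriptions.clear();
        }
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        text.append(ch, start, length); // may be called several times per element
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        switch (qName) {
            case "name":        name = text.toString(); break;
            case "description": descriptions.add(text.toString()); break;
            case "age":         age = Integer.valueOf(text.toString().trim()); break;
            case "product":     emitProduct(); break;
            default:            break;
        }
    }

    private void emitProduct() {
        sql.append("INSERT INTO Product VALUES (").append(quote(id)).append(", ")
           .append(quote(name)).append(", ")
           .append(age == null ? "NULL" : age.toString()).append(");\n");
        for (int i = 0; i < descriptions.size(); i++) {
            sql.append("INSERT INTO Description VALUES(").append(quote(id))
               .append(", ").append(i).append(", ")
               .append(quote(descriptions.get(i))).append(");\n");
        }
    }

    /** Wraps a value in single quotes, doubling embedded quotes. */
    static String quote(String s) {
        return "'" + s.replace("'", "''") + "'";
    }

    /** Converts catalog XML text into a sequence of INSERT statements. */
    static String convert(String xml) throws Exception {
        CatalogToSql handler = new CatalogToSql();
        SAXParserFactory.newInstance().newSAXParser()
            .parse(new InputSource(new StringReader(xml)), handler);
        return handler.sql.toString();
    }
}
```

Because convert returns the generated statements as a string, a test can compare that string directly against the expected reference content.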

exercise No. 66

Counting element names grouped by namespaces

Q:

We want to extend the SAX examples counting elements of arbitrary document instances. Consider the following XSL sample document containing XHTML:

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema" ❶
    xmlns:h="http://www.w3.org/1999/xhtml" ❷
    exclude-result-prefixes="xs" version="2.0">

    <xsl:template match="/">
        <h:html>
            <h:head>
                <h:title></h:title>
            </h:head>
            <h:body>
                <h:h1>A heading</h:h1>
                <h:p>A paragraph</h:p>
                <h:h1>Yet another heading</h:h1>
                <xsl:apply-templates/>
            </h:body>
        </h:html>
    </xsl:template>

    <xsl:template match="*">
        <xsl:message>
            <xsl:text>No template defined for element '</xsl:text>
            <xsl:value-of select="name(.)"/>
            <xsl:text>'</xsl:text>
        </xsl:message>
    </xsl:template>

</xsl:stylesheet>

In addition to the XSLT namespace itself, this XSL stylesheet declares two further namespaces ❶ and ❷.

Implement a SAX application that groups the elements of arbitrary XML documents by namespace and reports each element's frequency of occurrence. The intended output for the previous XSL example shall look like:

Namespace 'http://www.w3.org/1999/xhtml' contains:
<head> (1 occurrence)
<p> (1 occurrence)
<h1> (2 occurrences)
<html> (1 occurrence)
<title> (1 occurrence)
<body> (1 occurrence)

Namespace 'http://www.w3.org/1999/XSL/Transform' contains:
<stylesheet> (1 occurrence)
<template> (2 occurrences)
<value-of> (1 occurrence)
<apply-templates> (1 occurrence)
<text> (2 occurrences)
<message> (1 occurrence)

Hint: Counting frequencies and grouping by namespaces may be achieved by using standard Java container implementations of java.util.Map. You may for example map each element name to its frequency and group these maps by their corresponding namespaces. Thus nested maps are required.

A:

Running this project and executing its tests requires the following Maven project dependency to be installed (e.g. locally via mvn install):

The above solution contains both a running application and an (incomplete) JUnit test.
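
The nested map approach suggested in the hint might look like the following sketch. It is an illustration only and not the project's solution: the class name NamespaceElementCounter and its methods are hypothetical, and TreeMap is chosen here to obtain a stable alphabetical order, so the listing order differs from the sample output shown in the question.

```java
import java.io.StringReader;
import java.util.Map;
import java.util.TreeMap;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical handler counting elements grouped by namespace.
public class NamespaceElementCounter extends DefaultHandler {

    // namespace URI -> (local element name -> number of occurrences)
    final Map<String, Map<String, Integer>> counts = new TreeMap<>();

    @Override
    public void startElement(String namespaceUri, String localName,
                             String rawName, Attributes attrs) {
        counts.computeIfAbsent(namespaceUri, uri -> new TreeMap<>())
              .merge(localName, 1, Integer::sum);
    }

    /** Formats the collected counts in the style requested by the exercise. */
    String report() {
        StringBuilder out = new StringBuilder();
        counts.forEach((uri, elements) -> {
            out.append("Namespace '").append(uri).append("' contains:\n");
            elements.forEach((name, n) ->
                out.append('<').append(name).append("> (").append(n)
                   .append(n == 1 ? " occurrence" : " occurrences").append(")\n"));
            out.append('\n');
        });
        return out.toString();
    }

    /** Parses the given XML text namespace-aware and counts its elements. */
    static NamespaceElementCounter count(String xml) throws Exception {
        XMLReader xmlReader =
            SAXParserFactory.newInstance().newSAXParser().getXMLReader();
        xmlReader.setFeature("http://xml.org/sax/features/namespaces", true);
        xmlReader.setFeature("http://xml.org/sax/features/namespace-prefixes", true);

        NamespaceElementCounter handler = new NamespaceElementCounter();
        xmlReader.setContentHandler(handler);
        xmlReader.parse(new InputSource(new StringReader(xml)));
        return handler;
    }

    public static void main(String[] args) throws Exception {
        // Assumed sample input with two namespaces.
        String xml = "<x:a xmlns:x='urn:x' xmlns:y='urn:y'><y:b/><y:b/><x:c/></x:a>";
        System.out.print(count(xml).report());
    }
}
```

Keying the outer map by namespace URI rather than by prefix keeps elements together even when a document binds several prefixes to the same namespace.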