Language independent specification

XML documents lend themselves to automated content processing. The SAX API gives Java applications event-based access to XML documents. There are, however, situations where SAX is not appropriate:

Figure 698. SAX deficiencies
  • The event-based model lacks context: the application has to provide its own code for assembling content.

  • No XPath support.

  • No subtree movement within or between documents.

  • In short: no in-memory document representation.

    Consequence: no tree navigation.
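
The first point can be illustrated by a minimal sketch. A SAX handler only sees isolated events, so even a simple question like "what is the path of the current element?" requires the application to keep its own stack. The class name `SaxPathCollector` and the sample document are hypothetical and serve illustration only:

```java
import java.io.StringReader;
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

public class SaxPathCollector extends DefaultHandler {

    // SAX supplies no parent information: we track the open elements ourselves.
    private final Deque<String> openElements = new ArrayDeque<>();
    private final List<String> paths = new ArrayList<>();

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        openElements.addLast(qName);
        paths.add("/" + String.join("/", openElements));
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        openElements.removeLast();
    }

    /** Returns the path of every element in document order. */
    public static List<String> collectPaths(String xml) {
        SaxPathCollector handler = new SaxPathCollector();
        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
        } catch (Exception e) {
            throw new IllegalStateException("Parsing failed", e);
        }
        return handler.paths;
    }

    public static void main(String[] args) {
        collectPaths("<memo><from>Joe</from><to>Eve</to></memo>")
                .forEach(System.out::println);
    }
}
```

With an in-memory tree the same information is available by following parent references, without any bookkeeping in application code.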


Figure 699. DOM: Language independence
  • DOM objects and operations are defined using the CORBA 2.2 Interface Definition Language (IDL).

  • Each programming language gets its own binding, typically a set of interfaces. Examples:

    • A set of Java interfaces.

    • A set of C++ pure virtual classes.


Figure 700. DOM: Vendor independence

Figure 701. DOM Node CORBA 2.2 IDL
interface Node {
  const unsigned short ELEMENT_NODE   = 1; // NodeType
  const unsigned short ATTRIBUTE_NODE = 2;
  const unsigned short TEXT_NODE      = 3;
   ...
  readonly attribute DOMString      nodeName;
  attribute DOMString nodeValue;

  readonly attribute unsigned short nodeType;
  readonly attribute Node           parentNode;
   ...
  readonly attribute NodeList       childNodes;
  readonly attribute Node           firstChild;
   ...
  Node insertBefore(in Node newChild, in Node refChild)
                                  raises(DOMException);
   ...

Figure 702. Defining a language binding
  • Use the target language's constructs to mirror the CORBA 2.2 IDL specification as closely as possible.

  • This is difficult for non-object-oriented languages.


Figure 703. org.w3c.dom.Node Java binding.
package org.w3c.dom;

public interface Node {            // Node Types
   public static final short ELEMENT_NODE   = 1;
   public static final short ATTRIBUTE_NODE = 2;
   public static final short TEXT_NODE      = 3;
      ...
   public String   getNodeName();
   public String   getNodeValue() throws DOMException;
   public void     setNodeValue(String nodeValue) throws DOMException;
   public short    getNodeType();
   public Node     getParentNode();
   public NodeList getChildNodes();
   public Node     getFirstChild();
   ...
   public Node     insertBefore(Node newChild, Node refChild)
                                          throws DOMException;
   ...
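
As a brief aside, the following sketch shows how the insertBefore() method from the binding above might be used to move a subtree within a document. The class name `InsertBeforeDemo`, the helper method and the sample document are hypothetical. The sketch relies on the DOM rule that inserting a node already present in the tree removes it from its old position first:

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

public class InsertBeforeDemo {

    /** Moves the root element's last child to the front and returns the resulting child names. */
    public static List<String> moveLastChildToFront(String xml) {
        Document doc;
        try {
            doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException("Parsing failed", e);
        }
        Node root = doc.getDocumentElement();
        Node last = root.getLastChild();
        if (last != null) {
            // The node is already part of the tree, so this is a move rather than a copy.
            root.insertBefore(last, root.getFirstChild());
        }
        List<String> names = new ArrayList<>();
        for (Node n = root.getFirstChild(); n != null; n = n.getNextSibling()) {
            names.add(n.getNodeName());
        }
        return names;
    }

    public static void main(String[] args) {
        System.out.println(moveLastChildToFront("<list><a/><b/><c/></list>"));
    }
}
```
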

We take org.w3c.dom.Node.getChildNodes() as an example:

Figure 704. A context node's children
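
A minimal sketch of calling getChildNodes() on a context node, here the document's root element. The class name `ChildNodesDemo` and the sample document are hypothetical. Note that only direct children are returned and that text nodes count as children as well:

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class ChildNodesDemo {

    /** Parses the XML and returns the node names of the root element's direct children. */
    public static List<String> childNames(String xml) {
        Document doc;
        try {
            doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException("Parsing failed", e);
        }
        Node context = doc.getDocumentElement();    // our context node
        NodeList children = context.getChildNodes();

        List<String> names = new ArrayList<>();
        for (int i = 0; i < children.getLength(); i++) {
            names.add(children.item(i).getNodeName());
        }
        return names;
    }

    public static void main(String[] args) {
        // <b> is a grandchild of <memo> and thus does not show up in the result.
        System.out.println(childNames("<memo><from>Joe</from>and<to><b>Eve</b></to></memo>"));
    }
}
```
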

Figure 705. org.w3c.dom.Node subtypes
  • Element

  • Text

  • Comment

  • Processing instruction: <?xml-stylesheet type="text/xsl" href="style.xsl"?>.

  • Entity

  • ...
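
The subtypes above can be told apart at runtime by means of getNodeType(). The following is a minimal sketch covering just four node types. The class name `NodeTypeDemo`, its methods and the sample document are hypothetical:

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.w3c.dom.ProcessingInstruction;
import org.xml.sax.InputSource;

public class NodeTypeDemo {

    /** Describes a node depending on its subtype. */
    public static String describe(Node node) {
        switch (node.getNodeType()) {
            case Node.ELEMENT_NODE:
                return "Element <" + node.getNodeName() + ">";
            case Node.TEXT_NODE:
                return "Text '" + node.getNodeValue() + "'";
            case Node.COMMENT_NODE:
                return "Comment '" + node.getNodeValue() + "'";
            case Node.PROCESSING_INSTRUCTION_NODE:
                return "Processing instruction " + ((ProcessingInstruction) node).getTarget();
            default:
                return "Other node type " + node.getNodeType();
        }
    }

    // Depth-first traversal collecting one description per node.
    private static void walk(Node node, List<String> out) {
        out.add(describe(node));
        NodeList children = node.getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            walk(children.item(i), out);
        }
    }

    /** Parses the XML and describes all nodes below the document node. */
    public static List<String> describeAll(String xml) {
        Document doc;
        try {
            doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException("Parsing failed", e);
        }
        List<String> out = new ArrayList<>();
        NodeList topLevel = doc.getChildNodes();
        for (int i = 0; i < topLevel.getLength(); i++) {
            walk(topLevel.item(i), out);
        }
        return out;
    }

    public static void main(String[] args) {
        describeAll("<?xml-stylesheet type=\"text/xsl\" href=\"style.xsl\"?>"
                + "<memo><!-- draft -->Hello</memo>").forEach(System.out::println);
    }
}
```
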


Figure 706. DOM Java binding inheritance interface hierarchy

Current Java distributions include a DOM implementation along with parsers, an XPath engine etc.
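
A minimal sketch using only classes shipped with the JDK: the document gets parsed into a DOM tree and is then queried by an XPath expression, which SAX does not offer. The class name `JdkXPathDemo`, the helper method and the sample document are hypothetical:

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class JdkXPathDemo {

    /** Returns the text content of all nodes selected by the given XPath expression. */
    public static List<String> select(String xml, String expression) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            NodeList hits = (NodeList) XPathFactory.newInstance().newXPath()
                    .evaluate(expression, doc, XPathConstants.NODESET);

            List<String> result = new ArrayList<>();
            for (int i = 0; i < hits.getLength(); i++) {
                result.add(hits.item(i).getTextContent());
            }
            return result;
        } catch (Exception e) {
            throw new IllegalStateException("Parsing or XPath evaluation failed", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(select("<memo><to>Eve</to><to>Bob</to></memo>", "/memo/to"));
    }
}
```
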

The DOM specification defines a (still growing) set of modules. An implementation need not implement all of them:

Figure 707. DOM modules.
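
An application may ask a DOM implementation which modules it supports by calling DOMImplementation.hasFeature(). The following is a minimal sketch. The class name `DomFeatureCheck` is hypothetical, and the feature list is just a selection of module names. Results depend on the implementation at hand, so no output is shown:

```java
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.DOMImplementation;

public class DomFeatureCheck {

    /** Asks the default DOM implementation whether it supports a module in the given version. */
    public static boolean supports(String feature, String version) {
        try {
            DOMImplementation impl = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder().getDOMImplementation();
            return impl.hasFeature(feature, version);
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException("No DOM implementation available", e);
        }
    }

    public static void main(String[] args) {
        // A selection of module names being defined by the DOM specification.
        String[] features = {"Core", "XML", "Events", "Traversal", "Range"};
        for (String feature : features) {
            System.out.println(feature + " 2.0: " + supports(feature, "2.0"));
        }
    }
}
```
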

Figure 708. JDOM vs. DOM: Advantages

Figure 709. JDOM vs. DOM: Disadvantages
  • Not part of the standard.

  • May lack advanced features.

  • Smaller user community, less mature.

  • Potential incompatibilities with third-party DOM frameworks.