Splitting documents into chunks

Sometimes we want to generate multiple output documents from a single XML source. For example, transforming a book of 200 printed pages into a single online HTML page is usually a bad idea. Instead we may write each chapter to a separate HTML file and create navigation links between them.

We consider a memo document instance. We want to generate one text file for each memo recipient containing just the recipient's name using the XSL element <xsl:result-document>:

<xsl:template match="/memo">
  <xsl:apply-templates select="to"/>
</xsl:template>

<xsl:template match="to">
  <xsl:result-document
                  href="file_{position()}.txt"
                  method="text">
    <xsl:value-of select="."/>
  </xsl:result-document>
</xsl:template>

<xsl:result-document>: The output from all generating XSL directives inside this element is redirected from standard output to another output channel.

href: The output is written to a file named file_i.txt, where the decimal number i ranges from 1 up to the number of recipients.

method: The method attribute overrides any value given in the <xsl:output> element. We may also redefine other attributes from <xsl:output> such as doctype-public, doctype-system and the generated file's encoding.

Element content: All output generated in this region is redirected to the channel specified by the href attribute.
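To make the attribute overriding concrete, here is a minimal sketch of a complete style sheet. It assumes an XSLT 2.0 processor, since <xsl:result-document> is not available in XSLT 1.0. The <xsl:output> settings and the encoding attribute are illustrative assumptions and are not part of the listing above.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Assumed default serialization for the principal output -->
  <xsl:output method="xml" indent="yes"/>

  <xsl:template match="/memo">
    <xsl:apply-templates select="to"/>
  </xsl:template>

  <xsl:template match="to">
    <!-- method and encoding replace the xsl:output values
         for this result document only -->
    <xsl:result-document href="file_{position()}.txt"
                         method="text"
                         encoding="UTF-8">
      <xsl:value-of select="."/>
    </xsl:result-document>
  </xsl:template>

</xsl:stylesheet>
```

Applied to a hypothetical memo with the two recipients <to>Adam Hacker</to> and <to>Eve Intruder</to>, this should write Adam Hacker to file_1.txt and Eve Intruder to file_2.txt.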

exercise No. 58

Splitting book into chapter files

Q:

Extend your solution of “Extending the memo style sheet by mixed content and itemized lists” by writing each <chapter>'s content to a separate XHTML file. In addition, create a file index.html containing references to the corresponding <chapter> documents. For a document instance with two chapters, the overall navigation structure is illustrated by Figure 979, “A <book> document with two chapters”.

Implementing the <link> tag may cause a problem: an internal link may reference a <para>. You then need to identify the <chapter> node containing this <para>. This can be done using a suitable XPath axis.

A:

The full source code of the solution is available at (Online HTML version) ... book2chunks.1.xsl. First we generate the table of contents file index.html and one output file per chapter:

<xsl:template match="/">
  <xsl:result-document href="index.html">
    <xsl:apply-templates select="book"/>
  </xsl:result-document>

  <xsl:for-each select="book/chapter">
    <xsl:result-document href="{generate-id(.)}.html">
      <xsl:apply-templates select="."/>
    </xsl:result-document>
  </xsl:for-each>
</xsl:template>

<xsl:template match="book">
  <html>
    <head><title><xsl:value-of select="title"/></title></head>
    <body>
      <h1><xsl:value-of select="title"/></h1>
      <h2>Table of contents</h2>
      <ul>
        <xsl:for-each select="chapter">
          <li><a href="{generate-id(.)}.html"><xsl:value-of select="title"/></a></li>
        </xsl:for-each>
      </ul>
    </body>
  </html>
</xsl:template>
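The template applied to each <chapter> inside its <xsl:result-document> is not shown here. A minimal sketch of what it might look like follows; the heading level, the link back to index.html and the use of the XPath 2.0 `except` operator are assumptions and are not taken from book2chunks.1.xsl.

```xml
<xsl:template match="chapter">
  <html>
    <head><title><xsl:value-of select="title"/></title></head>
    <body>
      <!-- The id gives links that target the chapter itself an anchor -->
      <h1 id="{generate-id(.)}"><xsl:value-of select="title"/></h1>

      <!-- Process all child elements apart from the title handled above -->
      <xsl:apply-templates select="* except title"/>

      <!-- Navigation back to the table of contents -->
      <p><a href="index.html">Table of contents</a></p>
    </body>
  </html>
</xsl:template>
```

Because the table of contents and the chapter file names both call generate-id() on the same <chapter> node within a single transformation run, the generated links and file names agree.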

The <link linkend="..."> may reference either a <chapter> or a <para>. In the latter case we need to step up from the paragraph to its enclosing chapter node. The ancestor-or-self axis covers both cases:

<xsl:template match="link">
  <xsl:variable name="reftargetNode" select="id(@linkend)"/>
  <xsl:variable name="reftargetParentChapter"
    select="$reftargetNode/ancestor-or-self::chapter"/>

  <a href="{generate-id($reftargetParentChapter)}.html#{
    generate-id($reftargetNode)}">
    <xsl:value-of select="."/>
  </a>
</xsl:template>

This is consistent since every <p> node in the generated XHTML receives a unique id value, regardless of whether the originating <para> node has one.
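A template along the following lines would assign these id values. It is a sketch under the assumption that the full solution maps each <para> to a <p>; the actual rule in book2chunks.1.xsl may differ.

```xml
<xsl:template match="para">
  <!-- generate-id() returns the same string for the same node throughout
       one transformation, so this id matches the fragment identifier
       produced by the link template -->
  <p id="{generate-id(.)}">
    <xsl:apply-templates/>
  </p>
</xsl:template>
```

Using generate-id() rather than the source document's own id attribute means that paragraphs without an id are handled uniformly.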

Figure 979. A <book> document with two chapters