20. Site maps

A site map tells xhtml2to1 how to link the sections of the user's document, which are split across several XML documents, into one coherently presented document in the Web browser.

The site map contains the information needed to automatically:

  • create links in the navigation menu to other parts of the document
  • resolve labels for cross references
  • number each section of the document in a hierarchical sequence
  • create indices to marked terms occurring in the document

A site map is not necessary for processing single-file documents, but is required for multiple-file documents.

20.1. How to write site maps

In this section, we explain how to use site maps in xhtml2to1.

20.1.1. Specifying navigation menus

We want xhtml2to1 to create navigation menus for us. Since each section of the document is a separate XML document, the sections need to be explicitly linked together. We tell xhtml2to1 what goes in the navigation menu using XHTML 2.0’s nl element (meaning navigation list).

Each entry in the navigation list is introduced by the familiar li element. The nl navigation lists may be nested, to create hierarchical menus[1].
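As a rough illustration of such a nested navigation list, the fragment below follows the nl content model from the XHTML 2.0 working drafts (a label followed by li entries, with href allowed on any element). The file names and titles are hypothetical, the namespace URI is the one used in the later drafts, and the exact form that xhtml2to1 expects may differ.

```xml
<!-- Sketch only: file names and titles are made up for illustration. -->
<nl xmlns="http://www.w3.org/2002/06/xhtml2/">
  <label>Contents</label>
  <li href="numbering.xml">Numbering</li>
  <li>
    <!-- A nested nl produces a sub-menu under this entry. -->
    <nl>
      <label>Site maps</label>
      <li href="site-maps-writing.xml">How to write site maps</li>
      <li href="site-maps-design.xml">Design</li>
    </nl>
  </li>
  <li href="math.xml">Math formulas</li>
</nl>
```

Each li with an href becomes one link in the menu; an li that contains another nl opens a deeper level of the hierarchy.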

20.2. Design

In very general terms, a site map is an index of all the relevant documents of a site and all the information needed for navigating through those documents. The information is needed to provide the navigation menu for the individual pages, to provide section numbering and to provide cross-reference labels for links between two documents.

There are many possible designs for what a site map should be, and how the information inside it should be organized. For example, one could use a content management system, backed by a database, to keep track of the site map. In xhtml2to1, the site map uses a fairly simple design that can be efficiently implemented with XSLT.

We insist on a simple design so that a potential writer faces as little resistance as possible when writing technical documentation. This necessarily involves trade-offs: certain advanced modes of navigation cannot be expressed in the current design. If those features are desired, then the only recourse is to invent a new kind of site map scheme; fortunately, xhtml2to1 is modular enough to allow this. We will have more to say about this topic later.

20.2.1. Structure of the site map

Formally, a site map is a hierarchical list (i.e. a tree), where each node represents a sub-section of the document. At each node, we record:

  1. The XML file that the sub-section appears in, if the sub-section is the root of that XML file.
  2. The one-to-one mapping of the URI for the XML file to the URI for the generated HTML file.
  3. The cross-reference label for the sub-section.
  4. The title of the sub-section.
  5. The numbering class of the sub-section.

The hierarchical list is encoded using nested xs:site-item elements corresponding to each node of the tree.

Each piece of information listed above is entered inside the appropriate one of the following wrapper elements, which are children of the xs:site-item they apply to.

  1. xs:location
  2. xs:label
  3. xs:title
  4. xs:number
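To make the structure concrete, here is a minimal sketch of what a small site map might look like. Only the element names xs:site-item, xs:location, xs:label, xs:title and xs:number come from the description above. The namespace URI, the src and href attributes on xs:location, the numbering class names, and all file names and labels are assumptions made for illustration; they are not taken from the actual xhtml2to1 schema.

```xml
<!-- Hypothetical sketch. The namespace URI, the attributes of
     xs:location, and the values of xs:number are assumptions. -->
<xs:site-item xmlns:xs="urn:example:site-map">
  <!-- Assumed form: source XML file mapped to generated HTML file. -->
  <xs:location src="manual.xml" href="manual.html"/>
  <xs:label>manual</xs:label>
  <xs:title>User manual</xs:title>
  <xs:number>book</xs:number>

  <xs:site-item>
    <xs:location src="site-maps.xml" href="site-maps.html"/>
    <xs:label>site-maps</xs:label>
    <xs:title>Site maps</xs:title>
    <xs:number>chapter</xs:number>

    <!-- A sub-section that is not the root of its own XML file,
         so (in this sketch) it carries no xs:location. -->
    <xs:site-item>
      <xs:label>site-maps-design</xs:label>
      <xs:title>Design</xs:title>
      <xs:number>section</xs:number>
    </xs:site-item>
  </xs:site-item>
</xs:site-item>
```

The nesting of xs:site-item elements mirrors the tree: the position of a node among its siblings and ancestors is what would drive hierarchical numbering, while the label gives other documents a stable name to cross-reference.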

In its early stages, xhtml2to1 encoded the site map directly as XHTML 2.0 navigation lists (x2:nl and x2:li), but the author found that this complicated the site map code enormously, so that approach was abandoned.

20.3. Incremental builds

Despite the colorful criticism we made of large monolithic DocBook documents, we should not be too optimistic about how far incremental building of large documents can be pushed. If a change is made to a sub-document that affects all the other documents, such as changing a cross-reference label or adding a new entry to the navigation menu, then all the sub-documents may still have to be regenerated.

But we can still obtain some benefits:

  1. Even if all the sub-documents have to be processed again at a certain stage, at least they are processed sequentially in small chunks. It is not necessary to load a five-megabyte XML document tree into memory all at once.
  2. Localized changes do not force all sub-documents to be processed again.
  3. To save on build time during development, the writer may allow some of the output files to be temporarily out-of-date (which only affects a small number of items on the navigation menu, and a few cross-reference labels). Only at the final stage does the writer do a full build of the documentation.
  4. If the HTML output is dynamically generated on either the server side or the client side, then the sub-documents do not need to be generated all at once. For writing a book-size document, this saves a lot of time, as usually one works on only a few chapters at a time.


  [1] Hierarchical menus are not always the best way to organize a document. But they are basic and familiar.

Formatted using xhtml2to1 by Steve Cheng.