§ 2.4 Site processing

A more complicated situation combines several TeX sources into a single interlinked site consisting of multiple pages and a composite index and bibliography.

Conversion

First, all TeX sources must be converted to XML using latexml. Since every targetable element across the files to be combined must have a unique identifier, it is useful to prefix the identifiers in each file with a value unique to that file. The latexml option --documentid=id provides this.
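As a minimal sketch of the option just described, the following prints one latexml command per source, each with its own document id. The helper name convert_cmd, the file names, and the ids are illustrative assumptions; the commands are printed rather than executed so that the sketch can be inspected without latexml installed.

```shell
#!/bin/sh
# Sketch: build one latexml command per source, each with a distinct id.
# convert_cmd is a hypothetical helper; file names and ids are assumptions.
convert_cmd() {
  # $1 = source base name, $2 = document id used as the identifier prefix
  printf 'latexml --documentid=%s --dest=%s.xml %s\n' "$2" "$1" "$1"
}

convert_cmd main main
convert_cmd A main.A
convert_cmd B main.B
```

Piping the printed lines to sh would run them, which requires latexml on the PATH.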

Scanning

Second, all XML files must be split and scanned using the command

latexmlpost --prescan --dbfile=DB --dest=i.html i

where DB names a file in which to store the scanned data and i stands for each document in turn. Other conversions, including writing the output file, are skipped in this prescanning step.

Pagination

Finally, all XML files are cross-referenced and converted into the final format using the command

latexmlpost --noscan --dbfile=DB --dest=i.html i

which skips the unnecessary scanning step.

For example, consider a set of nominally stand-alone LaTeX documents: main (with title page, \tableofcontents, etc.), A (with a chapter), Aa (with a section), B (with a chapter), … and bib (with a \bibliography). Assume that the documents use \lxDocumentID from \usepackage{latexml} to declare the ids main, main.A, main.A.a, main.B, … bib, respectively. You will also have to arrange for the relevant counters to be initialized to suitable values, if needed.

Now, process the documents with the following commands:

# Conversion
latexml --dest=main.xml main.tex
latexml --dest=A.xml A
latexml --dest=Aa.xml Aa
latexml --dest=B.xml B
...
latexml --dest=bib.xml bib
# Scan
latexmlpost --prescan --dbfile=my.db --dest=/site/main.html main
latexmlpost --prescan --dbfile=my.db --dest=/site/A.html A
latexmlpost --prescan --dbfile=my.db --dest=/site/Aa.html Aa
latexmlpost --prescan --dbfile=my.db --dest=/site/B.html B
...
latexmlpost --prescan --dbfile=my.db --dest=/site/bib.html bib
# Pagination
latexmlpost --noscan --dbfile=my.db --dest=/site/main.html main
latexmlpost --noscan --dbfile=my.db --dest=/site/A.html A
latexmlpost --noscan --dbfile=my.db --dest=/site/Aa.html Aa
latexmlpost --noscan --dbfile=my.db --dest=/site/B.html B
...
latexmlpost --noscan --dbfile=my.db --dest=/site/bib.html bib

This will result in a site built at /site/, with the following implied structure:

main.html
  A.html
    Aa.html
  B.html
    ...
  bib.html
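
The three passes lend themselves to a loop over the document names. The following is a sketch only: the function name site_cmds and the variables DOCS, DB and SITE are hypothetical, their values are taken from the example, and the elided documents are not listed. It prints the commands instead of running them. The point it illustrates is the ordering: every document is prescanned into the shared database before any document goes through the final pass, so that cross-references can be resolved against the complete data.

```shell
#!/bin/sh
# Sketch of the three passes as loops. DOCS, DB and SITE are assumed values
# based on the example; documents elided in the example are not included.
DOCS="main A Aa B bib"
DB=my.db
SITE=/site

site_cmds() {
  # Pass 1: convert every source to XML.
  for d in $DOCS; do
    echo "latexml --dest=$d.xml $d"
  done
  # Pass 2: prescan every XML file into the shared database.
  for d in $DOCS; do
    echo "latexmlpost --prescan --dbfile=$DB --dest=$SITE/$d.html $d"
  done
  # Pass 3: cross-reference and write the pages, skipping the scan.
  for d in $DOCS; do
    echo "latexmlpost --noscan --dbfile=$DB --dest=$SITE/$d.html $d"
  done
}

site_cmds
```

Running the printed commands (for instance with site_cmds | sh) requires latexml and latexmlpost to be installed and the sources to be in the current directory.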