Chapter 1 Introduction

For many, LaTeX is the preferred format for document authoring, particularly for documents with significant mathematical content and where quality typesetting is desired. On the other hand, content-oriented xml is an extremely useful representation for documents, allowing them to be used, and reused, for a variety of purposes, not least presentation on the Web. Yet the style and intent of LaTeX markup, as compared to xml markup, not to mention its programmability, present difficulties in converting documents from the former format to the latter. Perhaps ironically, these difficulties can be particularly large for mathematical material, where there is a tendency for the markup to focus on appearance rather than meaning.

The choices of LaTeX for authoring and xml for delivery were natural and uncontroversial for the Digital Library of Mathematical Functions (DLMF). Faced with the need to perform this conversion and the lack of suitable tools for it, the DLMF project developed its own tool, LaTeXML, for this purpose.

Design Goals

The idealistic goals are:

  • Faithful emulation of TeX’s behaviour;

  • Easily extensible;

  • Lossless, preserving both semantic and presentation cues;

  • Use an abstract LaTeX-like, extensible, document type;

  • Infer the semantics of mathematical content
    (Good Presentation MathML, eventually Content MathML and OpenMath).

As these goals are not entirely practical, and even somewhat contradictory, each is implicitly qualified by “as much as possible”. Completely mimicking TeX’s, and LaTeX’s, behaviour would seem to require the sneakiest modifications to TeX itself; redefining LaTeX’s internals does not really guarantee compatibility. “Ease of use” is, of course, in the eye of the beholder; this manual is an attempt to make it easier! More significantly, few documents are likely to have completely unambiguous mathematics markup; human understanding of both the topic and the surrounding text is needed to properly interpret any particular fragment. Thus, while we’ll try to provide a “turn-key” solution that does the ‘Right Thing’ automatically, we expect that applications requiring high semantic content will require document-specific declarations and tuning to achieve the desired result. Towards this end, we provide a variety of means to customize the processing and declare the author’s intent. At the same time, especially for new documents, we encourage a more logical, content-oriented markup style over a purely presentation-oriented style.
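
As a small illustration of the difference between the two markup styles, the following sketch writes the same formula twice in plain LaTeX. The macro names \vect and \innerprod are hypothetical, introduced here only for the example; they are not part of any package described in this manual.

```latex
\documentclass{article}
% Hypothetical macros, defined only for this illustration.
\newcommand{\vect}[1]{\mathbf{#1}}                  % a vector quantity
\newcommand{\innerprod}[2]{\langle #1, #2 \rangle}  % inner product of two vectors
\begin{document}
% Presentation-oriented: records only how the formula should look.
\[ \langle \mathbf{u}, \mathbf{v} \rangle = 0 \]
% Content-oriented: records what the formula means; the printed result is identical.
\[ \innerprod{\vect{u}}{\vect{v}} = 0 \]
\end{document}
```

Both displays typeset identically, but only the second names the operation involved. A logical macro of this kind also gives a single place at which the author's intent can later be declared, using the customization mechanisms covered in later chapters.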

Overview of this Manual

Chapter 2 describes the usage of LaTeXML, along with common use cases and techniques. Chapter 3 describes the system architecture in some detail. Strategies for customization and implementation of new packages are described in Chapter 4. The special considerations for mathematics, including details of representation and how to improve the conversion, are covered in Chapter 5. Several specialized topics are covered in the remaining chapters. An overview of outstanding issues and planned future improvements is given in Chapter 9.

Finally, the Appendices give detailed documentation of the system components: Appendix A describes the command-line programs provided by the system; Appendix B lists the LaTeX style packages for which we’ve provided LaTeXML-specific bindings; Appendix C describes the various Perl modules, in groups, that comprise the system; Appendix D describes the xml schema used by LaTeXML; Appendix E gives an overview of the warning and error messages that LaTeXML may generate; Appendix F describes the strategy and naming conventions used for CSS styling of the resulting html.

Using LaTeXML, and programming for it, can be somewhat confusing, as one is dealing with several languages not normally combined, often within the same file: Perl, TeX and xml (along with xslt, html and css), plus the occasional shell programming. To help visually distinguish different contexts in this manual we will put ‘programming’ oriented material (Perl, TeX) in a typewriter font, like this; xml material will be put in a sans-serif face like this.


If you encounter difficulties, join the mailing list at latexml-project. Bugs and enhancement requests can be reported at GitHub. If all else fails, please consult the source code, or the author.

Danger! When you see this sign, be warned that the material presented is somewhat advanced and may not make much sense until you have dabbled quite a bit in LaTeXML’s internals. Such advanced or ‘dangerous’ material will be presented like this paragraph to make it easier to skip over.