Doc Book

DocBook is a StandardGeneralizedMarkupLanguage (and now ExtensibleMarkupLanguage) DocumentTypeDefinition (DTD) for writing technical books and articles. It is very mature, having been developed for over ten years. It has a mature tool-chain that can be used to translate DocBook documents into other formats for type setting or online presentation, including TeX, PDF, PostScript, HTML, Microsoft HTML Help, and TROFF. O'Reilly use DocBook to author their books and it is the standard documentation format for several computer vendors and for various Linux and OpenSource projects.

If you know XML or HTML, DocBook is easy to learn. Since it was designed to handle large documentation projects, it feels more logical and is more carefully designed than HTML. However, it is much larger, more complicated, and more narrowly focused than HTML: the DTD defines hundreds of markup elements for specifying the structure of books and technical computer documentation. It is designed to mark up text, hyperlinks and embedded media (like HTML) but can also be used to generate table of contents and indexes.

There is a Simplified DocBook DTD that is useful for online documents, and has a reduced number of elements.

DocBook has a long and varied history (it started as SGML DTD and is now an XML DTD) and has been used by many different companies (from O'Reilly to Boeing to the LinuxDocumentationProject to the Pentagon) to create many different types of output (PostScript, PDF, RichTextFormat, plain old Text, HTML, and PalmPilot doc files).


For DocBook see

SourceForge is hosting the DocBook project and there's an updated version of the O'Reilly Book "DocBookTheDefinitiveGuide" on-line at

SourceForge is also hosting a Online DocBook Validator and XML -> HTML/PDF Transformer @

There is also a DocBook Wiki at:

Since DocBook specifies the structural format of a document, you need to use a stylesheet to transform documents to the output you desire (HTML webpages, PDF, and TeX). DSSSL, XSLT, and CSS stylesheets are available on the DocBook Web site (, or you can write your own.

Testimonials & Questions

"I've started using DocBook to write articles and a book, and I've found it very good so far."

DocBook is a document language like LaTex. But in that case why not stick with LaTex? What are the advantages of yet another format?

If you know TeX, then there is probably not much advantage in learning DocBook. The main benefit of DocBook has over TeX, so I've been told by someone who has a lot of experience in both, is that DocBook is easier to translate into other formats because documents are data files rather than executable programs that control a type setter.

The DocBook Processing Path

A DocBook document is a collection of SGML or XML files. These files must conform to an external DTD (i.e.: they must be validated by a parser). The most common parser is Jade; however, xsltproc (an XSLT processor) can also be used to validate XML files.

After the document is parsed and found to be valid, a stylesheet (examples: DSSSL, XSLT, CSS) is applied and the document is transformed to another format (XHTML, TeX, XSL-FO). In some cases (XSL-FO to PDF), further transformation is required.

If you're still confused by the DocBook processing path, the following image may help.

See LaTex, ExtensibleMarkupLanguage, DocBookTheDefinitiveGuide

EditText of this page (last edited February 22, 2005) or FindPage with title or text search