Wednesday, May 18, 2011

xmlgen: a feature-rich and high-performance XML generation library

I’ve released xmlgen to hackage just a few days ago. Xmlgen is a pure Haskell library with a convenient API for generating XML documents. It provides support for all functionality defined by the XML information set and offers good performance and low memory usage. In our company, we developed xmlgen because we wanted the readability of XML literals (as for example provided by the hsp library) without the drawbacks of a custom preprocessor (wrong line numbers in error messages, non-compositionality).

In this blog article, I’ll show you how to use the combinators provided by xmlgen to generate the following XML document:

<?xml version="1.0"?>
<people>
  <person age="32">Stefan</person>
  <person age="4">Judith</person>
</people>

First, we import some modules:

> import Data.Monoid
> import qualified Data.ByteString.Lazy as BSL
> import Text.XML.Generator -- the main xmlgen module

Then we continue by generating the person element.

> genPersonElem :: (String, String) -> Xml Elem
> genPersonElem (name, age) =
>     xelem "person" $ xattr "age" age <#> xtext name

The xelem combinator constructs an XML element from an element name and from the children of the element. Xmlgen provides overloaded variants of xelem to support a uniform syntax for the construction of elements with qualified and unqualified names and with different kinds of children. The <#> combinator separates the element’s attributes from the other children (sub-elements and text nodes). The combinators xattr and xtext construct XML attributes and text nodes, respectively.

The result of an application of xelem has type Xml Elem, whereas xattr has result type Xml Attr. This distinction is important so that attributes and elements can not be confused. The result type of the xtext combinator is Xml Elem; we decided against an extra type for text nodes because for xmlgen’s purpose text nodes and elements are almost interchangeble.

The types Xml Elem and Xml Attr are both instances of the Monoid type class. Constructing a list of elements from a list of persons and their ages is thus quite easy:

> genPersonElems :: [(String, String)] -> Xml Elem
> genPersonElems = foldr mappend mempty . map genPersonElem

The pattern shown above is quite common, so xmlgen allows the following shorthand notation using the xelems combinator.

> genPersonElems' :: [(String, String)] -> Xml Elem
> genPersonElems' = xelems . map genPersonElem

We are now ready to construct the final XML document:

> genXml :: Xml Doc
> genXml = let people = [("Stefan", "32"), ("Judith", "4")]
>          in doc defaultDocInfo $ xelem "people" (genPersonElems people)

For convenience, here is a standalone variant of the genXml function:

> genXml' :: Xml Doc
> genXml' =
>   let people = [("Stefan", "32"), ("Judith", "4")]
>   in doc defaultDocInfo $
>        xelem "people" $
>          xelems $ map (\(name, age) -> xelem "person" (xattr "age" age <#> xtext name)) people

Xmlgen supports various output formats through the overloaded xrender function. Here we render to resulting XML document as a lazy byte string:

> outputXml :: IO ()
> outputXml = BSL.putStrLn (xrender genXml')

Loading the current file into ghci and evaluating outputXml produces the following result:

*Main> outputXml
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<people
><person age="32"
>Stefan</person
><person age="4"
>Judith</person
></people
>

This blog post introduced most but not all features of the xmlgen API. Check out the documentation.

Happy hacking and have fun!

Author: Stefan Wehr