xml-tydom-conduit-0.1.0.0: Typed XML encoding for an xml-conduit backend.

Safe Haskell: None
Language: Haskell2010

Text.XML.TyDom.Conduit

Guide

Quick start

xml-tydom is a library for expressing XML representations using Haskell data types. The serialization to and from XML is done automatically using GHC Generics and (optionally) some Template Haskell. A good way to illustrate this is with a quick example.

We start with a Haskell data type that describes the XML structure we want:

{-# LANGUAGE DeriveGeneric #-}
import GHC.Generics (Generic)

data Person = Person
    { id      :: Attr Int      -- an attribute
    , name    :: Child Text    -- a child element containing text
    , comment :: Content Text  -- a child text content node
    } deriving (Show, Generic)

Then we use GHC Generics to write instances of ToElem and FromElem for the Person type (you can probably guess what these do):

instance ToElem Person where
    toElem = genericToElem defaultOptionsElement
instance FromElem Person where
    fromElem = genericFromElem defaultOptionsElement

With these typeclass instances available, we can serialize a value of type Person to Text containing XML, and also read back the generated Text:

>>> person = Person (Attr 42) (Child "Joe") (Content "XML4Joe!")

>>> text = render $ toElem person
>>> text
"<Person id=\"42\"><name>Joe</name>XML4Joe!</Person>"

>>> personResult = (parse text >>= fromElem) :: Result Person
>>> personResult
Success
    (Person
        { id      = Attr    { unAttr    = 42         }
        , name    = Child   { unChild   = "Joe"      }
        , comment = Content { unContent = "XML4Joe!" }
        })

XML text

Textual content in XML documents can appear as either attributes or text content nodes within elements. The conversion of types to and from text is controlled by a pair of typeclasses:

ToXText a
Converts type a to Text.
FromXText a
Converts Text to Either XTextError a.

For most user-defined types, instances of these typeclasses should be written by hand.
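As an illustration of a hand-written pair of instances, here is a minimal sketch for a small enumeration type. To keep the snippet compilable on its own, it declares local stand-ins for ToXText, FromXText and XTextError that mirror the signatures documented on this page. The method name fromXText and the Colour type are assumptions made for this example and are not taken from the library.

```haskell
{-# LANGUAGE OverloadedStrings #-}
import           Data.Text (Text)
import qualified Data.Text as T

-- Local stand-ins that mirror the documented signatures. In real code these
-- would come from the library instead.
newtype XTextError = XTextError Text deriving (Eq, Show)

class ToXText a where
    toXText :: a -> Text

class FromXText a where
    fromXText :: Text -> Either XTextError a  -- method name is an assumption

-- Hypothetical user-defined type.
data Colour = Red | Green | Blue deriving (Eq, Show)

instance ToXText Colour where
    toXText Red   = "red"
    toXText Green = "green"
    toXText Blue  = "blue"

instance FromXText Colour where
    -- Accept surrounding whitespace and any letter case when reading.
    fromXText t = case T.toLower (T.strip t) of
        "red"   -> Right Red
        "green" -> Right Green
        "blue"  -> Right Blue
        _       -> Left (XTextError ("unknown colour: " <> t))
```

Writing the reader by hand makes the accepted spellings explicit and lets the error message name the offending text.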

Available encodings

The following types exist to represent parts of the XML DOM:

{ selectorName = Attr a }
A value of type a will become an attribute of the element, containing the textual representation of a. The name of the attribute is specified by the selectorName, which must be supplied for the field.
{ selectorName = Child a }
A value of type a will become a child element. The name of the child element is specified by the selectorName, which must be supplied for the field. The child element will contain a text node containing the textual representation of a.
{ selectorName = Content a }
A value of type a will become a text node of the element. The selectorName is not used in the encoding to XML, and is optional.
{ selectorName = a }
A value of type a will become a child element. The selectorName is optional and is not used in the encoding to XML. There must be an appropriate instance of ToElem and / or FromElem for the type a.

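These wrappers can also be combined with Maybe and lists to create optional and list DOM parts. The following combinations are supported automatically:

  • Attr (Maybe a): an optional attribute.
  • Child (Maybe a): an optional child element containing text.
  • Child [a]: a list of child elements containing text.
  • Content (Maybe a): an optional text content node.
  • Maybe a: an optional child element.
  • [a]: a list of child elements.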

The case of Attr [a] is not supported because there is no obvious encoding for more than one value of an attribute. Similarly, Content [a] would be problematic because a list of text content nodes could not be separated from each other trivially. To encode lists in attributes or text content, instances of ToXText [a] / FromXText [a] can be supplied for type a that can handle case-specific encoding.
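One possible case-specific encoding is a whitespace-separated list, similar to xsd:list. The sketch below shows such instances for [Int]. It uses local stand-ins for the classes so that it compiles on its own. The method name fromXText is an assumption, and whether instances of this shape coexist with the instances shipped by the library has not been checked here.

```haskell
{-# LANGUAGE FlexibleInstances #-}
{-# LANGUAGE OverloadedStrings #-}
import           Data.Text (Text)
import qualified Data.Text as T
import           Text.Read (readMaybe)

-- Local stand-ins that mirror the documented signatures.
newtype XTextError = XTextError Text deriving (Eq, Show)

class ToXText a where
    toXText :: a -> Text

class FromXText a where
    fromXText :: Text -> Either XTextError a  -- method name is an assumption

-- Encode a list of Int as space-separated decimal numbers.
instance ToXText [Int] where
    toXText = T.unwords . map (T.pack . show)

-- Decode by splitting on whitespace; the first bad item aborts the read.
instance FromXText [Int] where
    fromXText = traverse readItem . T.words
      where
        readItem :: Text -> Either XTextError Int
        readItem w =
            maybe (Left (XTextError ("not an integer: " <> w)))
                  Right
                  (readMaybe (T.unpack w))
```

With instances like these, a field of type Attr [Int] would carry a value such as "80 443" in a single attribute.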

Newtype aliasing

In addition to the basic encoding types (Attr, Child and Content), it is possible to alias an entire element using a newtype. An instance for a newtype created using genericToElem / genericFromElem will use the encoding for the wrapped type with the name of the newtype constructor.

For example:

data    Port   = Port   (Content Int)        deriving (Show, Generic)
newtype InPort = InPort { unInPort :: Port } deriving (Show, Generic)

opt = defaultOptionsElement
instance ToElem Port   where toElem = genericToElem opt
instance ToElem InPort where toElem = genericToElem opt

>>> render $ toElem (Port (Content 443))
"<Port>443</Port>" 
>>> render $ toElem (InPort (Port (Content 443)))
"<InPort>443</InPort>"

Sum types

The name of an element is always taken from the name of the Haskell constructor. Sum types, which have multiple constructors, are also supported: they represent cases where one element is chosen from a selection of elements (i.e. <xsd:choice> in an XML schema).

For example:

data Ref = Id   { id   :: Attr    Int  }
         | Name { name :: Content Text }
         deriving (Show, Generic)

opt = defaultOptionsElement
instance ToElem Ref where toElem = genericToElem opt
instance FromElem Ref where fromElem = genericFromElem opt

>>> text = render $ toElem (Name (Content "Martok"))
>>> text
"<Name>Martok</Name>"

>>> refResult = (parse text >>= fromElem) :: Result Ref
>>> refResult
Success (Name { name = Content { unContent = "Martok" } })

Encoding options

Several options exist for the encoding. These are specified by OptionsElement, which is passed as an argument to genericToElem and genericFromElem. The following can be specified:

  • Bijections from constructor and selector names to element names and attribute names.
  • Whether nodes should be sequential (i.e. <xsd:sequence/>) or can appear in any order when reading. Nodes are always written sequentially.
  • Whether an error is produced when extra attributes or nodes exist in the XML but not in the Haskell datatype.

The naming bijections are particularly useful because often a particular XML schema will require names that are not directly representable as Haskell constructors or selectors. For example, XML names may start with lowercase characters, or they may require hyphens, or namespaces. In these instances, a function can be provided which must perform a bijective (one-to-one) mapping between the textual representation of an element or attribute name and its required XML name. As a simple example, we may want to drop the first two characters of a selector name:

import Data.Text (Text)
import qualified Data.Text as T
import qualified Text.XML as XML  -- xml-conduit

opt = defaultOptionsElement { optAttrName = attrName }

attrName :: Text -> AttrName
attrName selectorName
    = AttrName $ XML.Name (T.drop 2 selectorName) Nothing Nothing

data Address = Address { adName :: Attr Text } deriving (Show, Generic)
                  --->   ^^ - drop these two letters from the attribute name
instance ToElem Address where toElem = genericToElem opt

>>> render $ toElem (Address (Attr "Josephine Citizen"))
"<Address Name=\"Josephine Citizen\"/>"

Both AttrName and ElemName are newtype wrappers around XML.Name.
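A common requirement is hyphenated XML names. The sketch below is a text-level mapping from camelCase selector names to hyphenated names. The function name kebab is hypothetical. Its result could be wrapped in the same way as in the example above, for instance with AttrName $ XML.Name (kebab selectorName) Nothing Nothing. The mapping is one-to-one for selector names because a Haskell selector starts with a lowercase letter and cannot contain a hyphen.

```haskell
import           Data.Char (isUpper, toLower)
import           Data.Text (Text)
import qualified Data.Text as T

-- Hypothetical helper: replace every uppercase letter by a hyphen followed
-- by its lowercase form, so that "streetName" becomes "street-name".
-- Intended for selector names only; a constructor name such as "Address"
-- would gain a leading hyphen.
kebab :: Text -> Text
kebab = T.concatMap step
  where
    step c
        | isUpper c = T.pack ['-', toLower c]
        | otherwise = T.singleton c
```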

Separating encodings

Specifying the encoding in Haskell is clumsy, because types are littered with mentions of Attr, Child and Content. From a practical perspective, these are quite ugly if they appear in the application's data model. They conflate the concerns of data representation and serialization, which should be separate.

We can improve this situation by using one type for the application's own data model and a separate type for the encoding. xml-tydom provides some Template Haskell support to ease this process. For example:

{-# LANGUAGE TemplateHaskell #-}
import Text.XML.TyDom.Conduit.TH (makeEncoding)

-- Data type for the application (plain; no Attr, Child or Content)
data Address = Address
    { name   :: Text
    , street :: Text
    , city   :: Text
    , zip    :: Int
    } deriving (Show, Generic)

-- Data type specifying the encoding. This must have the same form as the
-- application data type, except for mentions of Attr, Child and
-- Content.
data EncAddress = EncAddress
    { encName   :: Child Text
    , encStreet :: Child Text
    , encCity   :: Child Text
    , encZip    :: Child Int
    } deriving (Show, Generic)

-- We need to specify both ToElem and FromElem instances for the
-- encoding type (the Template Haskell operation requires both):
instance ToElem EncAddress where
    toElem = genericToElem defaultOptionsElement
instance FromElem EncAddress where
    fromElem = genericFromElem defaultOptionsElement

-- But having done this, we can get Template Haskell to write instances for
-- the application type (Address). Instances are supplied for:
--   - ToElem Address
--   - FromElem Address
--   - Conv Address EncAddress
--   - Conv EncAddress Address
$(makeEncoding ''Address ''EncAddress)

If you use this approach, the names of attributes and elements are specified using the encoding type (EncAddress in the above example), and not the application data type. Under the hood, to produce XML, the application data type is first converted to the encoding type (using a Generic converter), and then the encoding type is converted to XML. The reverse process is followed to read from XML. Because the encoding (and thus the OptionsElement) is specified completely by the encoding type, the required ToElem and FromElem instances for the application type are completely unambiguous.
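To make the conversion step concrete, the sketch below writes out by hand the two Conv instances that relate an application type to its encoding type. It declares local stand-ins for Child and Conv so that it compiles on its own, and it uses a shortened two-field Address. In the library these instances are supplied for you, so this only shows the shape of the conversion and is not the generated code.

```haskell
{-# LANGUAGE MultiParamTypeClasses #-}

-- Local stand-ins for the wrapper and the conversion class.
newtype Child z = Child { unChild :: z } deriving (Eq, Show)

class Conv p q where
    conv :: p -> q

-- Application type: plain fields only (shortened for the example).
data Address = Address
    { name    :: String
    , zipCode :: Int
    } deriving (Eq, Show)

-- Encoding type: the same shape, with every field wrapped.
data EncAddress = EncAddress
    { encName :: Child String
    , encZip  :: Child Int
    } deriving (Eq, Show)

-- Wrap each field on the way to the encoding type.
instance Conv Address EncAddress where
    conv (Address n z) = EncAddress (Child n) (Child z)

-- Unwrap each field on the way back.
instance Conv EncAddress Address where
    conv (EncAddress (Child n) (Child z)) = Address n z
```

Because the two conversions are inverses, converting to the encoding type and back returns the original value.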

Error handling

Reading from XML to a type can fail. The result of reading from XML is the Result type, which is a disjunction specifying either Success or Failure. In the event of a Failure, the Path to the failed element from the document root is recorded, as is a detailed Cause of the failure. If you want a convenient textual representation of the failure, this can be achieved with the renderFailure function. For example:

import qualified Data.Text.IO as T (putStr)

path  = PathItem (ElemName (XML.Name "Root" Nothing Nothing)) PathRoot
cause = MissingAttribute (AttrName (XML.Name "myAttr" Nothing Nothing))
>>> T.putStr $ renderFailure (Failure path cause)
Path: Root
Missing attribute [myAttr]
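Code that consumes a Result usually pattern-matches on its two constructors. The sketch below converts a Result into an Either carrying a message. It uses a simplified local model in which Path is a list of element names and Cause is a plain string. Both are assumptions made so that the snippet compiles on its own; the library's Path and Cause types are richer, and renderFailure is the function to use for real output.

```haskell
import Data.List (intercalate)

-- Simplified local model of the documented Result type.
type Path  = [String]  -- assumption: element names from the root downwards
type Cause = String    -- assumption: a plain description of the failure

data Result a
    = Success a
    | Failure Path Cause
    deriving (Eq, Show)

-- Hypothetical helper: collapse a Result into an Either with a message.
resultToEither :: Result a -> Either String a
resultToEither (Success a)          = Right a
resultToEither (Failure path cause) =
    Left ("at /" ++ intercalate "/" path ++ ": " ++ cause)
```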

Reading non-sequenced XML

Often, we are faced with reading child elements whose order is not guaranteed. xml-tydom supports this to the greatest extent that is feasible. To enable non-sequential reading, optReadChildOrdering must be set to All in the OptionsElement that is used to generate the FromElem instance. The handling of different cases can be addressed separately:

content
The first text content is accepted.
optional content
If no content is present then this becomes Nothing.
child element
The first child element which succeeds in fromElem is accepted.
optional child element
If no child element succeeds in fromElem then this becomes Nothing.
list of child elements
Every child element which succeeds in fromElem becomes part of the list.
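As a configuration sketch, enabling non-sequential reading for one type might look like the fragment below. It relies only on names that appear on this page (defaultOptionsElement, optReadChildOrdering, All, genericFromElem) and assumes that they are all exported from Text.XML.TyDom.Conduit. The Contact type is hypothetical, and the fragment has not been compiled against the library.

```haskell
{-# LANGUAGE DeriveGeneric #-}
import Data.Text (Text)
import GHC.Generics (Generic)
import Text.XML.TyDom.Conduit

-- Hypothetical type whose child elements may arrive in either order.
data Contact = Contact
    { email :: Child Text
    , phone :: Child Text
    } deriving (Show, Generic)

-- Default options, except that children may appear in any order.
optAll :: OptionsElement
optAll = defaultOptionsElement { optReadChildOrdering = All }

instance FromElem Contact where
    fromElem = genericFromElem optAll
```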

Given these rules, it should become apparent that certain combinations are not valid for elements that are read as All. For example, while a data type like the following is OK for Sequence elements, it will fail for All elements:

-- This will work for a Sequence read, but not an All read
data OnlyOkForSequenced = OnlyOkForSequenced
    { aWidgets :: [Widget]
    , grommit  :: Grommit
    , bWidgets :: [Widget]
    } deriving (Show, Generic)
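The reason can be seen in a small model of the All rules. The sketch below is not the library's implementation: it represents child nodes by a hypothetical Node type and applies the "every child that succeeds" and "first child that succeeds" rules listed above, field by field. The first widget list takes every widget, so nothing remains for the second list.

```haskell
-- Hypothetical child nodes: numbered widgets and a grommit.
data Node = W Int | G deriving (Eq, Show)

asWidget :: Node -> Maybe Int
asWidget (W n) = Just n
asWidget _     = Nothing

asGrommit :: Node -> Maybe ()
asGrommit G = Just ()
asGrommit _ = Nothing

-- List rule: keep every child that parses and return the leftovers.
takeAll :: (c -> Maybe a) -> [c] -> ([a], [c])
takeAll p = foldr step ([], [])
  where
    step c (xs, rest) = case p c of
        Just a  -> (a : xs, rest)
        Nothing -> (xs, c : rest)

-- Single-element rule: accept the first child that parses.
takeFirst :: (c -> Maybe a) -> [c] -> Maybe (a, [c])
takeFirst _ []       = Nothing
takeFirst p (c : cs) = case p c of
    Just a  -> Just (a, cs)
    Nothing -> fmap (\(a, rest) -> (a, c : rest)) (takeFirst p cs)

-- Read the three fields in declaration order under the All rules.
readAll :: [Node] -> Maybe ([Int], (), [Int])
readAll cs0 = do
    let (aWs, cs1) = takeAll asWidget cs0
    (g, cs2) <- takeFirst asGrommit cs1
    let (bWs, _) = takeAll asWidget cs2
    pure (aWs, g, bWs)
```

In this model, readAll [W 1, G, W 2] puts both widgets into the first list and leaves the second list empty, so the original grouping cannot be recovered.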

However, similar rules also exist for Sequence reads, although they are somewhat more obvious:

-- This will FAIL to read! DO NOT ACTUALLY USE IT
-- We can't possibly tell where the list of widget child elements ends, so
-- all of them will be consumed, leaving no remaining widget for the final
-- member of the datatype.
data Bad = Bad
    { aWidgets :: [Widget]
    , widget   :: Widget
    } deriving (Show, Generic)
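The same effect can be modelled for Sequence reads. The sketch below is a simplified model and not the library's implementation: children are represented by their element names, a list field greedily consumes the longest run of matching children, and a single field consumes exactly one. Reading the list first always starves the trailing single field. In this model, swapping the field order avoids the problem.

```haskell
-- A child "parses" as a widget when its element name is "widget".
asWidget :: String -> Maybe String
asWidget s
    | s == "widget" = Just s
    | otherwise     = Nothing

-- List field: consume the longest prefix of children that parse.
takeMany :: (c -> Maybe a) -> [c] -> ([a], [c])
takeMany p (c : cs)
    | Just a <- p c =
        let (xs, rest) = takeMany p cs
        in  (a : xs, rest)
takeMany _ cs = ([], cs)

-- Single field: the next child must parse.
takeOne :: (c -> Maybe a) -> [c] -> Maybe (a, [c])
takeOne _ []       = Nothing
takeOne p (c : cs) = fmap (\a -> (a, cs)) (p c)

-- Field order of Bad: list first, then a single widget.
readBad :: [String] -> Maybe ([String], String)
readBad cs =
    let (ws, rest) = takeMany asWidget cs
    in  fmap (\(w, _) -> (ws, w)) (takeOne asWidget rest)

-- Swapped order: single widget first, then the list.
readSwapped :: [String] -> Maybe (String, [String])
readSwapped cs = do
    (w, rest) <- takeOne asWidget cs
    pure (w, fst (takeMany asWidget rest))
```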

Classes

class ToElem a where Source #

Minimal complete definition

toElem

Methods

toElem :: a -> Element Source #

class FromElem a where Source #

Minimal complete definition

fromElem

Methods

fromElem :: Element -> Result a Source #

class ToXText a where Source #

Minimal complete definition

toXText

Methods

toXText :: a -> Text Source #

Instances

class Conv p q where #

Typeclass for conversion between types p and q.

This typeclass is particularly used for conversion between raw types and their XML-decorated versions, corresponding to Attr, Child, Content and CData.

Minimal complete definition

conv

Methods

conv :: p -> q #

Converts a value of type p to a value of type q.

Instances

Conv z z 

Methods

conv :: z -> z #

Conv z (Content z) 

Methods

conv :: z -> Content z #

Conv z (Child z) 

Methods

conv :: z -> Child z #

Conv z (CData z) 

Methods

conv :: z -> CData z #

Conv z (Attr z) 

Methods

conv :: z -> Attr z #

Conv (Attr z) z 

Methods

conv :: Attr z -> z #

Conv (Child z) z 

Methods

conv :: Child z -> z #

Conv (Content z) z 

Methods

conv :: Content z -> z #

Conv (CData z) z 

Methods

conv :: CData z -> z #

Types

newtype Attr z :: * -> * #

Attribute.

Specifies that a record field of type Attr z should become an XML attribute. The name of the attribute is specified by the name of the record selector, while the value is the textual representation of the value of type z.

Constructors

Attr 

Fields

Instances

(KnownSymbol name, FromXText t z) => GFromElem e n a t (S1 (MetaSel (Just Symbol name) g h i) (Rec0 (Attr z)))

S1 (named) + Attr - record selector for an XML attribute.

Methods

gFromElem :: Eq n => OptionsElement n a -> Decompose e n a t d -> d -> Result e n a t (S1 (MetaSel (Just Symbol name) g h i) (Rec0 (Attr z)) r, d)

(KnownSymbol name, FromXText t z) => GFromElem e n a t (S1 (MetaSel (Just Symbol name) g h i) (Rec0 (Attr (Maybe z))))

S1 (named) + Attr Maybe - record selector for an optional XML attribute.

Methods

gFromElem :: Eq n => OptionsElement n a -> Decompose e n a t d -> d -> Result e n a t (S1 (MetaSel (Just Symbol name) g h i) (Rec0 (Attr (Maybe z))) r, d)

(KnownSymbol name, ToXText t z) => GToElem e n a t (S1 (MetaSel (Just Symbol name) g h i) (Rec0 (Attr z)))

S1 (named) + Attr - record selector for an XML attribute.

Methods

gToElem :: OptionsElement n a -> Compose e n a t c -> S1 (MetaSel (Just Symbol name) g h i) (Rec0 (Attr z)) r -> c -> c

(KnownSymbol name, ToXText t z) => GToElem e n a t (S1 (MetaSel (Just Symbol name) g h i) (Rec0 (Attr (Maybe z))))

S1 (named) + Attr Maybe - record selector for optional XML attribute.

Methods

gToElem :: OptionsElement n a -> Compose e n a t c -> S1 (MetaSel (Just Symbol name) g h i) (Rec0 (Attr (Maybe z))) r -> c -> c

Conv z (Attr z) 

Methods

conv :: z -> Attr z #

Eq z => Eq (Attr z) 

Methods

(==) :: Attr z -> Attr z -> Bool #

(/=) :: Attr z -> Attr z -> Bool #

Show z => Show (Attr z) 

Methods

showsPrec :: Int -> Attr z -> ShowS #

show :: Attr z -> String #

showList :: [Attr z] -> ShowS #

Arbitrary z => Arbitrary (Attr z) 

Methods

arbitrary :: Gen (Attr z) #

shrink :: Attr z -> [Attr z] #

Conv (Attr z) z 

Methods

conv :: Attr z -> z #

newtype Child z :: * -> * #

Child (containing only text).

Specifies that a record field of type Child z should become a child element of the current element, containing the textual representation of the value of type z.

Constructors

Child 

Fields

Instances

(KnownSymbol name, FromXText t z) => GFromElem e n a t (S1 (MetaSel (Just Symbol name) g h i) (Rec0 (Child z)))

S1 (named) + Child - record selector for a simple child element with text content.

Methods

gFromElem :: Eq n => OptionsElement n a -> Decompose e n a t d -> d -> Result e n a t (S1 (MetaSel (Just Symbol name) g h i) (Rec0 (Child z)) r, d)

(KnownSymbol name, FromXText t z) => GFromElem e n a t (S1 (MetaSel (Just Symbol name) g h i) (Rec0 (Child (Maybe z))))

S1 (named) + Child Maybe - record selector for a simple optional child element with text content.

Methods

gFromElem :: Eq n => OptionsElement n a -> Decompose e n a t d -> d -> Result e n a t (S1 (MetaSel (Just Symbol name) g h i) (Rec0 (Child (Maybe z))) r, d)

(KnownSymbol name, FromXText t z) => GFromElem e n a t (S1 (MetaSel (Just Symbol name) g h i) (Rec0 (Child [z])))

S1 (named) + [Child] - record selector for a list of child elements with text content.

Methods

gFromElem :: Eq n => OptionsElement n a -> Decompose e n a t d -> d -> Result e n a t (S1 (MetaSel (Just Symbol name) g h i) (Rec0 (Child [z])) r, d)

(KnownSymbol name, ToXText t z) => GToElem e n a t (S1 (MetaSel (Just Symbol name) g h i) (Rec0 (Child z)))

S1 (named) + Child - record selector for a simple child element with text.

Methods

gToElem :: OptionsElement n a -> Compose e n a t c -> S1 (MetaSel (Just Symbol name) g h i) (Rec0 (Child z)) r -> c -> c

(KnownSymbol name, ToXText t z) => GToElem e n a t (S1 (MetaSel (Just Symbol name) g h i) (Rec0 (Child (Maybe z))))

S1 (named) + Child Maybe - record selector for an optional simple child element with text.

Methods

gToElem :: OptionsElement n a -> Compose e n a t c -> S1 (MetaSel (Just Symbol name) g h i) (Rec0 (Child (Maybe z))) r -> c -> c

(KnownSymbol name, ToXText t z) => GToElem e n a t (S1 (MetaSel (Just Symbol name) g h i) (Rec0 (Child [z])))

S1 (named) + [Child] - record selector for a list of simple child elements with text.

Methods

gToElem :: OptionsElement n a -> Compose e n a t c -> S1 (MetaSel (Just Symbol name) g h i) (Rec0 (Child [z])) r -> c -> c

Conv z (Child z) 

Methods

conv :: z -> Child z #

Eq z => Eq (Child z) 

Methods

(==) :: Child z -> Child z -> Bool #

(/=) :: Child z -> Child z -> Bool #

Show z => Show (Child z) 

Methods

showsPrec :: Int -> Child z -> ShowS #

show :: Child z -> String #

showList :: [Child z] -> ShowS #

Arbitrary z => Arbitrary (Child z) 

Methods

arbitrary :: Gen (Child z) #

shrink :: Child z -> [Child z] #

Conv (Child z) z 

Methods

conv :: Child z -> z #

newtype Content z :: * -> * #

Content node.

Specifies that a record field of type Content z should become a content node of the current element, containing the textual representation of the value of type z.

Constructors

Content 

Fields

Instances

FromXText t z => GFromElem e n a t (S1 q (Rec0 (Content z)))

S1 (named or unnamed) + Content - record selector for a content child.

Methods

gFromElem :: Eq n => OptionsElement n a -> Decompose e n a t d -> d -> Result e n a t (S1 q (Rec0 (Content z)) r, d)

FromXText t z => GFromElem e n a t (S1 q (Rec0 (Content (Maybe z))))

S1 (named or unnamed) + Content Maybe - record selector for an optional content child.

Methods

gFromElem :: Eq n => OptionsElement n a -> Decompose e n a t d -> d -> Result e n a t (S1 q (Rec0 (Content (Maybe z))) r, d)

ToXText t z => GToElem e n a t (S1 q (Rec0 (Content z)))

S1 (named or unnamed) + Content - record selector for a content node.

Methods

gToElem :: OptionsElement n a -> Compose e n a t c -> S1 q (Rec0 (Content z)) r -> c -> c

ToXText t z => GToElem e n a t (S1 q (Rec0 (Content (Maybe z))))

S1 (named or unnamed) + Content Maybe - record selector for an optional content node.

Methods

gToElem :: OptionsElement n a -> Compose e n a t c -> S1 q (Rec0 (Content (Maybe z))) r -> c -> c

Conv z (Content z) 

Methods

conv :: z -> Content z #

Eq z => Eq (Content z) 

Methods

(==) :: Content z -> Content z -> Bool #

(/=) :: Content z -> Content z -> Bool #

Show z => Show (Content z) 

Methods

showsPrec :: Int -> Content z -> ShowS #

show :: Content z -> String #

showList :: [Content z] -> ShowS #

Arbitrary z => Arbitrary (Content z) 

Methods

arbitrary :: Gen (Content z) #

shrink :: Content z -> [Content z] #

Conv (Content z) z 

Methods

conv :: Content z -> z #

newtype XTextError :: * #

Error which may occur when parsing XML text.

Constructors

XTextError Text 

data Result a Source #

Constructors

Success a 
Failure Path Cause 

Instances

Monad Result Source # 

Methods

(>>=) :: Result a -> (a -> Result b) -> Result b #

(>>) :: Result a -> Result b -> Result b #

return :: a -> Result a #

fail :: String -> Result a #

Functor Result Source # 

Methods

fmap :: (a -> b) -> Result a -> Result b #

(<$) :: a -> Result b -> Result a #

Applicative Result Source # 

Methods

pure :: a -> Result a #

(<*>) :: Result (a -> b) -> Result a -> Result b #

(*>) :: Result a -> Result b -> Result b #

(<*) :: Result a -> Result b -> Result a #

Eq a => Eq (Result a) Source # 

Methods

(==) :: Result a -> Result a -> Bool #

(/=) :: Result a -> Result a -> Bool #

Show a => Show (Result a) Source # 

Methods

showsPrec :: Int -> Result a -> ShowS #

show :: Result a -> String #

showList :: [Result a] -> ShowS #

data Path Source #

Instances

Eq Path Source # 

Methods

(==) :: Path -> Path -> Bool #

(/=) :: Path -> Path -> Bool #

Show Path Source # 

Methods

showsPrec :: Int -> Path -> ShowS #

show :: Path -> String #

showList :: [Path] -> ShowS #

data ReadNodeOrdering :: * #

Specifies how child nodes should be treated when reading a type from an element.

Constructors

Sequence

Child nodes should be read in strict sequence (i.e. <xsd:sequence>).

All

Child nodes can appear in any order (i.e. <xsd:all>).

data ReadLeftovers :: * #

Specifies how any left-over parts of an element should be treated when reading a type from an element.

Constructors

LeftoversOK

Left-over parts of an element are OK, and not an error.

LeftoversError

Left-over parts of an element should produce an error.

Generics

genericToElem :: (Generic z, GToElem Element ElemName AttrName Text (Rep z)) => OptionsElement -> z -> Element Source #

Generic producer for ToElem instances.

genericConv :: (Generic a, Generic b, GConv (Rep a) (Rep b)) => a -> b #

Generic producer for a Conv instance.

Functions

unAttr :: Attr z -> z #

unChild :: Child z -> z #

unContent :: Content z -> z #

xTextErrType :: Text -> Text -> Either XTextError a #

Formats an XTextError message and returns it as a Left value.

Orphan instances

ToElem a => ToElem Element a Source # 

Methods

toElem :: a -> Element #

ToXText a => ToXText Text a Source # 

Methods

toXText :: a -> Text #

FromXText a => FromXText Text a Source #