| Safe Haskell | None |
|---|---|
| Language | Haskell2010 |
Text.XML.Light.Extractors
Description
Functions to extract data from parsed XML.
Example
Suppose you have an XML file of books like this:
<?xml version="1.0"?>
<library>
<book id="1" isbn="23234-1">
<author>John Doe</author>
<title>Some book</title>
</book>
<book id="2">
<author>You</author>
<title>The Great Event</title>
</book>
...
</library>
And a data type for a book:
data Book = Book { bookId :: Int
                 , isbn :: Maybe String
                 , author, title :: String
                 }
You can parse the XML file into a generic tree structure using parseXMLDoc from the xml package.
Using this library one can define extractors to extract Books from the generic tree.
book = element "book" $ do
  i <- attribAs "id" integer
  s <- optional (attrib "isbn")
  children $ do
    a <- element "author" $ contents $ text
    t <- element "title" $ contents $ text
    return Book { bookId = i, author = a, title = t, isbn = s }

library = element "library" $ children $ only $ many book

extractLibrary :: Element -> Either ExtractionErr [Book]
extractLibrary = extractDocContents library
Notes
- The only combinator can be used to exhaustively extract contents.
- The Control.Applicative module contains some useful combinators like optional, many and <|>.
- The Text.XML.Light.Extractors.ShowErr module contains some predefined functions to convert error values to strings.
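The notes above can be illustrated with a small sketch. It assumes the xml and xml-extractors packages are available; the Item type, the readInt helper and the `<items>` document shape are made up for this illustration and are not part of the library.

```haskell
import Control.Applicative (many, (<|>))
import Text.Read (readMaybe)
import Text.XML.Light (parseXMLDoc)
import Text.XML.Light.Extractors

-- Hypothetical result type, not part of the library.
data Item = IntItem Int | StrItem String
  deriving (Eq, Show)

-- Hypothetical conversion for textAs: a failure is reported through ErrMsg.
readInt :: String -> Either Err Int
readInt s =
  maybe (Left (ErrMsg ("expected an integer, got " ++ show s))) Right (readMaybe s)

-- Try an <int> element first, then fall back to a <str> element.
item :: ContentsExtractor Item
item = intItem <|> strItem
  where
    intItem = element "int" $ contents $ IntItem <$> textAs readInt
    strItem = element "str" $ contents $ StrItem <$> text

-- Zero or more items, and nothing else, inside an <items> element.
items :: ContentsExtractor [Item]
items = element "items" $ children $ only $ many item

-- Parse a document string and extract it, with errors rendered as strings.
extractItems :: String -> Either String [Item]
extractItems src =
  case parseXMLDoc src of
    Nothing  -> Left "not well-formed XML"
    Just doc -> eitherMessageOrValue (extractDocContents items doc)
```

Under these assumptions, a document such as `<items><int>3</int><str>abc</str></items>` should yield both items, while a child element that is neither `<int>` nor `<str>` should make only fail.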
Synopsis
- type Path = [String]
- data Err
- = ErrExpectContent { }
- | ErrExpectAttrib { }
- | ErrAttribValue { }
- | ErrEnd { }
- | ErrNull { }
- | ErrMsg String
- data ExtractionErr = ExtractionErr {}
- data ElementExtractor a
- extractElement :: ElementExtractor a -> Element -> Either ExtractionErr a
- attrib :: String -> ElementExtractor String
- attribAs :: String -> (String -> Either String a) -> ElementExtractor a
- children :: ContentsExtractor a -> ElementExtractor a
- contents :: ContentsExtractor a -> ElementExtractor a
- data ContentsExtractor a
- extractContents :: ContentsExtractor a -> [Content] -> Either ExtractionErr a
- extractDocContents :: ContentsExtractor a -> Element -> Either ExtractionErr a
- element :: String -> ElementExtractor a -> ContentsExtractor a
- text :: ContentsExtractor String
- textAs :: (String -> Either Err a) -> ContentsExtractor a
- choice :: [ContentsExtractor a] -> ContentsExtractor a
- anyContent :: ContentsExtractor Content
- eoc :: ContentsExtractor ()
- only :: ContentsExtractor a -> ContentsExtractor a
- showExtractionErr :: ExtractionErr -> String
- eitherMessageOrValue :: Either ExtractionErr a -> Either String a
- integer :: (Integral a, Read a) => String -> Either String a
- float :: (Floating a, Read a) => String -> Either String a
Errors
type Path = [String]
Location for some content.
For now it is a reversed list of content indices (starting at 1) and element names. This may change to something less "stringly typed".
data Err
Extraction errors.
Constructors
| Constructor | Description |
|---|---|
| ErrExpectContent | Some expected content is missing |
| ErrExpectAttrib | An expected attribute is missing |
| ErrAttribValue | An attribute value was bad |
| ErrEnd | Expected end of contents |
| ErrNull | Unexpected end of contents |
| ErrMsg String | |
data ExtractionErr
Error with a context.
Constructors
| ExtractionErr | |
Instances
| Show ExtractionErr | Defined in Text.XML.Light.Extractors.Internal |
| Error ExtractionErr | Defined in Text.XML.Light.Extractors.Internal |
Element extraction
data ElementExtractor a
extractElement :: ElementExtractor a -> Element -> Either ExtractionErr a
extractElement p element extracts element with p.
attrib :: String -> ElementExtractor String
attrib name extracts the value of attribute name.
attribAs :: String -> (String -> Either String a) -> ElementExtractor a
attribAs name f extracts the value of attribute name and runs it through a conversion/validation function.
The conversion function takes a string with the value and returns either a description of the expected format of the value or the converted value.
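As an illustration of the conversion function contract, here is a minimal sketch of a custom validator. The name percentage and its range check are assumptions made for this example; it uses only base and mirrors the `String -> Either String a` shape that attribAs expects, with Left carrying a description of the expected format.

```haskell
import Text.Read (readMaybe)

-- Hypothetical conversion function for use with attribAs.
-- Left describes the expected format; it is not a complete error message.
percentage :: String -> Either String Int
percentage s =
  case readMaybe s of
    Just n | n >= 0 && n <= 100 -> Right n
    _                           -> Left "an integer between 0 and 100"
```

It could then be passed to attribAs in the same way as integer, for example `attribAs "progress" percentage`, where `progress` is a made-up attribute name.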
children :: ContentsExtractor a -> ElementExtractor a
children p extracts only the child elements with p.
contents :: ContentsExtractor a -> ElementExtractor a
contents p extracts the contents with p.
Contents extraction
data ContentsExtractor a
extractContents :: ContentsExtractor a -> [Content] -> Either ExtractionErr a
extractContents p contents extracts the contents with p.
extractDocContents :: ContentsExtractor a -> Element -> Either ExtractionErr a
parseXMLDoc produces a single Element. Such an element can be extracted using this function.
element :: String -> ElementExtractor a -> ContentsExtractor a
element name p extracts a name element with p.
text :: ContentsExtractor String
Extracts text.
textAs :: (String -> Either Err a) -> ContentsExtractor a
Extracts text and applies a conversion function to it.
choice :: [ContentsExtractor a] -> ContentsExtractor a
Extracts with the first matching extractor in the list.
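A small sketch of choice, assuming the xml and xml-extractors packages are available. The Shape type, the element names and the attribute names are hypothetical and chosen only for this example.

```haskell
import Text.XML.Light (parseXML)
import Text.XML.Light.Extractors

-- Hypothetical result type, not part of the library.
data Shape = Circle Double | Square Double
  deriving (Eq, Show)

-- Each alternative expects a different element name; choice picks the first that matches.
shape :: ContentsExtractor Shape
shape = choice
  [ element "circle" $ Circle <$> attribAs "r" float
  , element "square" $ Square <$> attribAs "side" float
  ]

-- Parse an XML fragment and extract one shape, with errors rendered as strings.
extractShape :: String -> Either String Shape
extractShape = eitherMessageOrValue . extractContents shape . parseXML
```

With these definitions, a fragment like `<square side="2.5"/>` is expected to fall through the circle alternative and be handled by the square one.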
anyContent :: ContentsExtractor Content
Extracts one Content item.
eoc :: ContentsExtractor ()
Succeeds only when there is no more content.
only :: ContentsExtractor a -> ContentsExtractor a
only p fails if there is more content than what p extracts.
only p = p <* eoc
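The effect of only can be sketched on a tiny fragment. This assumes the xml and xml-extractors packages; the names oneA, exactlyOneA and accepts are helpers invented for this example.

```haskell
import Text.XML.Light (parseXML)
import Text.XML.Light.Extractors

-- Matches a single <a> element and ignores whatever is inside it.
oneA :: ContentsExtractor ()
oneA = element "a" (return ())

-- The same extractor, but it also requires that nothing follows the <a> element.
exactlyOneA :: ContentsExtractor ()
exactlyOneA = only oneA -- equivalent to: oneA <* eoc

-- True when the extractor succeeds on the given XML fragment.
accepts :: ContentsExtractor a -> String -> Bool
accepts p = either (const False) (const True) . extractContents p . parseXML
```

Here `accepts exactlyOneA "<a/>"` should hold, while `accepts exactlyOneA "<a/><b/>"` should not, because `<b/>` is left over when eoc runs.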
Utils
showExtractionErr :: ExtractionErr -> String
Converts an extraction error to a multi-line string message.
Paths are shown according to showPath.
eitherMessageOrValue :: Either ExtractionErr a -> Either String a
Convenience function to convert extraction errors to string messages using showExtractionErr.
eitherMessageOrValue = either (Left . showExtractionErr) Right