This module defines the notion of filters and filter combinators for processing XML documents.
These XML transformation combinators are described in the paper "Haskell and XML: Generic Combinators or Type-Based Translation?" by Malcolm Wallace and Colin Runciman, Proceedings of ICFP'99.
- type CFilter = Content -> [Content]
- keep :: a -> [a]
- none :: a -> [a]
- children :: CFilter
- position :: Int -> CFilter -> CFilter
- elm :: CFilter
- txt :: CFilter
- tag :: String -> CFilter
- attr :: Name -> CFilter
- attrval :: Attribute -> CFilter
- tagWith :: (String -> Bool) -> CFilter
- find :: String -> (String -> CFilter) -> CFilter
- iffind :: String -> (String -> CFilter) -> CFilter -> CFilter
- ifTxt :: (String -> CFilter) -> CFilter -> CFilter
- o :: CFilter -> CFilter -> CFilter
- union :: (a -> [b]) -> (a -> [b]) -> a -> [b]
- cat :: [a -> [b]] -> a -> [b]
- andThen :: (a -> c) -> (c -> a -> b) -> a -> b
- (|>|) :: (a -> [b]) -> (a -> [b]) -> a -> [b]
- with :: CFilter -> CFilter -> CFilter
- without :: CFilter -> CFilter -> CFilter
- (/>) :: CFilter -> CFilter -> CFilter
- (</) :: CFilter -> CFilter -> CFilter
- et :: (String -> CFilter) -> CFilter -> CFilter
- path :: [CFilter] -> CFilter
- deep :: CFilter -> CFilter
- deepest :: CFilter -> CFilter
- multi :: CFilter -> CFilter
- when :: CFilter -> CFilter -> CFilter
- guards :: CFilter -> CFilter -> CFilter
- chip :: CFilter -> CFilter
- foldXml :: CFilter -> CFilter
- mkElem :: String -> [CFilter] -> CFilter
- mkElemAttr :: String -> [(String, CFilter)] -> [CFilter] -> CFilter
- literal :: String -> CFilter
- cdata :: String -> CFilter
- replaceTag :: String -> CFilter
- replaceAttrs :: [(String, String)] -> CFilter
- data ThenElse a = a :> a
- (?>) :: (a -> [b]) -> ThenElse (a -> [b]) -> a -> [b]
- type LabelFilter a = Content -> [(a, Content)]
- oo :: (a -> CFilter) -> LabelFilter a -> CFilter
- x :: (CFilter -> LabelFilter a) -> (CFilter -> LabelFilter b) -> CFilter -> LabelFilter (a, b)
- numbered :: CFilter -> LabelFilter String
- interspersed :: String -> CFilter -> String -> LabelFilter String
- tagged :: CFilter -> LabelFilter String
- attributed :: String -> CFilter -> LabelFilter String
- textlabelled :: CFilter -> LabelFilter (Maybe String)
- extracted :: (Content -> a) -> CFilter -> LabelFilter a
The content filter type.
Simple filters.
Selection filters.
In the algebra of combinators, none is the zero and keep is the identity. (They have a more general type than just CFilter.)
Predicate filters.
These filters either keep or throw away some content based on a simple test. For instance, elm keeps only a tagged element, txt keeps only non-element text, tag keeps only an element with the named tag, attr keeps only an element with the named attribute, attrval keeps only an element with the given attribute value, and tagWith keeps only an element whose tag name satisfies the given predicate.
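The selection and predicate filters above can be sketched over a toy document model. The Content type below, with its Elem and Text constructors, is a simplified stand-in assumed for illustration and is not the library's own data type; the filter bodies are likewise a minimal sketch of the described behaviour.

```haskell
-- Simplified stand-in for the library's Content type (an assumption).
data Content
  = Elem String [(String, String)] [Content] -- tag name, attributes, children
  | Text String                              -- character data
  deriving (Eq, Show)

type CFilter = Content -> [Content]

-- Selection filters: the identity and the zero.
keep, none :: a -> [a]
keep x = [x]
none _ = []

-- The immediate children of an element; text has none.
children :: CFilter
children (Elem _ _ cs) = cs
children _             = []

-- Predicate filters return the content as a singleton, or nothing at all.
elm, txt :: CFilter
elm c@(Elem _ _ _) = [c]
elm _              = []
txt c@(Text _) = [c]
txt _          = []

tag :: String -> CFilter
tag t c@(Elem n _ _) | n == t = [c]
tag _ _ = []

attr :: String -> CFilter
attr a c@(Elem _ attrs _) | a `elem` map fst attrs = [c]
attr _ _ = []

tagWith :: (String -> Bool) -> CFilter
tagWith p c@(Elem n _ _) | p n = [c]
tagWith _ _ = []

-- A small hypothetical document to try the filters on.
sample :: Content
sample = Elem "book" [("lang", "en")] [Elem "title" [] [Text "HaXml"], Text "notes"]
```

Loaded into GHCi, `tag "book" sample` yields the whole element in a singleton list, while `tag "title" sample` yields the empty list because the predicate inspects only the top-level content.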
Search filters.
find :: String -> (String -> CFilter) -> CFilter
For a mandatory attribute field, find key cont looks up the value of the attribute name key, and applies the continuation cont to the value.
iffind :: String -> (String -> CFilter) -> CFilter -> CFilter
When an attribute field may be absent, use iffind key yes no to look up its value. If the attribute is absent, it acts as the no filter, otherwise it applies the yes filter to the value.
ifTxt :: (String -> CFilter) -> CFilter -> CFilter
ifTxt yes no processes any textual content with the yes filter, but otherwise is the same as the no filter.
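A minimal sketch of the three search filters, again over an assumed toy Content type rather than the library's own. Two choices here are assumptions of the sketch and may differ from the library: find raises an error when the mandatory attribute is missing, and it returns no results for non-element content.

```haskell
-- Simplified stand-in for the library's Content type (an assumption).
data Content
  = Elem String [(String, String)] [Content] -- tag name, attributes, children
  | Text String                              -- character data
  deriving (Eq, Show)

type CFilter = Content -> [Content]

none :: a -> [a]
none _ = []

-- Produce a fixed piece of text, ignoring the input.
literal :: String -> CFilter
literal s _ = [Text s]

-- Optional attribute: choose a branch depending on whether key is present.
iffind :: String -> (String -> CFilter) -> CFilter -> CFilter
iffind key yes no c@(Elem _ attrs _) =
  case lookup key attrs of
    Just v  -> yes v c
    Nothing -> no c
iffind _ _ no c = no c

-- Mandatory attribute: a missing key is treated as an error in this sketch.
find :: String -> (String -> CFilter) -> CFilter
find key cont c@(Elem _ attrs _) =
  case lookup key attrs of
    Just v  -> cont v c
    Nothing -> error ("missing attribute: " ++ key)
find _ _ _ = []

-- Text content goes to the yes branch, everything else to the no branch.
ifTxt :: (String -> CFilter) -> CFilter -> CFilter
ifTxt yes _  c@(Text s) = yes s c
ifTxt _   no c          = no c

-- Hypothetical example: extract a link target as text, if there is one.
hrefOf :: CFilter
hrefOf = iffind "href" literal none

link :: Content
link = Elem "a" [("href", "index.html")] [Text "home"]
```

Here `hrefOf link` produces the attribute value as a text node, and `hrefOf` applied to an element without an href attribute produces nothing.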
Filter combinators
Basic combinators.
union :: (a -> [b]) -> (a -> [b]) -> a -> [b]
Binary parallel composition. Each filter uses a copy of the input, rather than one filter using the result of the other. (Has a more general type than just CFilter.)
cat :: [a -> [b]] -> a -> [b]
Glue a list of filters together. (A list version of union; also has a more general type than just CFilter.)
andThen :: (a -> c) -> (c -> a -> b) -> a -> b
A special form of filter composition where the second filter works over the same data as the first, but also uses the first's result.
(|>|) :: (a -> [b]) -> (a -> [b]) -> a -> [b]
Directional choice: f |>| g gives g-productions only if there are no f-productions.
with :: CFilter -> CFilter -> CFilter
Pruning: in f `with` g, keep only those f-productions which have at least one g-production.
without :: CFilter -> CFilter -> CFilter
Pruning: in f `without` g, keep only those f-productions which have no g-productions.
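The basic combinators can be illustrated with a short sketch. The Content type is a simplified stand-in assumed for the example, and the combinator bodies are plausible definitions matching the descriptions above, not necessarily the library's source. The operators use Haskell's default fixity here, which is also an assumption.

```haskell
-- Simplified stand-in for the library's Content type (an assumption).
data Content
  = Elem String [(String, String)] [Content] -- tag name, attributes, children
  | Text String                              -- character data
  deriving (Eq, Show)

type CFilter = Content -> [Content]

children, elm, txt :: CFilter
children (Elem _ _ cs) = cs
children _             = []
elm c@(Elem _ _ _) = [c]
elm _              = []
txt c@(Text _) = [c]
txt _          = []

-- Sequential composition: run g, then f on each of g's results.
o :: CFilter -> CFilter -> CFilter
f `o` g = concatMap f . g

-- Parallel composition: both filters see the same input.
union :: (a -> [b]) -> (a -> [b]) -> a -> [b]
union f g x = f x ++ g x

-- List version of union.
cat :: [a -> [b]] -> a -> [b]
cat fs x = concatMap ($ x) fs

-- Directional choice: fall back to g only when f produces nothing.
(|>|) :: (a -> [b]) -> (a -> [b]) -> a -> [b]
(f |>| g) x = case f x of
  [] -> g x
  ys -> ys

-- Pruning: keep f-productions for which g does (or does not) succeed.
with, without :: CFilter -> CFilter -> CFilter
f `with` g    = filter (not . null . g) . f
f `without` g = filter (null . g) . f

-- Hypothetical sample data.
item1, item2, doc :: Content
item1 = Elem "item" [("id", "1")] [Text "a"]
item2 = Elem "item" [] []
doc   = Elem "list" [] [item1, item2, Text "tail"]
```

For instance, `(children `with` children) doc` keeps only item1, the one child that itself has children, while `(children `without` children) doc` keeps item2 and the trailing text.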
et :: (String -> CFilter) -> CFilter -> CFilter
Join an element-matching filter with a text-only filter
path :: [CFilter] -> CFilter
Express a list of filters like an XPath query, e.g. path [children, tag "name1", attr "attr1", children, tag "name2"] is like the XPath query /name1[@attr1]/name2.
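To make the XPath analogy concrete, here is a small sketch of path as a left-to-right chain of filters. The Content type is an assumed simplification, and defining path as a fold over flipped composition is one plausible reading of the description, not a quotation of the library source.

```haskell
-- Simplified stand-in for the library's Content type (an assumption).
data Content
  = Elem String [(String, String)] [Content] -- tag name, attributes, children
  | Text String                              -- character data
  deriving (Eq, Show)

type CFilter = Content -> [Content]

keep :: a -> [a]
keep x = [x]

children :: CFilter
children (Elem _ _ cs) = cs
children _             = []

tag, attr :: String -> CFilter
tag t c@(Elem n _ _) | n == t = [c]
tag _ _ = []
attr a c@(Elem _ attrs _) | a `elem` map fst attrs = [c]
attr _ _ = []

-- Sequential composition: run g, then f on each of g's results.
o :: CFilter -> CFilter -> CFilter
f `o` g = concatMap f . g

-- Apply the filters in list order, feeding each result set to the next filter.
path :: [CFilter] -> CFilter
path = foldr (flip o) keep

-- Hypothetical document: only the first name1 element carries attr1.
hit, doc :: Content
hit = Elem "name2" [] [Text "hit"]
doc = Elem "root" []
  [ Elem "name1" [("attr1", "v")] [hit]
  , Elem "name1" [] [Elem "name2" [] [Text "miss"]]
  ]

query :: CFilter
query = path [children, tag "name1", attr "attr1", children, tag "name2"]
```

Evaluating `query doc` selects only the name2 element nested under the name1 element that has an attr1 attribute.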
Recursive search.
Recursive search has three variants: deep does a breadth-first search of the tree, deepest does a depth-first search, multi returns content at all tree-levels, even those strictly contained within results that have already been returned.
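The difference between the three variants is easiest to see on nested matches. The sketch below uses an assumed toy Content type and the definitions given in the Wallace and Runciman paper: under them, deep stops at the outermost match, deepest returns the innermost matches, and multi returns every match.

```haskell
-- Simplified stand-in for the library's Content type (an assumption).
data Content
  = Elem String [(String, String)] [Content] -- tag name, attributes, children
  | Text String                              -- character data
  deriving (Eq, Show)

type CFilter = Content -> [Content]

children :: CFilter
children (Elem _ _ cs) = cs
children _             = []

tag :: String -> CFilter
tag t c@(Elem n _ _) | n == t = [c]
tag _ _ = []

o :: CFilter -> CFilter -> CFilter
f `o` g = concatMap f . g

union :: (a -> [b]) -> (a -> [b]) -> a -> [b]
union f g x = f x ++ g x

(|>|) :: (a -> [b]) -> (a -> [b]) -> a -> [b]
(f |>| g) x = case f x of
  [] -> g x
  ys -> ys

deep, deepest, multi :: CFilter -> CFilter
deep f    = f |>| (deep f `o` children)      -- try here first, else descend
deepest f = (deepest f `o` children) |>| f   -- descend first, else try here
multi f   = f `union` (multi f `o` children) -- here and everywhere below

-- Hypothetical document with one div nested inside another.
inner, outer :: Content
inner = Elem "div" [("id", "inner")] [Text "y"]
outer = Elem "div" [("id", "outer")] [Text "x", inner]
```

With this data, `deep (tag "div") outer` returns only the outer div, `deepest (tag "div") outer` returns only the inner div, and `multi (tag "div") outer` returns both, outer first.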
Interior editing.
when :: CFilter -> CFilter -> CFilter
Interior editing: f `when` g applies f only when the predicate g succeeds, otherwise the content is unchanged.
guards :: CFilter -> CFilter -> CFilter
Interior editing: g `guards` f applies f only when the predicate g succeeds, otherwise the content is discarded.
chip :: CFilter -> CFilter
Process CHildren In Place. The filter is applied to any children of an element content, and the element rebuilt around the results.
foldXml :: CFilter -> CFilter
Recursive application of filters: a fold-like operator. Defined as f `o` chip (foldXml f).
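As an illustration of interior editing, the sketch below renames every b element to strong throughout a tree. The Content type is an assumed simplification, and when, guards and chip are written directly from their descriptions; only the foldXml definition is taken from the text above.

```haskell
-- Simplified stand-in for the library's Content type (an assumption).
data Content
  = Elem String [(String, String)] [Content] -- tag name, attributes, children
  | Text String                              -- character data
  deriving (Eq, Show)

type CFilter = Content -> [Content]

tag :: String -> CFilter
tag t c@(Elem n _ _) | n == t = [c]
tag _ _ = []

replaceTag :: String -> CFilter
replaceTag n (Elem _ attrs cs) = [Elem n attrs cs]
replaceTag _ _                 = []

o :: CFilter -> CFilter -> CFilter
f `o` g = concatMap f . g

-- Apply f where g succeeds; leave other content unchanged.
when :: CFilter -> CFilter -> CFilter
(f `when` g) c = if null (g c) then [c] else f c

-- Apply f where g succeeds; discard other content.
guards :: CFilter -> CFilter -> CFilter
(g `guards` f) c = if null (g c) then [] else f c

-- Process children in place and rebuild the element around the results.
chip :: CFilter -> CFilter
chip f (Elem n attrs cs) = [Elem n attrs (concatMap f cs)]
chip _ c                 = [c]

-- Bottom-up recursive application, as defined in the text.
foldXml :: CFilter -> CFilter
foldXml f = f `o` chip (foldXml f)

-- Hypothetical edit: rename every b element, at any depth.
boldToStrong :: CFilter
boldToStrong = foldXml (replaceTag "strong" `when` tag "b")

para :: Content
para = Elem "p" [] [Text "a ", Elem "b" [] [Elem "b" [] [Text "c"]]]
```

Applying `boldToStrong para` leaves the p element and its text in place and renames both nested b elements. Swapping when for guards in the same expression would instead drop everything that is not a b element.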
Constructive filters.
mkElem :: String -> [CFilter] -> CFilter
Build an element with the given tag name; its content is the combined results of the given list of filters.
mkElemAttr :: String -> [(String, CFilter)] -> [CFilter] -> CFilter
Build an element with the given name, attributes, and content.
replaceTag :: String -> CFilter
Rename an element tag.
replaceAttrs :: [(String, String)] -> CFilter
Replace the attributes of an element.
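A minimal sketch of the constructive filters over an assumed toy Content type. How mkElemAttr turns a filter's results into an attribute value is not stated above; this sketch concatenates the top-level text results, which is an assumption and may differ from the library.

```haskell
-- Simplified stand-in for the library's Content type (an assumption).
data Content
  = Elem String [(String, String)] [Content] -- tag name, attributes, children
  | Text String                              -- character data
  deriving (Eq, Show)

type CFilter = Content -> [Content]

children :: CFilter
children (Elem _ _ cs) = cs
children _             = []

literal :: String -> CFilter
literal s _ = [Text s]

-- New element whose content is the results of all the given filters.
mkElem :: String -> [CFilter] -> CFilter
mkElem name fs c = [Elem name [] (concatMap ($ c) fs)]

-- As mkElem, with each attribute value computed by a filter
-- (assumption: the value is the concatenated top-level text results).
mkElemAttr :: String -> [(String, CFilter)] -> [CFilter] -> CFilter
mkElemAttr name avs fs c =
  [Elem name [(k, textOf (vf c)) | (k, vf) <- avs] (concatMap ($ c) fs)]
  where
    textOf cs = concat [s | Text s <- cs]

replaceTag :: String -> CFilter
replaceTag n (Elem _ attrs cs) = [Elem n attrs cs]
replaceTag _ _                 = []

replaceAttrs :: [(String, String)] -> CFilter
replaceAttrs attrs (Elem n _ cs) = [Elem n attrs cs]
replaceAttrs _ _                 = []

-- Hypothetical example: wrap an element's children in a link.
toLink :: CFilter
toLink = mkElemAttr "a" [("href", literal "index.html")] [children]
```

For example, `toLink (Elem "span" [] [Text "home"])` builds an a element with the href attribute set and the original children as its content.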
C-like conditionals.
These definitions provide C-like conditionals, lifted to the filter level.
The (cond ? yes : no) style in C becomes (cond ?> yes :> no) in Haskell.
(?>) :: (a -> [b]) -> ThenElse (a -> [b]) -> a -> [b]
Select between the two branches of a joined conditional.
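A short sketch of how the conditional might be written and used. The Content type is an assumed simplification, and the fixity declaration (both operators right-associative at the same level, so that cond ?> yes :> no needs no parentheses) is an assumption of this sketch.

```haskell
-- Simplified stand-in for the library's Content type (an assumption).
data Content
  = Elem String [(String, String)] [Content] -- tag name, attributes, children
  | Text String                              -- character data
  deriving (Eq, Show)

type CFilter = Content -> [Content]

txt :: CFilter
txt c@(Text _) = [c]
txt _          = []

literal :: String -> CFilter
literal s _ = [Text s]

-- Assumed fixity: lets  p ?> f :> g  parse as  p ?> (f :> g).
infixr 3 ?>, :>

-- The two branches of a conditional, joined together.
data ThenElse a = a :> a

-- Take the first branch if the predicate yields any result, else the second.
(?>) :: (a -> [b]) -> ThenElse (a -> [b]) -> a -> [b]
p ?> (f :> g) = \c -> if null (p c) then g c else f c

-- Hypothetical example: classify content as text or element.
describe :: CFilter
describe = txt ?> literal "text" :> literal "element"
```

Here `describe (Text "x")` takes the first branch and `describe (Elem "p" [] [])` takes the second.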
Filters with labelled results.
type LabelFilter a = Content -> [(a, Content)]
A LabelFilter is like a CFilter except that it pairs up a polymorphic value (label) with each of its results.
Using and combining labelled filters.
oo :: (a -> CFilter) -> LabelFilter a -> CFilter
Compose a label-processing filter with a label-generating filter.
x :: (CFilter -> LabelFilter a) -> (CFilter -> LabelFilter b) -> CFilter -> LabelFilter (a, b)
Combine labels. Think of this as a pair-wise zip on labels.
Some label-generating filters.
numbered :: CFilter -> LabelFilter String
Number the results from 1 upwards.
interspersed :: String -> CFilter -> String -> LabelFilter String
In interspersed a f b, label each result of f with the string a, except for the last one, which is labelled with the string b.
tagged :: CFilter -> LabelFilter String
Label each element in the result with its tag name. Non-element results get an empty string label.
attributed :: String -> CFilter -> LabelFilter String
Label each element in the result with the value of the named attribute. Elements without the attribute, and non-element results, get an empty string label.
textlabelled :: CFilter -> LabelFilter (Maybe String)
Label each textual part of the result with its text. Because the label type is Maybe String, element results are labelled with Nothing rather than with a string.
extracted :: (Content -> a) -> CFilter -> LabelFilter a
Label each content with some information extracted from itself.
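The labelled filters can be sketched as follows, using an assumed toy Content type. The bodies are plausible definitions written from the descriptions above, and the helper labelAsAttr is hypothetical, introduced only to show a label being consumed by oo.

```haskell
-- Simplified stand-in for the library's Content type (an assumption).
data Content
  = Elem String [(String, String)] [Content] -- tag name, attributes, children
  | Text String                              -- character data
  deriving (Eq, Show)

type CFilter = Content -> [Content]
type LabelFilter a = Content -> [(a, Content)]

children :: CFilter
children (Elem _ _ cs) = cs
children _             = []

-- Feed each (label, content) pair to a label-consuming filter.
oo :: (a -> CFilter) -> LabelFilter a -> CFilter
f `oo` g = concatMap (uncurry f) . g

-- Label each result with information computed from the result itself.
extracted :: (Content -> a) -> CFilter -> LabelFilter a
extracted proj f = map (\c -> (proj c, c)) . f

numbered :: CFilter -> LabelFilter String
numbered f = zip (map show [(1 :: Int) ..]) . f

interspersed :: String -> CFilter -> String -> LabelFilter String
interspersed a f b c = zip (replicate (length rs - 1) a ++ [b]) rs
  where
    rs = f c

tagged :: CFilter -> LabelFilter String
tagged = extracted name
  where
    name (Elem n _ _) = n
    name _            = ""

-- Hypothetical label consumer: store the label in an attribute called n.
labelAsAttr :: String -> CFilter
labelAsAttr n (Elem t _ cs) = [Elem t [("n", n)] cs]
labelAsAttr _ c             = [c]

numberItems :: CFilter
numberItems = labelAsAttr `oo` numbered children

liA, liB, list :: Content
liA  = Elem "li" [] [Text "a"]
liB  = Elem "li" [] [Text "b"]
list = Elem "ul" [] [liA, liB]
```

With this data, `numbered children list` pairs the two items with "1" and "2", and `numberItems list` returns the items with those numbers stored in their n attribute.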