XML editing filters
- canonicalizeTree :: XmlFilter -> XmlFilter
- canonicalizeAllNodes :: XmlFilter
- canonicalizeForXPath :: XmlFilter
- collapseXText :: XmlFilter
- collapseAllXText :: XmlFilter
- indentDoc :: XmlFilter
- removeWhiteSpace :: XmlFilter
- removeAllWhiteSpace :: XmlFilter
- removeDocWhiteSpace :: XmlFilter
- removeComment :: XmlFilter
- removeAllComment :: XmlFilter
- transfCdata :: XmlFilter
- transfAllCdata :: XmlFilter
- transfCdataEscaped :: XmlFilter
- transfAllCdataEscaped :: XmlFilter
- transfCharRef :: XmlFilter
- transfAllCharRef :: XmlFilter
- escapeXmlDoc :: XmlFilter
- escapeXmlText :: XmlFilter
- escapeXmlAttrValue :: XmlFilter
- unparseXmlDoc :: XmlFilter
- numberLinesInXmlDoc :: XmlFilter
- numberLines :: String -> String
- treeRepOfXmlDoc :: XmlFilter
- haskellRepOfXmlDoc :: XmlFilter
- addHeadlineToXmlDoc :: XmlFilter
- addXmlPiToDoc :: XmlFilter
Documentation
canonicalizeTree :: XmlFilter -> XmlFilter
Applies some Canonical XML rules to the nodes of a tree.
The rules differ slightly between canonical XML and XPath in the handling of comments.
Note: This is not the complete canonicalization specified by the W3C
Recommendation. Adding attribute defaults and sorting attributes in lexicographic
order are done by the transform function of module Text.XML.HXT.Validator.Validation.
Replacing entities and line feed normalization are done by the parser.
Not implemented yet:
- Whitespace within start and end tags is normalized
- Special characters in attribute values and character content are replaced by character references
canonicalizeAllNodes :: XmlFilter
canonicalizes a tree and removes comments and <?xml ... ?> declarations.
see canonicalizeTree
canonicalizeForXPath :: XmlFilter
canonicalizes a tree for XPath. Comment nodes are not removed.
see canonicalizeTree
collapseXText :: XmlFilter
Collects sequences of child XText nodes into one XText node.
collapseAllXText :: XmlFilter
Applies collapseXText recursively.
see also : collapseXText
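The merging step can be illustrated with a minimal sketch. It uses a toy Node type and the hypothetical names collapseChildren and collapseAll; none of these are HXT's own definitions, and the real filters operate on HXT's tree type.

```haskell
-- Toy node type for illustration only; HXT's real tree type is richer.
data Node
  = Text String
  | Elem String [Node]
  deriving (Eq, Show)

-- Merge each run of adjacent Text children into one Text node
-- (one level only, in the spirit of collapseXText).
collapseChildren :: [Node] -> [Node]
collapseChildren (Text a : Text b : rest) = collapseChildren (Text (a ++ b) : rest)
collapseChildren (n : rest)               = n : collapseChildren rest
collapseChildren []                       = []

-- Apply the merge at every level of the tree
-- (in the spirit of collapseAllXText).
collapseAll :: Node -> Node
collapseAll (Elem name cs) = Elem name (collapseChildren (map collapseAll cs))
collapseAll n              = n
```

Children are processed before their parent, so every element's child list is merged exactly once.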
indentDoc :: XmlFilter
filter for indenting a document tree for pretty printing.
the tree is traversed to insert whitespace for tag indentation.
whitespace is only inserted or changed at places where it isn't significant; it is not inserted between tags and text containing non-whitespace chars.
preserving whitespace may be controlled in a document tree
by a tag attribute xml:space.
allowed values for this attribute are default | preserve.
input is a complete document tree. result is the semantically equivalent formatted tree.
see also : removeDocWhiteSpace
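As a rough illustration of the "insert whitespace only where it is not significant" rule, here is a minimal sketch on a toy tree. The Node type, the function indent, and the two-space step are assumptions made for the example. It ignores xml:space and is not HXT's implementation.

```haskell
-- Toy node type for illustration only.
data Node
  = Text String
  | Elem String [Node]
  deriving (Eq, Show)

-- True for text nodes that contain only XML whitespace characters.
isWhiteSpaceText :: Node -> Bool
isWhiteSpaceText (Text s) = all (`elem` " \t\r\n") s
isWhiteSpaceText _        = False

-- Indent element-only content; leave mixed content untouched, because
-- whitespace next to non-whitespace text may be significant.
indent :: Int -> Node -> Node
indent depth (Elem name cs)
  | any significantText cs = Elem name cs
  | null elems             = Elem name cs
  | otherwise              = Elem name (concatMap child elems ++ [Text (nl depth)])
  where
    elems     = filter (not . isWhiteSpaceText) cs
    child c   = [Text (nl (depth + 1)), indent (depth + 1) c]
    nl d      = '\n' : replicate (2 * d) ' '
    significantText n = case n of
      Text _ -> not (isWhiteSpaceText n)
      _      -> False
indent _ n = n
```

Existing whitespace-only text nodes between tags are dropped and replaced by fresh indentation, which matches the "inserted or changed" wording above.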
removeWhiteSpace :: XmlFilter
simple filter for removing whitespace.
no check for significant whitespace is done.
see also : removeAllWhiteSpace, removeDocWhiteSpace
removeAllWhiteSpace :: XmlFilter
simple recursive filter for removing all whitespace.
removes all text nodes in a tree that consist only of whitespace.
see also : removeWhiteSpace, removeDocWhiteSpace
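A minimal sketch of dropping whitespace-only text nodes, again on a toy tree. The Node type and the names removeWS and removeAllWS are hypothetical, and the sketch assumes that "whitespace" means the four XML whitespace characters.

```haskell
-- Toy node type for illustration only.
data Node
  = Text String
  | Elem String [Node]
  deriving (Eq, Show)

-- True for text nodes made up only of space, tab, CR or LF.
-- Note that an empty text node also counts as whitespace-only here.
isWhiteSpaceText :: Node -> Bool
isWhiteSpaceText (Text s) = all (`elem` " \t\r\n") s
isWhiteSpaceText _        = False

-- Drop whitespace-only text nodes from one child list, with no check
-- for significance (in the spirit of removeWhiteSpace).
removeWS :: [Node] -> [Node]
removeWS = filter (not . isWhiteSpaceText)

-- Do the same at every level (in the spirit of removeAllWhiteSpace).
removeAllWS :: Node -> Node
removeAllWS (Elem name cs) = Elem name (removeWS (map removeAllWS cs))
removeAllWS n              = n
```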
removeDocWhiteSpace :: XmlFilter
filter for removing all insignificant whitespace.
the tree is traversed to remove whitespace between tags
that was inserted for indentation and readability.
whitespace is only removed at places where it's not significant.
preserving whitespace may be controlled in a document tree
by a tag attribute xml:space.
allowed values for this attribute are default | preserve.
input is the root node of the document to be cleaned up. output is the semantically equivalent simplified tree.
see also : indentDoc, removeAllWhiteSpace
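The effect of xml:space can be sketched as a flag threaded through the traversal. The toy Node type and the function cleanDoc are assumptions for the example. The real filter may apply further criteria when deciding whether whitespace is significant.

```haskell
-- Toy node type with attributes, for illustration only.
data Node
  = Text String
  | Elem String [(String, String)] [Node]
  deriving (Eq, Show)

isWhiteSpaceText :: Node -> Bool
isWhiteSpaceText (Text s) = all (`elem` " \t\r\n") s
isWhiteSpaceText _        = False

-- Remove whitespace-only text nodes unless an enclosing element
-- asked for preservation with xml:space="preserve".
cleanDoc :: Node -> Node
cleanDoc = go False
  where
    go keep (Elem name attrs cs) =
      let keep' = case lookup "xml:space" attrs of
                    Just "preserve" -> True   -- switch preservation on
                    Just "default"  -> False  -- switch it off again
                    _               -> keep   -- inherit from the parent
      in Elem name attrs [ go keep' c | c <- cs, keep' || not (isWhiteSpaceText c) ]
    go _ n = n
```

The attribute value is inherited by descendants until another xml:space attribute overrides it.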
removeComment :: XmlFilter
removes comments
removeAllComment :: XmlFilter
removes all comments recursively
transfCdata :: XmlFilter
converts a CDATA section into a normal text section
transfAllCdata :: XmlFilter
converts the CDATA sections in a whole document tree
transfCdataEscaped :: XmlFilter
converts a CDATA section into normal text nodes
transfAllCdataEscaped :: XmlFilter
converts the CDATA sections in a whole document tree into normal text nodes
transfCharRef :: XmlFilter
converts character references to normal text
transfAllCharRef :: XmlFilter
recursively converts all character references to normal text
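A minimal sketch of the conversion, assuming a toy tree in which a character reference is stored as its code point. The Node type and both function names are hypothetical and are not HXT's definitions.

```haskell
import Data.Char (chr)

-- Toy node type: CharRef holds the referenced code point, e.g. 65 for &#65;.
data Node
  = Text String
  | CharRef Int
  | Elem String [Node]
  deriving (Eq, Show)

-- Replace a single character reference node by a one-character text node.
-- chr raises an error for values outside the Unicode range; a real
-- filter would have to handle that case.
charRefToText :: Node -> Node
charRefToText (CharRef n) = Text [chr n]
charRefToText n           = n

-- Apply the conversion to every node in the tree.
allCharRefsToText :: Node -> Node
allCharRefsToText (Elem name cs) = Elem name (map allCharRefsToText cs)
allCharRefsToText n              = charRefToText n
```

The sketch leaves the resulting adjacent text nodes unmerged.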
escapeXmlDoc :: XmlFilter
converts the special XML chars ", <, >, & and ' in a document to char references.
attribute values are converted with escapeXmlAttrValue.
see also : escapeXmlText, escapeXmlAttrValue
escapeXmlText :: XmlFilter
converts the special XML chars in a text or comment node into character references
see also : escapeXmlDoc
escapeXmlAttrValue :: XmlFilter
converts the special XML chars in an attribute value into character references. Not only the XML specials but also \n, \r and \t are converted.
see also : escapeXmlDoc, escapeXmlText
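The difference between text and attribute-value escaping can be sketched on plain strings. The sketch assumes decimal numeric character references and assumes that text escaping covers the same five characters listed for escapeXmlDoc. The function names are hypothetical, and the real filters work on tree nodes, not strings.

```haskell
import Data.Char (ord)

-- Decimal character reference for one character, e.g. '<' becomes "&#60;".
charRef :: Char -> String
charRef c = "&#" ++ show (ord c) ++ ";"

-- Escape the five special XML characters in character content.
escapeText :: String -> String
escapeText = concatMap esc
  where
    esc c
      | c `elem` "<>&\"'" = charRef c
      | otherwise         = [c]

-- Attribute values additionally escape newline, carriage return and tab.
escapeAttrValue :: String -> String
escapeAttrValue = concatMap esc
  where
    esc c
      | c `elem` "<>&\"'\n\r\t" = charRef c
      | otherwise               = [c]
```

For example, escapeText leaves a tab unchanged, while escapeAttrValue turns it into a character reference.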
unparseXmlDoc :: XmlFilter
converts a document tree into an output string representation with respect to the output encoding.
The children of the document root are substituted by a single text node holding the text representation of the document.
The document is encoded according to the output-encoding attribute in the root node or,
if that is not present, the encoding attribute for the original input encoding.
If the encoding is not specified or not supported, UTF-8 is taken.
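The fallback order for the encoding can be sketched as a lookup over the root node's attributes. The attribute names come from the description above. The function chooseEncoding, the case-insensitive comparison, and the list of supported encodings are assumptions made for the example.

```haskell
import Control.Applicative ((<|>))
import Data.Char (toUpper)

type Attrs = [(String, String)]

-- Hypothetical list of supported encodings; the real set is not given here.
supported :: [String]
supported = ["UTF-8", "ISO-8859-1", "US-ASCII"]

-- Pick output-encoding if present, else encoding; fall back to UTF-8
-- when neither is present or the chosen one is not supported.
chooseEncoding :: Attrs -> String
chooseEncoding attrs =
  case lookup "output-encoding" attrs <|> lookup "encoding" attrs of
    Just enc | norm enc `elem` supported -> norm enc
    _                                    -> "UTF-8"
  where
    norm = map toUpper
```

In this sketch an unsupported output-encoding falls straight through to UTF-8; it does not fall back to the encoding attribute, which follows the wording of the description.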
numberLinesInXmlDoc :: XmlFilter
converts a document into text and adds line numbers to the text representation.
Result is a root node with a single text node as child.
Useful for debugging and trace output.
see also : haskellRepOfXmlDoc, treeRepOfXmlDoc
numberLines :: String -> String
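No description is given for numberLines. Judging by its type and by numberLinesInXmlDoc, it presumably prefixes each line of a string with its line number. The following sketch shows one way to do that. The name numberLinesSketch and the layout (a right-aligned number in four columns followed by two spaces) are assumptions, not the library's actual format.

```haskell
-- Prefix every line with its 1-based line number.
numberLinesSketch :: String -> String
numberLinesSketch = unlines . zipWith addNr [1 :: Int ..] . lines
  where
    addNr n l = pad (show n) ++ "  " ++ l
    -- Right-align the number in a field of width 4; longer numbers
    -- are left as they are.
    pad s = replicate (4 - length s) ' ' ++ s
```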
treeRepOfXmlDoc :: XmlFilter
converts a document into a text representation in tree form.
Useful for debugging and trace output.
see also : haskellRepOfXmlDoc, numberLinesInXmlDoc
haskellRepOfXmlDoc :: XmlFilter
converts a document into a Haskell representation (with show).
Useful for debugging and trace output.
see also : treeRepOfXmlDoc, numberLinesInXmlDoc