html-entity-0.1.0.0: HTML entity decoding and encoding for Text

Safe Haskell: Safe
Language: Haskell2010

Text.HTMLEntity

Description

Efficient decoding and encoding of HTML entities in text.

Documentation

>>> :set -XOverloadedStrings
>>> import qualified Data.Text.IO as T

decode :: Text -> Either String Text

Decode the HTML entities contained in the given text. On failure, returns Left with an error message; the parser does its best to explain the problem.

>>> mapM_ T.putStrLn $ decode "H&eacute;llo w&circledast;rld!"
Héllo w⊛rld!
>>> decode "&NonExistentEntity;"
Left "entity: Failed reading: Unknown entity name NonExistentEntity"
>>> decode "&#100000000;"
Left "entity: Failed reading: 100000000 is out of Char range"
>>> decode "&#xFFFFFFFF;"
Left "entity: Failed reading: 4294967295 is out of Char range"
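The last two errors arise because a numeric character reference can name a value beyond the largest Unicode code point (maxBound :: Char, U+10FFFF). A minimal sketch of such a range check follows. codePointToChar is a hypothetical helper written for illustration; it is not part of Text.HTMLEntity, and the library's actual check may differ.

```haskell
import Data.Char (chr, ord)

-- Hypothetical helper (not part of Text.HTMLEntity): convert a code point
-- taken from a numeric character reference into a Char, rejecting values
-- that lie outside the range of Char.
-- Integer is used so that very large references cannot overflow.
codePointToChar :: Integer -> Either String Char
codePointToChar n
  | n < 0 || n > maxCodePoint = Left (show n ++ " is out of Char range")
  | otherwise                 = Right (chr (fromInteger n))
  where
    -- maxBound :: Char is U+10FFFF, i.e. 1114111.
    maxCodePoint = toInteger (ord maxBound)
```

Checking the bound before calling chr matters because chr itself raises a runtime error on out-of-range arguments instead of returning a value that can be handled.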

decode' :: Text -> Text

Like decode, except that if a decode error occurs, the original input is returned unmodified. Use it if you're certain that your input is well-formed.

>>> T.putStrLn $ decode' "W&doteq;ll-formed inpu&vDash;"
W≐ll-formed inpu⊨
>>> T.putStrLn $ decode' "Utter n&#xFFFFFFFF;ns&CurlyE;nse"
Utter n&#xFFFFFFFF;ns&CurlyE;nse
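The fallback behaviour described for decode' can be expressed generically with either. The sketch below makes no claim about how decode' is actually implemented. orOriginal and toyDecode are hypothetical names used only for illustration, and toyDecode is a toy stand-in that works on String so the example needs nothing beyond base.

```haskell
import Data.Char (toUpper)

-- Hypothetical combinator: run a fallible transformation and, if it fails,
-- discard the error and return the input unchanged.
orOriginal :: (a -> Either e a) -> a -> a
orOriginal f x = either (const x) id (f x)

-- Toy stand-in for a decoder (an assumption for this example only):
-- it rejects any input containing '&' and upper-cases everything else.
toyDecode :: String -> Either String String
toyDecode s
  | '&' `elem` s = Left "unexpected '&'"
  | otherwise    = Right (map toUpper s)
```

Given the signature of decode, the expression orOriginal decode would have type Text -> Text, the same type as decode'. Note that the error message is lost, so this style suits input that is already known to be well-formed.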

encode :: Text -> Text

Encodes the input for use as text in an HTML document.

encode uses named entities where possible. The exception is most symbols in the ASCII block, for which named entities were deemed to bloat the output unnecessarily.

>>> T.putStrLn $ encode "Héllo wörld!"
H&eacute;llo w&ouml;rld!
>>> T.putStrLn $ encode "x ≂̸ y"
x &NotEqualTilde; y
>>> T.putStrLn $ encode "\2534\6188"
&#x9e6;&#x182c;
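Characters that have no named entity, such as the two in the last example, can be written as numeric character references, which HTML accepts in decimal (&#N;) or hexadecimal (&#xH;) form. The sketch below builds both forms for a single character. decimalRef and hexRef are hypothetical helpers written for illustration and are not taken from the library.

```haskell
import Data.Char (ord)
import Numeric (showHex)

-- Hypothetical helper: decimal numeric character reference, e.g. "&#2534;".
decimalRef :: Char -> String
decimalRef c = "&#" ++ show (ord c) ++ ";"

-- Hypothetical helper: hexadecimal numeric character reference.
-- showHex emits lowercase digits and prepends them to its second argument.
hexRef :: Char -> String
hexRef c = "&#x" ++ showHex (ord c) ";"
```

For the character '\2534' (U+09E6), decimalRef gives "&#2534;" and hexRef gives "&#x9e6;". Both forms denote the same character.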