foldl-transduce-0.1.2.1: Transducers for foldl folds.

Safe HaskellSafe
LanguageHaskell98

Control.Foldl.Transduce.Text

Contents

Description

This module builds on module Control.Foldl.Text, adding stateful transducers and grouping operations.

Synopsis

Decoding transducers

decoder :: (ByteString -> Decoding) -> OnDecodeError -> Transducer ByteString Text () Source

Builds a decoding Transducer out of a stream-oriented decoding function from Data.Text.Encoding and an error handler from Data.Text.Encoding.Error.

utf8 :: OnDecodeError -> Transducer ByteString Text () Source

Builds a UTF8-decoding Transducer. Takes an error handler from Data.Text.Encoding.Error.

utf8lenient :: Transducer ByteString Text () Source

UTF8-decoding Transducer that replaces invalid input bytes with the Unicode replacement character U+FFFD.

>>> L.fold (transduce utf8lenient L.list) (map fromString ["decode","this"])
["decode","this"]
>>> L.fold (transduce utf8lenient L.list) (map fromString ["across \xe2","\x98\x83 boundaries"])
["across ","\9731 boundaries"]
>>> L.fold (transduce utf8lenient L.list) (map fromString ["invalid \xc3\x28 sequence"])
["invalid \65533 sequence"]
>>> L.fold (transduce utf8lenient L.list) (map fromString ["incomplete \xe2"])
["incomplete ","\65533"]

utf8strict :: Transducer ByteString Text () Source

BEWARE! This Transducer may throw UnicodeException. BEWARE!

>>> L.fold (transduce utf8strict L.list) (map fromString ["invalid \xc3\x28 sequence"])
*** Exception: Cannot decode byte '\x28': Data.Text.Internal.Encoding.streamDecodeUtf8With: Invalid UTF-8 stream
>>> L.fold (transduce utf8strict L.list) (map fromString ["incomplete \xe2"])
*** Exception: Cannot decode input: leftovers

decoderE :: MonadIO m => (OnDecodeError -> ByteString -> Decoding) -> TransducerM (ExceptT UnicodeException m) ByteString Text () Source

Similar to decoder, but catches UnicodeException in IO and uses Except to communicate the error.

utf8E :: MonadIO m => TransducerM (ExceptT UnicodeException m) ByteString Text () Source

Like utf8strict, but catches UnicodeException in IO and uses Except to communicate the error.

>>> runExceptT $ L.foldM (transduceM utf8E (L.generalize L.list)) (map fromString ["invalid \xc3\x28 sequence"])
Left Cannot decode byte '\x28': Data.Text.Internal.Encoding.streamDecodeUtf8With: Invalid UTF-8 stream
>>> runExceptT $ L.foldM (transduceM utf8E (L.generalize L.list)) (map fromString ["incomplete \xe2"])
Left Cannot decode input: leftovers

Other transducers

newline :: Transducer Text Text () Source

Appends a newline at the end of the stream.

>>> L.fold (transduce newline L.list) (map T.pack ["without","newline"])
["without","newline","\n"]

stripStart :: Transducer Text Text () Source

Remove leading white space from a stream of Text.

>>> L.fold (transduce stripStart L.list) (map T.pack ["   ","", "   text "])
["text "]

stripEnd :: Transducer Text Text () Source

Remove trailing white space from a stream of Text.

BEWARE! This function naively accumulates in memory any arriving "blank blocks" of text until a non-blank block or end-of-stream arrives, and therefore it is potentially dangerous. Do not use with untrusted inputs.

>>> L.fold (transduce stripEnd L.list) (map T.pack [" ", " \n  text ", "   ", "" , " "])
[" "," \n  text"]

Splitters

lines :: Splitter Text Source

Splits a stream into lines, removing the newlines.

>>> L.fold (L.groups lines id L.list) (map T.pack ["line 1\n line 2\n"])
["line 1"," line 2"]
>>> L.fold (L.groups lines (transduce newline) L.list) (map T.pack ["line 1\n line 2\n"])
["line 1","\n"," line 2","\n"]