| Safe Haskell | Trustworthy |
|---|---|
| Language | Haskell2010 |
Pipes.Text
Contents
Description
This module provides pipes utilities for "text streams", which are
streams of Text chunks. The individual chunks are uniformly strict, but
a Producer can be converted to and from lazy Texts, though this is generally
unwise. Where pipes IO replaces lazy IO, 'Producer Text m r' replaces lazy Text.
An Handle can be associated with a Producer or Consumer according as it is read or written to.
To stream to or from Handles, one can use fromHandle or toHandle. For
example, the following program copies a document from one file to another:
import Pipes
import qualified Data.Text.Pipes as Text
import System.IO
main =
withFile "inFile.txt" ReadMode $ \hIn ->
withFile "outFile.txt" WriteMode $ \hOut ->
runEffect $ Text.fromHandle hIn >-> Text.toHandle hOutTo stream from files, the following is perhaps more Prelude-like (note that it uses Pipes.Safe):
import Pipes import qualified Data.Text.Pipes as Text import Pipes.Safe main = runSafeT $ runEffect $ Text.readFile "inFile.txt" >-> Text.writeFile "outFile.txt"
You can stream to and from stdin and stdout using the predefined stdin
and stdout pipes, as with the following "echo" program:
main = runEffect $ Text.stdin >-> Text.stdout
You can also translate pure lazy Texts to and from pipes:
main = runEffect $ Text.fromLazy (TL.pack "Hello, world!\n") >-> Text.stdout
In addition, this module provides many functions equivalent to lazy
Text functions so that you can transform or fold text streams. For
example, to stream only the first three lines of stdin to stdout you
might write:
import Pipes
import qualified Pipes.Text as Text
import qualified Pipes.Parse as Parse
main = runEffect $ takeLines 3 Text.stdin >-> Text.stdout
where
takeLines n = Text.unlines . Parse.takeFree n . Text.linesThe above program will never bring more than one chunk of text (~ 32 KB) into memory, no matter how long the lines are.
Note that functions in this library are designed to operate on streams that
are insensitive to text boundaries. This means that they may freely split
text into smaller texts, discard empty texts. However, apart from the
special case of concatMap, they will never concatenate texts in order
to provide strict upper bounds on memory usage -- with the single exception of concatMap.
- fromLazy :: Monad m => Text -> Producer' Text m ()
- stdin :: MonadIO m => Producer Text m ()
- fromHandle :: MonadIO m => Handle -> Producer Text m ()
- readFile :: MonadSafe m => FilePath -> Producer Text m ()
- stdout :: MonadIO m => Consumer' Text m ()
- toHandle :: MonadIO m => Handle -> Consumer' Text m r
- writeFile :: MonadSafe m => FilePath -> Consumer' Text m ()
- map :: Monad m => (Char -> Char) -> Pipe Text Text m r
- concatMap :: Monad m => (Char -> Text) -> Pipe Text Text m r
- take :: (Monad m, Integral a) => a -> Pipe Text Text m ()
- drop :: (Monad m, Integral a) => a -> Pipe Text Text m r
- takeWhile :: Monad m => (Char -> Bool) -> Pipe Text Text m ()
- dropWhile :: Monad m => (Char -> Bool) -> Pipe Text Text m r
- filter :: Monad m => (Char -> Bool) -> Pipe Text Text m r
- scan :: Monad m => (Char -> Char -> Char) -> Char -> Pipe Text Text m r
- encodeUtf8 :: Monad m => Pipe Text ByteString m r
- pack :: Monad m => Pipe String Text m r
- unpack :: Monad m => Pipe Text String m r
- toCaseFold :: Monad m => Pipe Text Text m ()
- toLower :: Monad m => Pipe Text Text m ()
- toUpper :: Monad m => Pipe Text Text m ()
- stripStart :: Monad m => Pipe Text Text m r
- toLazy :: Producer Text Identity () -> Text
- toLazyM :: Monad m => Producer Text m () -> m Text
- foldChars :: Monad m => (x -> Char -> x) -> x -> (x -> r) -> Producer Text m () -> m r
- head :: Monad m => Producer Text m () -> m (Maybe Char)
- last :: Monad m => Producer Text m () -> m (Maybe Char)
- null :: Monad m => Producer Text m () -> m Bool
- length :: (Monad m, Num n) => Producer Text m () -> m n
- any :: Monad m => (Char -> Bool) -> Producer Text m () -> m Bool
- all :: Monad m => (Char -> Bool) -> Producer Text m () -> m Bool
- maximum :: Monad m => Producer Text m () -> m (Maybe Char)
- minimum :: Monad m => Producer Text m () -> m (Maybe Char)
- find :: Monad m => (Char -> Bool) -> Producer Text m () -> m (Maybe Char)
- index :: (Monad m, Integral a) => a -> Producer Text m () -> m (Maybe Char)
- count :: (Monad m, Num n) => Text -> Producer Text m () -> m n
- nextChar :: Monad m => Producer Text m r -> m (Either r (Char, Producer Text m r))
- drawChar :: Monad m => Parser Text m (Maybe Char)
- unDrawChar :: Monad m => Char -> Parser Text m ()
- peekChar :: Monad m => Parser Text m (Maybe Char)
- isEndOfChars :: Monad m => Parser Text m Bool
- splitAt :: (Monad m, Integral n) => n -> Lens' (Producer Text m r) (Producer Text m (Producer Text m r))
- span :: Monad m => (Char -> Bool) -> Lens' (Producer Text m r) (Producer Text m (Producer Text m r))
- break :: Monad m => (Char -> Bool) -> Lens' (Producer Text m r) (Producer Text m (Producer Text m r))
- groupBy :: Monad m => (Char -> Char -> Bool) -> Lens' (Producer Text m r) (Producer Text m (Producer Text m r))
- group :: Monad m => Lens' (Producer Text m r) (Producer Text m (Producer Text m r))
- word :: Monad m => Lens' (Producer Text m r) (Producer Text m (Producer Text m r))
- line :: Monad m => Lens' (Producer Text m r) (Producer Text m (Producer Text m r))
- decodeUtf8 :: Monad m => Lens' (Producer ByteString m r) (Producer Text m (Producer ByteString m r))
- codec :: Monad m => Codec -> Lens' (Producer ByteString m r) (Producer Text m (Producer ByteString m r))
- utf8 :: Codec
- utf16_le :: Codec
- utf16_be :: Codec
- utf32_le :: Codec
- utf32_be :: Codec
- decodeIso8859_1 :: Monad m => Producer ByteString m r -> Producer Text m (Producer ByteString m r)
- decodeAscii :: Monad m => Producer ByteString m r -> Producer Text m (Producer ByteString m r)
- encodeIso8859_1 :: Monad m => Producer Text m r -> Producer ByteString m (Producer Text m r)
- encodeAscii :: Monad m => Producer Text m r -> Producer ByteString m (Producer Text m r)
- chunksOf :: (Monad m, Integral n) => n -> Lens' (Producer Text m r) (FreeT (Producer Text m) m r)
- splitsWith :: Monad m => (Char -> Bool) -> Producer Text m r -> FreeT (Producer Text m) m r
- splits :: Monad m => Char -> Lens' (Producer Text m r) (FreeT (Producer Text m) m r)
- lines :: Monad m => Iso' (Producer Text m r) (FreeT (Producer Text m) m r)
- words :: Monad m => Iso' (Producer Text m r) (FreeT (Producer Text m) m r)
- intersperse :: Monad m => Char -> Producer Text m r -> Producer Text m r
- packChars :: Monad m => Iso' (Producer Char m x) (Producer Text m x)
- intercalate :: Monad m => Producer Text m () -> FreeT (Producer Text m) m r -> Producer Text m r
- unlines :: Monad m => FreeT (Producer Text m) m r -> Producer Text m r
- unwords :: Monad m => FreeT (Producer Text m) m r -> Producer Text m r
- data Decoding
- = Some Text ByteString (ByteString -> Decoding)
- | Other Text ByteString
- streamDecodeUtf8 :: ByteString -> Decoding
- decodeSomeUtf8 :: ByteString -> (Text, ByteString)
- data Codec = Codec {
- codecName :: Text
- codecEncode :: Text -> (ByteString, Maybe (TextException, Text))
- codecDecode :: ByteString -> Decoding
- data TextException
- module Data.ByteString
- module Data.Text
- module Data.Profunctor
- module Data.Word
- module Pipes.Parse
- module Pipes.Group
Producers
readFile :: MonadSafe m => FilePath -> Producer Text m () Source
Stream text from a file in the simple fashion of Data.Text.IO
>>>runSafeT $ runEffect $ Text.readFile "hello.hs" >-> Text.map toUpper >-> hoist lift Text.stdoutMAIN = PUTSTRLN "HELLO WORLD"
Consumers
toHandle :: MonadIO m => Handle -> Consumer' Text m r Source
Convert a text stream into a Handle
Note: again, for best performance, where possible use
(for source (liftIO . hPutStr handle)) instead of (source >-> toHandle handle).
writeFile :: MonadSafe m => FilePath -> Consumer' Text m () Source
Stream text into a file. Uses pipes-safe.
Pipes
map :: Monad m => (Char -> Char) -> Pipe Text Text m r Source
Apply a transformation to each Char in the stream
concatMap :: Monad m => (Char -> Text) -> Pipe Text Text m r Source
Map a function over the characters of a text stream and concatenate the results
take :: (Monad m, Integral a) => a -> Pipe Text Text m () Source
(take n) only allows n individual characters to pass;
contrast Pipes.Prelude.take which would let n chunks pass.
drop :: (Monad m, Integral a) => a -> Pipe Text Text m r Source
(drop n) drops the first n characters
takeWhile :: Monad m => (Char -> Bool) -> Pipe Text Text m () Source
Take characters until they fail the predicate
dropWhile :: Monad m => (Char -> Bool) -> Pipe Text Text m r Source
Drop characters until they fail the predicate
filter :: Monad m => (Char -> Bool) -> Pipe Text Text m r Source
Only allows Chars to pass if they satisfy the predicate
scan :: Monad m => (Char -> Char -> Char) -> Char -> Pipe Text Text m r Source
Strict left scan over the characters
encodeUtf8 :: Monad m => Pipe Text ByteString m r Source
Transform a Pipe of Text into a Pipe of ByteStrings using UTF-8
encoding; encodeUtf8 = Pipes.Prelude.map TE.encodeUtf8 so more complex
encoding pipes can easily be constructed with the functions in Data.Text.Encoding
stripStart :: Monad m => Pipe Text Text m r Source
Remove leading white space from an incoming succession of Texts
Folds
foldChars :: Monad m => (x -> Char -> x) -> x -> (x -> r) -> Producer Text m () -> m r Source
Reduce the text stream using a strict left fold over characters
length :: (Monad m, Num n) => Producer Text m () -> m n Source
Count the number of characters in the stream
maximum :: Monad m => Producer Text m () -> m (Maybe Char) Source
Return the maximum Char within a text stream
minimum :: Monad m => Producer Text m () -> m (Maybe Char) Source
Return the minimum Char within a text stream (surely very useful!)
find :: Monad m => (Char -> Bool) -> Producer Text m () -> m (Maybe Char) Source
Find the first element in the stream that matches the predicate
index :: (Monad m, Integral a) => a -> Producer Text m () -> m (Maybe Char) Source
Index into a text stream
count :: (Monad m, Num n) => Text -> Producer Text m () -> m n Source
Store a tally of how many segments match the given Text
Primitive Character Parsers
The following parsing utilities are single-character analogs of the ones found
pipes-parse.
peekChar :: Monad m => Parser Text m (Maybe Char) Source
peekChar checks the first Char in the stream, but uses unDrawChar to
push the Char back
peekChar = do
x <- drawChar
case x of
Left _ -> return ()
Right c -> unDrawChar c
return xisEndOfChars :: Monad m => Parser Text m Bool Source
Check if the underlying Producer has no more characters
Note that this will skip over empty Text chunks, unlike
isEndOfInput from pipes-parse, which would consider
an empty Text a valid bit of input.
isEndOfChars = liftM isLeft peekChar
Parsing Lenses
splitAt :: (Monad m, Integral n) => n -> Lens' (Producer Text m r) (Producer Text m (Producer Text m r)) Source
Splits a Producer after the given number of characters
span :: Monad m => (Char -> Bool) -> Lens' (Producer Text m r) (Producer Text m (Producer Text m r)) Source
Split a text stream in two, where the first text stream is the longest consecutive group of text that satisfy the predicate
break :: Monad m => (Char -> Bool) -> Lens' (Producer Text m r) (Producer Text m (Producer Text m r)) Source
Split a text stream in two, where the first text stream is the longest consecutive group of characters that don't satisfy the predicate
groupBy :: Monad m => (Char -> Char -> Bool) -> Lens' (Producer Text m r) (Producer Text m (Producer Text m r)) Source
Improper lens that splits after the first group of equivalent Chars, as defined by the given equivalence relation
group :: Monad m => Lens' (Producer Text m r) (Producer Text m (Producer Text m r)) Source
Improper lens that splits after the first succession of identical Char s
Decoding Lenses
decodeUtf8 :: Monad m => Lens' (Producer ByteString m r) (Producer Text m (Producer ByteString m r)) Source
An improper lens into a stream of ByteString expected to be UTF-8 encoded; the associated
stream of Text ends by returning a stream of ByteStrings beginning at the point of failure.
codec :: Monad m => Codec -> Lens' (Producer ByteString m r) (Producer Text m (Producer ByteString m r)) Source
Use a Codec as a pipes-style Lens into a byte stream; the available Codec s are
utf8, utf16_le, utf16_be, utf32_le, utf32_be . The Codec concept and the
individual Codec definitions follow the enumerator and conduit libraries.
Utf8 is handled differently in this library -- without the use of unsafePerformIO &co
to catch Text exceptions; but the same 'mypipe ^. codec utf8' interface can be used.
'mypipe ^. decodeUtf8' should be the same, but has a somewhat more direct and thus perhaps
better implementation.
Codecs
Other Decoding/Encoding Functions
decodeIso8859_1 :: Monad m => Producer ByteString m r -> Producer Text m (Producer ByteString m r) Source
Reduce a byte stream to a corresponding stream of ascii chars, returning the
unused ByteString upon hitting the rare un-latinizable byte.
decodeAscii :: Monad m => Producer ByteString m r -> Producer Text m (Producer ByteString m r) Source
Reduce a byte stream to a corresponding stream of ascii chars, returning the
unused ByteString upon hitting an un-ascii byte.
encodeIso8859_1 :: Monad m => Producer Text m r -> Producer ByteString m (Producer Text m r) Source
encodeAscii :: Monad m => Producer Text m r -> Producer ByteString m (Producer Text m r) Source
ascii and latin encodings only represent a small fragment of Text; thus we cannot
use the pipes Lens style to work with them. Rather we simply define functions
each way.
encodeAscii : Reduce as much of your stream of Text actually is ascii to a byte stream,
returning the rest of the Text at the first non-ascii Char
FreeT Splitters
chunksOf :: (Monad m, Integral n) => n -> Lens' (Producer Text m r) (FreeT (Producer Text m) m r) Source
Split a text stream into FreeT-delimited text streams of fixed size
splitsWith :: Monad m => (Char -> Bool) -> Producer Text m r -> FreeT (Producer Text m) m r Source
Split a text stream into sub-streams delimited by characters that satisfy the predicate
splits :: Monad m => Char -> Lens' (Producer Text m r) (FreeT (Producer Text m) m r) Source
Split a text stream using the given Char as the delimiter
lines :: Monad m => Iso' (Producer Text m r) (FreeT (Producer Text m) m r) Source
Split a text stream into FreeT-delimited lines
words :: Monad m => Iso' (Producer Text m r) (FreeT (Producer Text m) m r) Source
Split a text stream into FreeT-delimited words
Transformations
packChars :: Monad m => Iso' (Producer Char m x) (Producer Text m x) Source
Improper isomorphism between a Producer of ByteStrings and Word8s
Joiners
intercalate :: Monad m => Producer Text m () -> FreeT (Producer Text m) m r -> Producer Text m r Source
intercalate concatenates the FreeT-delimited text streams after
interspersing a text stream in between them
unlines :: Monad m => FreeT (Producer Text m) m r -> Producer Text m r Source
Join FreeT-delimited lines into a text stream
unwords :: Monad m => FreeT (Producer Text m) m r -> Producer Text m r Source
Join FreeT-delimited words into a text stream
Re-exports
Data.Text re-exports the Text type.
Pipes.Parse re-exports input, concat, FreeT (the type) and the Parse synonym.
Constructors
| Some Text ByteString (ByteString -> Decoding) | |
| Other Text ByteString |
decodeSomeUtf8 :: ByteString -> (Text, ByteString) Source
A specific character encoding.
Constructors
| Codec | |
Fields
| |
data TextException Source
Constructors
| DecodeException Codec Word8 | |
| EncodeException Codec Char | |
| LengthExceeded Int | |
| TextException SomeException |
Instances
module Data.ByteString
module Data.Text
module Data.Profunctor
module Data.Word
module Pipes.Parse
module Pipes.Group