Safe Haskell | Trustworthy |
---|---|
Language | Haskell2010 |
This module provides pipes
utilities for "text streams", which are
streams of Text
chunks. The individual chunks are uniformly strict
, but
a Producer
can be converted to and from lazy Text
s, though this is generally
unwise. Where pipes IO replaces lazy IO, 'Producer Text m r' replaces lazy Text
.
An Handle
can be associated with a Producer
or Consumer
according as it is read or written to.
To stream to or from Handle
s, one can use fromHandle
or toHandle
. For
example, the following program copies a document from one file to another:
import Pipes import qualified Data.Text.Pipes as Text import System.IO main = withFile "inFile.txt" ReadMode $ \hIn -> withFile "outFile.txt" WriteMode $ \hOut -> runEffect $ Text.fromHandle hIn >-> Text.toHandle hOut
To stream from files, the following is perhaps more Prelude-like (note that it uses Pipes.Safe):
import Pipes import qualified Data.Text.Pipes as Text import Pipes.Safe main = runSafeT $ runEffect $ Text.readFile "inFile.txt" >-> Text.writeFile "outFile.txt"
You can stream to and from stdin
and stdout
using the predefined stdin
and stdout
pipes, as with the following "echo" program:
main = runEffect $ Text.stdin >-> Text.stdout
You can also translate pure lazy Text
s to and from pipes:
main = runEffect $ Text.fromLazy (TL.pack "Hello, world!\n") >-> Text.stdout
In addition, this module provides many functions equivalent to lazy
Text
functions so that you can transform or fold text streams. For
example, to stream only the first three lines of stdin
to stdout
you
might write:
import Pipes import qualified Pipes.Text as Text import qualified Pipes.Parse as Parse main = runEffect $ takeLines 3 Text.stdin >-> Text.stdout where takeLines n = Text.unlines . Parse.takeFree n . Text.lines
The above program will never bring more than one chunk of text (~ 32 KB) into memory, no matter how long the lines are.
Note that functions in this library are designed to operate on streams that
are insensitive to text boundaries. This means that they may freely split
text into smaller texts, discard empty texts. However, apart from the
special case of concatMap
, they will never concatenate texts in order
to provide strict upper bounds on memory usage -- with the single exception of concatMap
.
- fromLazy :: Monad m => Text -> Producer' Text m ()
- stdin :: MonadIO m => Producer Text m ()
- fromHandle :: MonadIO m => Handle -> Producer Text m ()
- readFile :: MonadSafe m => FilePath -> Producer Text m ()
- stdout :: MonadIO m => Consumer' Text m ()
- toHandle :: MonadIO m => Handle -> Consumer' Text m r
- writeFile :: MonadSafe m => FilePath -> Consumer' Text m ()
- map :: Monad m => (Char -> Char) -> Pipe Text Text m r
- concatMap :: Monad m => (Char -> Text) -> Pipe Text Text m r
- take :: (Monad m, Integral a) => a -> Pipe Text Text m ()
- drop :: (Monad m, Integral a) => a -> Pipe Text Text m r
- takeWhile :: Monad m => (Char -> Bool) -> Pipe Text Text m ()
- dropWhile :: Monad m => (Char -> Bool) -> Pipe Text Text m r
- filter :: Monad m => (Char -> Bool) -> Pipe Text Text m r
- scan :: Monad m => (Char -> Char -> Char) -> Char -> Pipe Text Text m r
- encodeUtf8 :: Monad m => Pipe Text ByteString m r
- pack :: Monad m => Pipe String Text m r
- unpack :: Monad m => Pipe Text String m r
- toCaseFold :: Monad m => Pipe Text Text m ()
- toLower :: Monad m => Pipe Text Text m ()
- toUpper :: Monad m => Pipe Text Text m ()
- stripStart :: Monad m => Pipe Text Text m r
- toLazy :: Producer Text Identity () -> Text
- toLazyM :: Monad m => Producer Text m () -> m Text
- foldChars :: Monad m => (x -> Char -> x) -> x -> (x -> r) -> Producer Text m () -> m r
- head :: Monad m => Producer Text m () -> m (Maybe Char)
- last :: Monad m => Producer Text m () -> m (Maybe Char)
- null :: Monad m => Producer Text m () -> m Bool
- length :: (Monad m, Num n) => Producer Text m () -> m n
- any :: Monad m => (Char -> Bool) -> Producer Text m () -> m Bool
- all :: Monad m => (Char -> Bool) -> Producer Text m () -> m Bool
- maximum :: Monad m => Producer Text m () -> m (Maybe Char)
- minimum :: Monad m => Producer Text m () -> m (Maybe Char)
- find :: Monad m => (Char -> Bool) -> Producer Text m () -> m (Maybe Char)
- index :: (Monad m, Integral a) => a -> Producer Text m () -> m (Maybe Char)
- count :: (Monad m, Num n) => Text -> Producer Text m () -> m n
- nextChar :: Monad m => Producer Text m r -> m (Either r (Char, Producer Text m r))
- drawChar :: Monad m => Parser Text m (Maybe Char)
- unDrawChar :: Monad m => Char -> Parser Text m ()
- peekChar :: Monad m => Parser Text m (Maybe Char)
- isEndOfChars :: Monad m => Parser Text m Bool
- splitAt :: (Monad m, Integral n) => n -> Lens' (Producer Text m r) (Producer Text m (Producer Text m r))
- span :: Monad m => (Char -> Bool) -> Lens' (Producer Text m r) (Producer Text m (Producer Text m r))
- break :: Monad m => (Char -> Bool) -> Lens' (Producer Text m r) (Producer Text m (Producer Text m r))
- groupBy :: Monad m => (Char -> Char -> Bool) -> Lens' (Producer Text m r) (Producer Text m (Producer Text m r))
- group :: Monad m => Lens' (Producer Text m r) (Producer Text m (Producer Text m r))
- word :: Monad m => Lens' (Producer Text m r) (Producer Text m (Producer Text m r))
- line :: Monad m => Lens' (Producer Text m r) (Producer Text m (Producer Text m r))
- decodeUtf8 :: Monad m => Lens' (Producer ByteString m r) (Producer Text m (Producer ByteString m r))
- codec :: Monad m => Codec -> Lens' (Producer ByteString m r) (Producer Text m (Producer ByteString m r))
- utf8 :: Codec
- utf16_le :: Codec
- utf16_be :: Codec
- utf32_le :: Codec
- utf32_be :: Codec
- decodeIso8859_1 :: Monad m => Producer ByteString m r -> Producer Text m (Producer ByteString m r)
- decodeAscii :: Monad m => Producer ByteString m r -> Producer Text m (Producer ByteString m r)
- encodeIso8859_1 :: Monad m => Producer Text m r -> Producer ByteString m (Producer Text m r)
- encodeAscii :: Monad m => Producer Text m r -> Producer ByteString m (Producer Text m r)
- chunksOf :: (Monad m, Integral n) => n -> Lens' (Producer Text m r) (FreeT (Producer Text m) m r)
- splitsWith :: Monad m => (Char -> Bool) -> Producer Text m r -> FreeT (Producer Text m) m r
- splits :: Monad m => Char -> Lens' (Producer Text m r) (FreeT (Producer Text m) m r)
- lines :: Monad m => Iso' (Producer Text m r) (FreeT (Producer Text m) m r)
- words :: Monad m => Iso' (Producer Text m r) (FreeT (Producer Text m) m r)
- intersperse :: Monad m => Char -> Producer Text m r -> Producer Text m r
- packChars :: Monad m => Iso' (Producer Char m x) (Producer Text m x)
- intercalate :: Monad m => Producer Text m () -> FreeT (Producer Text m) m r -> Producer Text m r
- unlines :: Monad m => FreeT (Producer Text m) m r -> Producer Text m r
- unwords :: Monad m => FreeT (Producer Text m) m r -> Producer Text m r
- data Decoding
- = Some Text ByteString (ByteString -> Decoding)
- | Other Text ByteString
- streamDecodeUtf8 :: ByteString -> Decoding
- decodeSomeUtf8 :: ByteString -> (Text, ByteString)
- data Codec = Codec {
- codecName :: Text
- codecEncode :: Text -> (ByteString, Maybe (TextException, Text))
- codecDecode :: ByteString -> Decoding
- data TextException
- module Data.ByteString
- module Data.Text
- module Data.Profunctor
- module Data.Word
- module Pipes.Parse
- module Pipes.Group
Producers
readFile :: MonadSafe m => FilePath -> Producer Text m () Source
Stream text from a file in the simple fashion of Data.Text.IO
>>>
runSafeT $ runEffect $ Text.readFile "hello.hs" >-> Text.map toUpper >-> hoist lift Text.stdout
MAIN = PUTSTRLN "HELLO WORLD"
Consumers
toHandle :: MonadIO m => Handle -> Consumer' Text m r Source
Convert a text stream into a Handle
Note: again, for best performance, where possible use
(for source (liftIO . hPutStr handle))
instead of (source >-> toHandle handle)
.
writeFile :: MonadSafe m => FilePath -> Consumer' Text m () Source
Stream text into a file. Uses pipes-safe
.
Pipes
map :: Monad m => (Char -> Char) -> Pipe Text Text m r Source
Apply a transformation to each Char
in the stream
concatMap :: Monad m => (Char -> Text) -> Pipe Text Text m r Source
Map a function over the characters of a text stream and concatenate the results
take :: (Monad m, Integral a) => a -> Pipe Text Text m () Source
(take n)
only allows n
individual characters to pass;
contrast Pipes.Prelude.take
which would let n
chunks pass.
drop :: (Monad m, Integral a) => a -> Pipe Text Text m r Source
(drop n)
drops the first n
characters
takeWhile :: Monad m => (Char -> Bool) -> Pipe Text Text m () Source
Take characters until they fail the predicate
dropWhile :: Monad m => (Char -> Bool) -> Pipe Text Text m r Source
Drop characters until they fail the predicate
filter :: Monad m => (Char -> Bool) -> Pipe Text Text m r Source
Only allows Char
s to pass if they satisfy the predicate
scan :: Monad m => (Char -> Char -> Char) -> Char -> Pipe Text Text m r Source
Strict left scan over the characters
encodeUtf8 :: Monad m => Pipe Text ByteString m r Source
Transform a Pipe of Text
into a Pipe of ByteString
s using UTF-8
encoding; encodeUtf8 = Pipes.Prelude.map TE.encodeUtf8
so more complex
encoding pipes can easily be constructed with the functions in Data.Text.Encoding
stripStart :: Monad m => Pipe Text Text m r Source
Remove leading white space from an incoming succession of Text
s
Folds
foldChars :: Monad m => (x -> Char -> x) -> x -> (x -> r) -> Producer Text m () -> m r Source
Reduce the text stream using a strict left fold over characters
length :: (Monad m, Num n) => Producer Text m () -> m n Source
Count the number of characters in the stream
maximum :: Monad m => Producer Text m () -> m (Maybe Char) Source
Return the maximum Char
within a text stream
minimum :: Monad m => Producer Text m () -> m (Maybe Char) Source
Return the minimum Char
within a text stream (surely very useful!)
find :: Monad m => (Char -> Bool) -> Producer Text m () -> m (Maybe Char) Source
Find the first element in the stream that matches the predicate
index :: (Monad m, Integral a) => a -> Producer Text m () -> m (Maybe Char) Source
Index into a text stream
count :: (Monad m, Num n) => Text -> Producer Text m () -> m n Source
Store a tally of how many segments match the given Text
Primitive Character Parsers
The following parsing utilities are single-character analogs of the ones found
pipes-parse
.
peekChar :: Monad m => Parser Text m (Maybe Char) Source
peekChar
checks the first Char
in the stream, but uses unDrawChar
to
push the Char
back
peekChar = do x <- drawChar case x of Left _ -> return () Right c -> unDrawChar c return x
isEndOfChars :: Monad m => Parser Text m Bool Source
Check if the underlying Producer
has no more characters
Note that this will skip over empty Text
chunks, unlike
isEndOfInput
from pipes-parse
, which would consider
an empty Text
a valid bit of input.
isEndOfChars = liftM isLeft peekChar
Parsing Lenses
splitAt :: (Monad m, Integral n) => n -> Lens' (Producer Text m r) (Producer Text m (Producer Text m r)) Source
Splits a Producer
after the given number of characters
span :: Monad m => (Char -> Bool) -> Lens' (Producer Text m r) (Producer Text m (Producer Text m r)) Source
Split a text stream in two, where the first text stream is the longest consecutive group of text that satisfy the predicate
break :: Monad m => (Char -> Bool) -> Lens' (Producer Text m r) (Producer Text m (Producer Text m r)) Source
Split a text stream in two, where the first text stream is the longest consecutive group of characters that don't satisfy the predicate
groupBy :: Monad m => (Char -> Char -> Bool) -> Lens' (Producer Text m r) (Producer Text m (Producer Text m r)) Source
Improper lens that splits after the first group of equivalent Chars, as defined by the given equivalence relation
group :: Monad m => Lens' (Producer Text m r) (Producer Text m (Producer Text m r)) Source
Improper lens that splits after the first succession of identical Char
s
Decoding Lenses
decodeUtf8 :: Monad m => Lens' (Producer ByteString m r) (Producer Text m (Producer ByteString m r)) Source
An improper lens into a stream of ByteString
expected to be UTF-8 encoded; the associated
stream of Text ends by returning a stream of ByteStrings beginning at the point of failure.
codec :: Monad m => Codec -> Lens' (Producer ByteString m r) (Producer Text m (Producer ByteString m r)) Source
Use a Codec
as a pipes-style Lens
into a byte stream; the available Codec
s are
utf8
, utf16_le
, utf16_be
, utf32_le
, utf32_be
. The Codec
concept and the
individual Codec
definitions follow the enumerator and conduit libraries.
Utf8 is handled differently in this library -- without the use of unsafePerformIO
&co
to catch Text
exceptions; but the same 'mypipe ^. codec utf8' interface can be used.
'mypipe ^. decodeUtf8' should be the same, but has a somewhat more direct and thus perhaps
better implementation.
Codecs
Other Decoding/Encoding Functions
decodeIso8859_1 :: Monad m => Producer ByteString m r -> Producer Text m (Producer ByteString m r) Source
Reduce a byte stream to a corresponding stream of ascii chars, returning the
unused ByteString
upon hitting the rare un-latinizable byte.
decodeAscii :: Monad m => Producer ByteString m r -> Producer Text m (Producer ByteString m r) Source
Reduce a byte stream to a corresponding stream of ascii chars, returning the
unused ByteString
upon hitting an un-ascii byte.
encodeIso8859_1 :: Monad m => Producer Text m r -> Producer ByteString m (Producer Text m r) Source
encodeAscii :: Monad m => Producer Text m r -> Producer ByteString m (Producer Text m r) Source
ascii and latin encodings only represent a small fragment of Text
; thus we cannot
use the pipes Lens
style to work with them. Rather we simply define functions
each way.
encodeAscii
: Reduce as much of your stream of Text
actually is ascii to a byte stream,
returning the rest of the Text
at the first non-ascii Char
FreeT Splitters
chunksOf :: (Monad m, Integral n) => n -> Lens' (Producer Text m r) (FreeT (Producer Text m) m r) Source
Split a text stream into FreeT
-delimited text streams of fixed size
splitsWith :: Monad m => (Char -> Bool) -> Producer Text m r -> FreeT (Producer Text m) m r Source
Split a text stream into sub-streams delimited by characters that satisfy the predicate
splits :: Monad m => Char -> Lens' (Producer Text m r) (FreeT (Producer Text m) m r) Source
Split a text stream using the given Char
as the delimiter
lines :: Monad m => Iso' (Producer Text m r) (FreeT (Producer Text m) m r) Source
Split a text stream into FreeT
-delimited lines
words :: Monad m => Iso' (Producer Text m r) (FreeT (Producer Text m) m r) Source
Split a text stream into FreeT
-delimited words
Transformations
packChars :: Monad m => Iso' (Producer Char m x) (Producer Text m x) Source
Improper isomorphism between a Producer
of ByteString
s and Word8
s
Joiners
intercalate :: Monad m => Producer Text m () -> FreeT (Producer Text m) m r -> Producer Text m r Source
intercalate
concatenates the FreeT
-delimited text streams after
interspersing a text stream in between them
unlines :: Monad m => FreeT (Producer Text m) m r -> Producer Text m r Source
Join FreeT
-delimited lines into a text stream
unwords :: Monad m => FreeT (Producer Text m) m r -> Producer Text m r Source
Join FreeT
-delimited words into a text stream
Re-exports
Data.Text
re-exports the Text
type.
Pipes.Parse
re-exports input
, concat
, FreeT
(the type) and the Parse
synonym.
decodeSomeUtf8 :: ByteString -> (Text, ByteString) Source
A specific character encoding.
Codec | |
|
data TextException Source
module Data.ByteString
module Data.Text
module Data.Profunctor
module Data.Word
module Pipes.Parse
module Pipes.Group