io-streams-1.5.1.0: Simple, composable, and easy-to-use stream I/O

Safe Haskell: Safe
Language: Haskell2010

System.IO.Streams.Tutorial

    Introduction

    The io-streams package defines two "smart handles" for stream processing:

    The InputStream type implements all the core operations we expect for a read-only handle. We consume values using read, which returns a Nothing when the resource is done:

    read :: InputStream c -> IO (Maybe c)
    

    ... and we can push back values using unRead:

    unRead :: c -> InputStream c -> IO ()
    

    The OutputStream type implements the write operation which feeds it output, supplying Nothing to signal resource exhaustion:

    write :: Maybe c -> OutputStream c -> IO ()
    

    These streams slightly resemble Haskell Handles, but support a wider range of sources and sinks. For example, you can convert an ordinary list to an InputStream source and interact with it using the handle-based API:

    ghci> import qualified System.IO.Streams as S
    ghci> listHandle <- S.fromList [1, 2]
    ghci> S.read listHandle
    Just 1
    ghci> S.read listHandle
    Just 2
    ghci> S.read listHandle
    Nothing
    

    Additionally, IO streams come with a library of stream transformations that preserve their handle-like API. For example, you can map a function over an InputStream, which generates a new handle to the same stream that returns transformed values:

    ghci> oldHandle <- S.fromList [1, 2, 3]
    ghci> newHandle <- S.mapM (\x -> return (x * 10)) oldHandle
    ghci> S.read newHandle
    Just 10
    ghci> -- We can still view the stream through the old handle
    ghci> S.read oldHandle
    Just 2
    ghci> -- ... and switch back again
    ghci> S.read newHandle
    Just 30
    

    IO streams focus on preserving the conventions of traditional handles while offering a wider library of stream-processing utilities.

    Build Input Streams

    The io-streams library provides a simple interface for creating your own InputStreams and OutputStreams.

    You can build an InputStream from any IO action that generates output, as long as it wraps results in Just and uses Nothing to signal EOF:

    makeInputStream :: IO (Maybe a) -> IO (InputStream a)
    

    As an example, let's wrap an ordinary read-only Handle in an InputStream:

    import Data.ByteString (ByteString)
    import qualified Data.ByteString as S
    import System.IO.Streams (InputStream)
    import qualified System.IO.Streams as Streams
    import System.IO (Handle)
    
    -- Maximum number of bytes to request per read
    bUFSIZ :: Int
    bUFSIZ = 32752
    
    upgradeReadOnlyHandle :: Handle -> IO (InputStream ByteString)
    upgradeReadOnlyHandle h = Streams.makeInputStream f
      where
        f = do
            x <- S.hGetSome h bUFSIZ
            return $! if S.null x then Nothing else Just x
    

    We didn't even really need to write the upgradeReadOnlyHandle function, because System.IO.Streams.Handle already provides one that uses the exact same implementation given above:

    handleToInputStream :: Handle -> IO (InputStream ByteString)
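
    The same recipe works for sources that are not handles at all. The following is a minimal sketch, assuming a hypothetical helper named countdown (it is not part of io-streams) that keeps its state in an IORef and reports EOF once the counter reaches zero:

```haskell
import Data.IORef (atomicModifyIORef', newIORef)
import System.IO.Streams (InputStream)
import qualified System.IO.Streams as Streams

-- | Hypothetical helper (not part of io-streams): yields n, n-1, ..., 1
-- and then reports end-of-stream.
countdown :: Int -> IO (InputStream Int)
countdown n = do
    ref <- newIORef n
    Streams.makeInputStream (atomicModifyIORef' ref step)
  where
    -- Return the new counter state together with the value to emit.
    step k
        | k <= 0    = (k, Nothing)       -- exhausted: signal EOF
        | otherwise = (k - 1, Just k)    -- emit k, remember k - 1

-- | Drain the stream into a list to inspect what it produced.
demoCountdown :: IO [Int]
demoCountdown = countdown 3 >>= Streams.toList
```

    Here demoCountdown should collect the values 3, 2, 1 in that order, since toList reads until the first Nothing.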
    

    Build Output Streams

    Similarly, you can build any OutputStream from an IO action that accepts input, as long as it interprets Just as more input and Nothing as EOF:

    makeOutputStream :: (Maybe a -> IO ()) -> IO (OutputStream a)
    

    A simple OutputStream might wrap putStrLn for ByteStrings:

    import Data.ByteString (ByteString)
    import qualified Data.ByteString.Char8 as S  -- Char8 exports putStrLn
    import System.IO.Streams (OutputStream)
    import qualified System.IO.Streams as Streams
    
    writeConsole :: IO (OutputStream ByteString)
    writeConsole = Streams.makeOutputStream $ \m -> case m of
        Just bs -> S.putStrLn bs
        Nothing -> return ()
    

    The Just wraps more incoming data, whereas Nothing indicates the data is exhausted. In principle, you can feed OutputStreams more input after writing a Nothing to them, but IO streams only guarantee a well-defined behavior up to the first Nothing. After receiving the first Nothing, an OutputStream could respond to additional input by:

    • Using the input
    • Ignoring the input
    • Throwing an exception

    Ideally, you should adhere to well-defined behavior and ensure that after you write a Nothing to an OutputStream, you don't write anything else.
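
    As a further illustration, here is a sketch of an OutputStream that does not print anything but accumulates its input. The names summingStream and demoSum are hypothetical and only serve this example; the final write of Nothing is the last write, in line with the advice above:

```haskell
import Data.IORef (IORef, modifyIORef', newIORef, readIORef)
import System.IO.Streams (OutputStream)
import qualified System.IO.Streams as Streams

-- | Hypothetical sink: adds every value it receives to the given IORef.
summingStream :: IORef Int -> IO (OutputStream Int)
summingStream ref = Streams.makeOutputStream $ \m -> case m of
    Just x  -> modifyIORef' ref (+ x)   -- more input: accumulate it
    Nothing -> return ()                -- end of input: nothing to clean up

-- | Feed a few values by hand, then signal the end of the input.
demoSum :: IO Int
demoSum = do
    ref <- newIORef 0
    os  <- summingStream ref
    Streams.write (Just 1) os
    Streams.write (Just 2) os
    Streams.write Nothing  os   -- no further writes after this point
    readIORef ref
```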

    Connect Streams

    io-streams provides two ways to connect an InputStream and OutputStream:

    connect :: InputStream a -> OutputStream a -> IO ()
    supply  :: InputStream a -> OutputStream a -> IO ()
    

    connect feeds the OutputStream exclusively with the given InputStream and passes along the end-of-stream notification to the OutputStream.

    supply feeds the OutputStream non-exclusively with the given InputStream and does not pass along the end-of-stream notification to the OutputStream.

    You can combine both supply and connect to feed multiple InputStreams into a single OutputStream:

    import qualified System.IO.Streams as Streams
    
    main :: IO ()
    main =
        Streams.withFileAsOutput "out.txt" $ \outStream ->
        Streams.withFileAsInput  "in1.txt" $ \inStream1 ->
        Streams.withFileAsInput  "in2.txt" $ \inStream2 ->
        Streams.withFileAsInput  "in3.txt" $ \inStream3 -> do
            -- supply leaves outStream open for further input ...
            Streams.supply  inStream1 outStream
            Streams.supply  inStream2 outStream
            -- ... while connect also forwards the end-of-stream signal
            Streams.connect inStream3 outStream
    

    The final connect seals the OutputStream when the final InputStream terminates.
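
    The same pattern can be tried without touching the file system. This sketch assumes in-memory streams built with fromList and listOutputStream; demoConcat is a hypothetical name used only here:

```haskell
import qualified System.IO.Streams as Streams

-- | Concatenate two in-memory input streams into one output stream.
demoConcat :: IO [Int]
demoConcat = do
    is1 <- Streams.fromList [1, 2]
    is2 <- Streams.fromList [3, 4]
    -- listOutputStream returns a sink plus an action that fetches
    -- everything written to the sink so far.
    (os, getList) <- Streams.listOutputStream
    Streams.supply  is1 os   -- copies 1 and 2, leaves os open
    Streams.connect is2 os   -- copies 3 and 4, then writes Nothing
    getList
```

    Running demoConcat should yield the elements of the first list followed by those of the second.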

    Keep in mind that you do not need to use connect or supply at all: io-streams mainly provides them for user convenience. You can always build your own abstractions on top of the read and write operations.
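
    To make that concrete, here is a sketch of a copy loop written only in terms of read and write. The name copyAll is hypothetical; it is an illustration of the idea and not the library's actual implementation of connect:

```haskell
import System.IO.Streams (InputStream, OutputStream)
import qualified System.IO.Streams as Streams

-- | Hypothetical hand-written copy loop built from 'read' and 'write'.
copyAll :: InputStream a -> OutputStream a -> IO ()
copyAll is os = loop
  where
    loop = do
        m <- Streams.read is
        Streams.write m os      -- forwards Just values and the final Nothing
        case m of
            Nothing -> return ()   -- input exhausted: stop
            Just _  -> loop        -- keep copying

-- | Copy an in-memory stream of characters into a list sink.
demoCopy :: IO String
demoCopy = do
    is <- Streams.fromList "abc"
    (os, getList) <- Streams.listOutputStream
    copyAll is os
    getList
```

    Dropping the write of Nothing from the loop would give a variant that leaves the OutputStream open, in the spirit of supply.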

    Transform Streams

    When we build or use IO streams we can tap into all the stream-processing features the io-streams library provides. For example, we can decompress any InputStream of ByteStrings:

    import Control.Monad ((>=>))
    import Data.ByteString (ByteString)
    import System.IO (Handle)
    import System.IO.Streams (InputStream, OutputStream)
    import qualified System.IO.Streams as Streams
    import qualified System.IO.Streams.File as Streams
    
    unzipHandle :: Handle -> IO (InputStream ByteString)
    unzipHandle = Streams.handleToInputStream >=> Streams.decompress
    

    ... or we can guard it against a denial-of-service attack:

    protectHandle :: Handle -> IO (InputStream ByteString)
    protectHandle =
        Streams.handleToInputStream >=> Streams.throwIfProducesMoreThan 1000000
    

    io-streams provides many useful functions such as these in its standard library, and you can take advantage of them by defining IO streams that wrap your resources.
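
    Transformations compose in the same way for streams of any element type. The sketch below assumes an in-memory stream of Ints and chains filter and map with (>=>); the names evensTimesTen and demoTransform are hypothetical:

```haskell
import Control.Monad ((>=>))
import System.IO.Streams (InputStream)
import qualified System.IO.Streams as Streams

-- | Keep the even elements, then multiply each of them by ten.
evensTimesTen :: InputStream Int -> IO (InputStream Int)
evensTimesTen = Streams.filter even >=> Streams.map (* 10)

-- | Apply the transformation to a small list-backed stream.
demoTransform :: IO [Int]
demoTransform =
    Streams.fromList [1 .. 6] >>= evensTimesTen >>= Streams.toList
```

    Because each step has the shape InputStream a -> IO (InputStream b), further stages can be appended with another (>=>).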

    Resource and Exception Safety

    IO streams use standard Haskell idioms for resource safety. Since all operations occur in the IO monad, you can use catch, bracket, or various "with..." functions to guard any read or write without any special considerations:

    import qualified Data.ByteString.Char8 as S  -- Char8 exports putStrLn
    import System.IO
    import System.IO.Streams (InputStream, OutputStream)
    import qualified System.IO.Streams as Streams
    import qualified System.IO.Streams.File as Streams
    
    main =
        withFile "test.txt" ReadMode $ \handle -> do
            stream <- Streams.handleToInputStream handle
            mBytes <- Streams.read stream
            case mBytes of
                Just bytes -> S.putStrLn bytes
                Nothing    -> putStrLn "EOF"
    

    However, you can also simplify the above example by using the convenience function withFileAsInput from System.IO.Streams.File:

    withFileAsInput
     :: FilePath -> (InputStream ByteString -> IO a) -> IO a
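
    A sketch of how the earlier example might shrink with this function follows. The helper firstChunk and the file name io-streams-demo.txt are assumptions made for the illustration; the demo writes the file first so that it has something to read:

```haskell
import Data.ByteString (ByteString)
import qualified Data.ByteString.Char8 as S
import qualified System.IO.Streams as Streams

-- | Hypothetical helper: return the first chunk of a file, or Nothing
-- if the file is empty. The file is closed when the callback returns.
firstChunk :: FilePath -> IO (Maybe ByteString)
firstChunk path = Streams.withFileAsInput path Streams.read

-- | Write a small file, then read its first chunk back.
demoFirstChunk :: IO (Maybe ByteString)
demoFirstChunk = do
    S.writeFile "io-streams-demo.txt" (S.pack "hello")
    firstChunk "io-streams-demo.txt"
```

    The chunk is a strict ByteString, so it remains valid after withFileAsInput has closed the file.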
    

    Pushback

    All InputStreams support pushback, which simplifies many types of operations. For example, we can peek at an InputStream by combining read and unRead:

    peek :: InputStream c -> IO (Maybe c)
    peek s = do
        x <- Streams.read s
        case x of
            Nothing -> return ()
            Just c  -> Streams.unRead c s
        return x
    

    ... although System.IO.Streams already exports the above function.

    InputStreams can customize their pushback behavior to support more sophisticated use cases. For example, if you protect a stream using throwIfProducesMoreThan and then unRead input, the stream subtracts the unread input from its running byte count. However, these extra features will not interfere with the basic pushback contract, given by the following law:

    unRead c stream >> read stream == return (Just c)
    

    When you build an InputStream using makeInputStream, it supplies the default pushback behavior which just saves the input for the next read call. More advanced users can use System.IO.Streams.Internal to customize their own pushback routines.
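
    The pushback law can be observed on a list-backed stream. This is a small sketch with a hypothetical name, demoPushback; note that the value pushed back does not have to be one that was read from the stream:

```haskell
import qualified System.IO.Streams as Streams

-- | Read one element, push back a different one, and look at what
-- the stream yields afterwards.
demoPushback :: IO (Maybe Int, [Int])
demoPushback = do
    is   <- Streams.fromList [1, 2, 3]
    _    <- Streams.read is      -- consumes 1
    Streams.unRead 0 is          -- push back a value of our choosing
    x    <- Streams.read is      -- by the law, this is Just 0
    rest <- Streams.toList is    -- the remaining original elements
    return (x, rest)
```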

    Thread Safety

    IO stream operations are not thread-safe by default for performance reasons. However, you can transform an existing IO stream into a thread-safe one using the provided locking functions:

    lockingInputStream  :: InputStream  a -> IO (InputStream  a)
    lockingOutputStream :: OutputStream a -> IO (OutputStream a)
    

    These functions do not prevent access to the original IO stream, so take care not to keep or share a reference to the original, unlocked stream.
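
    The following sketch shows one way to share a locked OutputStream between threads. It assumes a hypothetical sink that updates an IORef with a plain, non-atomic modification, which is why every writer goes through the locked stream; demoLocked and the thread counts are made up for the example:

```haskell
import Control.Concurrent (forkIO)
import Control.Concurrent.MVar (newEmptyMVar, putMVar, takeMVar)
import Control.Monad (forM_, replicateM_)
import Data.IORef (modifyIORef', newIORef, readIORef)
import qualified System.IO.Streams as Streams

-- | Four threads each write the value 1 a hundred times.
demoLocked :: IO Int
demoLocked = do
    ref <- newIORef 0
    raw <- Streams.makeOutputStream $ \m -> case m of
        Just x  -> modifyIORef' ref (+ x)   -- not atomic on its own
        Nothing -> return ()
    -- From here on only the locked stream is used; raw is not shared.
    os   <- Streams.lockingOutputStream raw
    done <- newEmptyMVar
    forM_ [1 .. 4 :: Int] $ \_ -> forkIO $ do
        replicateM_ 100 (Streams.write (Just 1) os)
        putMVar done ()                     -- tell the main thread we finished
    replicateM_ 4 (takeMVar done)           -- wait for all four writers
    readIORef ref
```

    Since the lock serializes the writes, no update should be lost and the total should equal the number of writes.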

    Examples

    The following examples show how to use the standard library to implement traditional command-line utilities:

    {-# LANGUAGE OverloadedStrings #-}
    
    import Control.Monad ((>=>))
    import qualified Data.ByteString.Char8 as S
    import Data.Int (Int64)
    import Data.Monoid ((<>))
    import System.IO.Streams (InputStream)
    import qualified System.IO.Streams as Streams
    import System.IO
    import Prelude hiding (head)
    
    cat :: FilePath -> IO ()
    cat file = withFile file ReadMode $ \h -> do
        is <- Streams.handleToInputStream h
        Streams.connect is Streams.stdout
    
    grep :: S.ByteString -> FilePath -> IO ()
    grep pattern file = withFile file ReadMode $ \h -> do
        is <- Streams.handleToInputStream h >>=
              Streams.lines                 >>=
              Streams.filter (S.isInfixOf pattern)
        os <- Streams.unlines Streams.stdout
        Streams.connect is os
    
    data Option = Bytes | Words | Lines
    
    len :: InputStream a -> IO Int64
    len = Streams.fold (\n _ -> n + 1) 0
    
    wc :: Option -> FilePath -> IO ()
    wc opt file = withFile file ReadMode $
        Streams.handleToInputStream >=> count >=> print
      where
        count = case opt of
            Bytes -> \is -> do
                (is', cnt) <- Streams.countInput is
                Streams.skipToEof is'
                cnt
            Words -> Streams.words >=> len
            Lines -> Streams.lines >=> len
    
    nl :: FilePath -> IO ()
    nl file = withFile file ReadMode $ \h -> do
        nats <- Streams.fromList [1..]
        ls   <- Streams.handleToInputStream h >>= Streams.lines
        is   <- Streams.zipWith
                    (\n bs -> S.pack (show n) <> " " <> bs)
                    nats
                    ls
        os   <- Streams.unlines Streams.stdout
        Streams.connect is os
    
    head :: Int64 -> FilePath -> IO ()
    head n file = withFile file ReadMode $ \h -> do
        is <- Streams.handleToInputStream h >>= Streams.lines >>= Streams.take n
        os <- Streams.unlines Streams.stdout
        Streams.connect is os
    
    paste :: FilePath -> FilePath -> IO ()
    paste file1 file2 =
        withFile file1 ReadMode $ \h1 ->
        withFile file2 ReadMode $ \h2 -> do
        is1 <- Streams.handleToInputStream h1 >>= Streams.lines
        is2 <- Streams.handleToInputStream h2 >>= Streams.lines
        isT <- Streams.zipWith (\l1 l2 -> l1 <> "\t" <> l2) is1 is2
        os  <- Streams.unlines Streams.stdout
        Streams.connect isT os
    
    yes :: IO ()
    yes = do
        is <- Streams.fromList (repeat "y")
        os <- Streams.unlines Streams.stdout
        Streams.connect is os