iterIO-0.2: Iteratee-based IO with pipe operators

Data.IterIO.Iter

Contents

Synopsis

Base types

class Monoid t => ChunkData t whereSource

ChunkData is the class of data types that can be output by an enumerator and iterated on with an iteratee. A ChunkData type must be a Monoid, but must additionally provide a predicate, null, for testing whether an object is equal to mempty. Feeding a null chunk to an iteratee followed by any other chunk should have the same effect as just feeding the second chunk. To simplify debugging, there is an additional requirement that ChunkData be convertable to a String with the chunkShow method.

Note that because the Prelude contains a function null for lists, you may wish to include the import:

 import Prelude hiding (null)

Methods

null :: t -> BoolSource

chunkShow :: t -> StringSource

Instances

ChunkData () 
ChunkData ByteString 
ChunkData ByteString 
Show a => ChunkData [a] 
ChunkData t => ChunkData (Chunk t)

A Chunk is null when its data is null and its EOF flag is False.

data Chunk t Source

Chunk is a wrapper around a ChunkData type that also includes an EOF flag that is True if the data is followed by an end-of-file condition. An Iter that receives a Chunk with EOF True must return a result (or failure); it is an error to demand more data (return IterF) after an EOF.

Constructors

Chunk !t !Bool 

Instances

Functor Chunk 
Typeable1 Chunk 
Eq t => Eq (Chunk t) 
ChunkData t => Show (Chunk t) 
ChunkData t => Monoid (Chunk t) 
ChunkData t => ChunkData (Chunk t)

A Chunk is null when its data is null and its EOF flag is False.

chunk :: t -> Chunk tSource

Constructor function that builds a chunk containing data and a False EOF flag.

chunkEOF :: Monoid t => Chunk tSource

An chunk with mempty data and the EOF flag True.

newtype Iter t m a Source

The basic Iteratee type is Iter t m a, where t is the type of input (in class ChunkData), m is a monad in which the iteratee may execute actions (using the MonadTrans lift method), and a is the result type of the iteratee.

Internally, an Iter is a function from an input Chunk to a result of type IterR.

Constructors

Iter 

Fields

runIter :: Chunk t -> IterR t m a
 

Instances

(Error e, MonadError e m, ChunkData t) => MonadError e (Iter t m) 
(MonadReader r m, ChunkData t) => MonadReader r (Iter t m) 
(MonadState s m, ChunkData t) => MonadState s (Iter t m) 
(Monoid w, MonadWriter w m, ChunkData t) => MonadWriter w (Iter t m) 
MonadTrans (Iter t) 
Monad m => Monad (Iter t m) 
Monad m => Functor (Iter t m) 
(Typeable t, Typeable1 m) => Typeable1 (Iter t m) 
MonadIO m => MonadFix (Iter t m) 
(ChunkData t, Monad m) => MonadPlus (Iter t m) 
Monad m => Applicative (Iter t m) 
MonadIO m => MonadIO (Iter t m)

The Iter instance of MonadIO handles errors specially. If the lifted operation throws an exception, liftIO catches the exception and returns it as an IterFail failure. If the exception is an IOError satisfying isEOFError, then the exception is wrapped in the IterEOFErr constructor; otherwise, it is wrapped in IterException otherwise. This approach allows efficient testing for EOF errors without the need to invoke the expensive cast or fromException operations. (Yes liftIO uses these expensive operations, but Iters that invoke throwEOFI do not.)

One consequence of this exception handling is that with Iter, unlike with most monad transformers, liftIO is not equivalent to some number of nested calls to lift. See the documentation of .|$ for an example.

(ChunkData t, MonadCont m) => MonadCont (Iter t m) 
(Typeable t, Typeable1 m, Typeable a) => Typeable (Iter t m a) 

class (Typeable carg, Typeable cres) => CtlCmd carg cres | carg -> cresSource

Class of control commands for enclosing enumerators. The class binds each control argument type to a unique result type.

data CtlRes a Source

The outcome of an IterC request.

Constructors

CtlUnsupp

The request type was not supported by the enumerator.

CtlFail !SomeException

The request was supported, and executing it caused an exception to be thrown.

CtlDone !a

The result of the control operation.

Instances

data CtlArg t m a Source

Used when an Iter is issuing a control request to an enclosing enumerator. Note that unlike IterF or IterM, control requests expose the residual data, which is ordinarily fed right back to the continuation upon execution of the request. This allows certain control operations (such as seek and tell) to flush, check the length of, or adjust the residual data.

Constructors

forall carg cres . CtlCmd carg cres => CtlArg !carg (CtlRes cres -> Iter t m a) (Chunk t) 

data IterFail Source

Contains information about a failed Iter. Failures of type IterException must be caught by catchI (or tryI, etc.). However, any other type of failure is considered a parse error, and will be caught by multiParse, ifParse, and mplus.

Constructors

IterException !SomeException

An actual error occured that is not a parse error, EOF, etc.

IterExpected [(String, String)]

List of (input_seen, input_expected) pairs.

IterEOFErr IOError

An EOF error occurred, either in some IO action wrapped by liftIO, or in some Iter that called throwEOFI.

IterParseErr String

A miscellaneous parse error occured.

IterMzero

What you get from mzero. Useful if you don't want to specify any information about the failure.

data IterR t m a Source

An IterR is the result of feeding a chunk of data to an Iter. An IterR is in one of several states: it may require more input (IterF), it may wish to execute monadic actions in the transformed monad (IterM), it may have a control request for an enclosing enumerator (IterC), it may have produced a result (Done), or it may have failed (Fail).

Constructors

IterF !(Iter t m a)

The iteratee requires more input.

IterM !(m (IterR t m a))

The iteratee must execute monadic bind in monad m

IterC !(CtlArg t m a)

A control request (see CtlArg).

Done a (Chunk t)

Sufficient input was received; the Iter is returning a result of type a. In adition, the IterR has a Chunk containing any residual input that was not consumed in producing the result.

Fail !IterFail !(Maybe a) !(Maybe (Chunk t))

The Iter failed. If it was an enumerator, the target Iter that the enumerator was feeding likely has not failed, in which case its current state is returned in the Maybe a. If it makes sense to preserve the state of the input stream (which it does for most errors except parse errors), then the third parameter includes the residual Chunk at the time of the failure.

Instances

Monad m => Functor (IterR t m) 
ChunkData t => Show (IterR t m a) 

iterF :: ChunkData t => (Chunk t -> IterR t m a) -> Iter t m aSource

Builds an Iter that keeps requesting input until it receives a non-null Chunk. In other words, the Chunk fed to the argument function is guaranteed either to contain data or to have the EOF flag true (or both).

isIterActive :: IterR t m a -> BoolSource

True if an IterR is requesting something from an enumerator--i.e., the IterR is not Done or Fail.

iterShows :: (ChunkData t, Show a) => IterR t m a -> ShowSSource

Show the current state of an IterR, prepending it to some remaining input (the standard ShowS optimization), when a is in class Show. Note that if a is not in Show, you can simply use the shows function.

iterShow :: (ChunkData t, Show a) => IterR t m a -> StringSource

Show the current state of an Iter if type a is in the Show class. (Otherwise, you can simply use the ordinary show function.)

Execution

run :: (ChunkData t, Monad m) => Iter t m a -> m aSource

Feed an EOF to an Iter and return the result. Throws an exception if there has been a failure.

runI :: (ChunkData t1, ChunkData t2, Monad m) => Iter t1 m a -> Iter t2 m aSource

Runs an Iter from within a different Iter monad. If successful, runI iter will produce the same result as lift (run iter). However, if iter fails, run throws a language-level exception, which cannot be caught within other Iter monads. By contrast, runI throws a monadic exception that can be caught. In short, use runI in preference to run in situations where both are applicable. See a more detailed discussion of the same issue with examples in the documentation for .|$ in Data.IterIO.Inum.

Exception types

mkIterEOF :: String -> IterFailSource

Make an IterEOFErr from a String.

data IterCUnsupp Source

Exception thrown by CtlI when the type of the control request is not supported by the enclosing enumerator.

Constructors

forall carg cres . CtlCmd carg cres => IterCUnsupp carg 

Exception-related functions

throwI :: Exception e => e -> Iter t m aSource

Throw an exception from an Iteratee. The exception will be propagated properly through nested Iteratees, which will allow it to be categorized properly and avoid situations in which resources such as file handles are not released. (Most iteratee code does not assume the Monad parameter m is in the MonadIO class, and hence cannot use catch or onException to clean up after exceptions.) Use throwI in preference to throw whenever possible.

Do not use throwI to throw parse errors or EOF errors. Use throwEOFI and throwParseI instead. For performance reasons, the IterFail type segregates EOF and parse errors from other types of failures.

throwEOFI :: String -> Iter t m aSource

Throw an exception of type IterEOF. This will be interpreted by mkInum as an end of file chunk when thrown by the codec. It will also be interpreted by ifParse and multiParse as parsing failure. If not caught within the Iter monad, the exception will be rethrown by run (and hence |$) as an IOError of type EOF.

throwParseI :: String -> Iter t m aSource

Throw a miscellaneous parse error (after which input is assumed to be unsynchronized and thus is discarded). Parse errors may be caught as exception type IterFail, but they can also be caught more efficiently by the functions multiParse, ifParse, and mplus.

catchISource

Arguments

:: (Exception e, ChunkData t, Monad m) 
=> Iter t m a

Iter that might throw an exception

-> (e -> IterR t m a -> Iter t m a)

Exception handler, which gets as arguments both the exception and the failing Iter state.

-> Iter t m a 

Catch an exception thrown by an Iter, including exceptions thrown by any Inums fused to the Iter (or applied to it with .|$). If you wish to catch just errors thrown within Inums, see the function inumCatch in Data.IterIO.Inum.

On exceptions, catchI invokes a handler passing it both the exception thrown and the state of the failing IterR, which may contain more information than just the exception. In particular, if the exception occured in an Inum, the returned IterR will also contain the IterR being fed by that Inum, which likely will not have failed. To avoid discarding this extra information, you should not re-throw exceptions with throwI. Rather, you should re-throw an exception by re-executing the failed IterR with reRunIter. For example, a possible definition of onExceptionI is:

  onExceptionI iter cleanup =
      iter `catchI` \(SomeException _) r -> cleanup >> reRunIter r

Note that catchI only works for synchronous exceptions, such as IO errors (thrown within liftIO blocks), the monadic fail operation, and exceptions raised by throwI. It is not possible to catch asynchronous exceptions, such as lazily evaluated divide-by-zero errors, the throw function, or exceptions raised by other threads using throwTo if those exceptions might arrive anywhere outside of a liftIO call.

`catchI` has the default infix precedence (infixl 9 `catchI`), which binds more tightly than any concatenation or fusing operators.

catchPI :: (ChunkData t, Monad m) => Iter t m a -> (IterFail -> Iter t m a) -> Iter t m aSource

Like catchI, but catches only what are considered to be parse errors--that is, every constructor of IterFail except IterException.

tryI :: (ChunkData t, Monad m, Exception e) => Iter t m a -> Iter t m (Either (e, IterR t m a) a)Source

If an Iter succeeds and returns a, returns Right a. If the Iter fails and throws an exception e (of type e), returns Left (e, r) where r is the state of the failing Iter.

tryFI :: (ChunkData t, Monad m) => Iter t m a -> Iter t m (Either IterFail a)Source

A varient of tryI that returns the IterFail state rather than trying to match a particular exception.

tryRI :: (ChunkData t, Monad m) => Iter t m a -> Iter t m (Either (IterR t m a) a)Source

A version of tryI that catches all exceptions. Instead of returning the exception caught, it returns the failing IterR (from which you can extract the exception if you really want it). The main use of this is for doing some kind of clean-up action, then re-throwing the exception with reRunIter.

For example, the following is a possible implementation of finallyI:

 finallyI iter cleanup = do
   er <- tryRI iter
   cleanup
   either reRunIter return er

tryEOFI :: (ChunkData t, Monad m) => Iter t m a -> Iter t m (Maybe a)Source

A variant of tryI that just catches EOF errors. Returns Nothing after an EOF error, and Just the result otherwise. Should be much faster than trying to catch an EOF error of type Exception.

finallyI :: (ChunkData t, Monad m) => Iter t m a -> Iter t m b -> Iter t m aSource

Execute an Iter, then perform a cleanup action regardless of whether the Iter threw an exception or not. Analogous to the standard library function finally.

onExceptionI :: (ChunkData t, Monad m) => Iter t m a -> Iter t m b -> Iter t m aSource

Execute an Iter and perform a cleanup action if the Iter threw an exception. Analogous to the standard library function onException.

tryBI :: (ChunkData t, Monad m, Exception e) => Iter t m a -> Iter t m (Either e a)Source

Simlar to tryI, but saves all data that has been fed to the Iter, and rewinds the input if the Iter fails. (The B in tryBI stands for "backtracking".) Thus, if tryBI returns Left exception, the next Iter to be invoked will see the same input that caused the previous Iter to fail. (For this reason, it makes no sense ever to call resumeI on the Iter you get back from tryBI, which is why tryBI does not return the failing Iteratee the way tryI does.)

Because tryBI saves a copy of all input, it can consume a lot of memory and should only be used when the Iter argument is known to consume a bounded amount of data.

tryFBI :: (ChunkData t, Monad m) => Iter t m a -> Iter t m (Either IterFail a)Source

A variant of tryBI that, also rewinds input on failure, but returns the raw IterFail structure, rather than mapping it to a particular exception. This is much faster because it requires no dynamic casts. However, the same warning applies that tryFBI should not be applied to Iters that could take unbounded input.

ifParseSource

Arguments

:: (ChunkData t, Monad m) 
=> Iter t m a

Iteratee iter to run with backtracking

-> (a -> Iter t m b)

success function

-> Iter t m b

failure action

-> Iter t m b

result

ifParse iter success failure runs iter, but saves a copy of all input consumed using tryFBI. (This means iter must not consume unbounded amounts of input! See multiParse for such cases.) If iter succeeds, its result is passed to the function success. If iter throws a parse error (with throwParseI), throws an EOF error (with throwEOFI), or executes mzero, then failure is executed with the input re-wound (so that failure is fed the same input that iter was). If iter throws any other type of exception, ifParse passes the exception back and does not execute failure.

See Data.IterIO.Parse for a discussion of this function and the related infix operator \/ (which is a synonym for ifNoParse).

ifNoParse :: (ChunkData t, Monad m) => Iter t m a -> Iter t m b -> (a -> Iter t m b) -> Iter t m bSource

ifNoParse is just ifParse with the second and third arguments reversed.

multiParse :: (ChunkData t, Monad m) => Iter t m a -> Iter t m a -> Iter t m aSource

Try two Iteratees and return the result of executing the second if the first one throws a parse, EOF, or mzero error. Note that Data.IterIO.Parse defines <|> as an infix synonym for this function.

The statement multiParse a b is similar to ifParse a return b, but the two functions operate differently. Depending on the situation, only one of the two formulations may be correct. Specifically:

  • ifParse a f b works by first executing a, saving a copy of all input consumed by a. If a throws a parse error, the saved input is used to backtrack and execute b on the same input that a just rejected. If a succeeds, b is never run; a's result is fed to f, and the resulting action is executed without backtracking (so any error thrown within f will not be caught by this ifParse expression).
  • Instead of saving input, multiParse a b executes both a and b concurrently as input chunks arrive. If a throws a parse error, then the result of executing b is returned. If a either succeeds or throws an exception that is not a parse errorEOFmzero, then the result of running a is returned.
  • With multiParse a b, if b returns a value, executes a monadic action via lift, or issues a control request via ctlI, then further processing of b will be suspended until a experiences a parse error, and thus the behavior will be equivalent to ifParse a return b.

The main restriction on ifParse is that a must not consume unbounded amounts of input, or the program may exhaust memory saving the input for backtracking. Note that the second argument to ifParse (i.e., return in ifParse a return b) is a continuation for a when a succeeds.

The advantage of multiParse is that it can avoid storing unbounded amounts of input for backtracking purposes if both Iters consume data. Another advantage is that with an expression such as ifParse a f b, sometimes it is not convenient to break the parse target into an action to execute with backtracking (a) and a continuation to execute without backtracking (f). The equivalent multiParse (a >>= f) b avoids the need to do this, since it does not do backtracking.

However, it is important to note that it is still possible to end up storing unbounded amounts of input with multiParse. For example, consider the following code:

 total :: (Monad m) => Iter String m Int
 total = multiParse parseAndSumIntegerList (return -1) -- Bad

Here the intent is for parseAndSumIntegerList to parse a (possibly huge) list of integers and return their sum. If there is a parse error at any point in the input, then the result is identical to having defined total = return -1. But return -1 succeeds immediately, consuming no input, which means that total must return all left-over input for the next action (i.e., next in total >>= next). Since total has to look arbitrarily far into the input to determine that parseAndSumIntegerList fails, in practice total will have to save all input until it knows that parseAndSumIntegerList succeeds.

A better approach might be:

   total = multiParse parseAndSumIntegerList (nullI >> return -1)

Here nullI discards all input until an EOF is encountered, so there is no need to keep a copy of the input around. This makes sense so long as total is the last or only Iteratee run on the input stream. (Otherwise, nullI would have to be replaced with an Iteratee that discards input up to some end-of-list marker.)

Another approach might be to avoid parsing combinators entirely and use:

   total = parseAndSumIntegerList `catchPI` handler
       where handler _ = return -1

This last definition of total may leave the input in some partially consumed state. This is fine so long as total is the last Iter executed on the input stream. Otherwise, before throwing the parse error, parseAndSumIntegerList would need to ensure the input is at some reasonable boundary point for whatever comes next. (The ungetI function is sometimes helpful for this purpose.)

Some basic Iters

nullI :: (Monad m, ChunkData t) => Iter t m ()Source

Sinks data like /dev/null, returning () on EOF.

data0I :: ChunkData t => Iter t m tSource

Returns a non-empty amount of input data if there is any input left. Returns mempty on an EOF condition.

dataI :: ChunkData t => Iter t m tSource

Like data0I, but always returns non-empty data. Throws an exception on an EOF condition.

pureI :: (Monad m, ChunkData t) => Iter t m tSource

A variant of data0I that reads the whole input up to an EOF and returns it.

chunkI :: (Monad m, ChunkData t) => Iter t m (Chunk t)Source

Returns the next Chunk that either contains non-null data or has the EOF bit set.

someI :: (ChunkData tOut, Monad m) => Iter tIn m tOut -> Iter tIn m tOutSource

Run an Iter returning data of class ChunkData and throw an EOF exception if the data is null. (Note that this is different from the some method of the Alternative class in Control.Applicative, which executes a computation one or more times. The iterIO library does not use Alternative, in part because Alternative's <|> operator has left rather than right fixity, which would make parsing less efficient. See Data.IterIO.Parse for information about iterIO's <|> operator.)

whileNullI :: (ChunkData tIn, ChunkData tOut, Monad m) => Iter tIn m tOut -> Iter tIn m tOutSource

Keep running an Iter until either its output is not null or we have reached EOF. Return the the Iter's value on the last (i.e., usually non-null) iteration.

peekI :: (ChunkData t, Monad m) => Iter t m a -> Iter t m aSource

Runs an Iter then rewinds the input state, so that the effect is to parse lookahead data. (See tryBI if you want to rewind the input only when the Iter fails.)

atEOFI :: (Monad m, ChunkData t) => Iter t m BoolSource

Does not actually consume any input, but returns True if there is no more input data to be had.

ungetI :: ChunkData t => t -> Iter t m ()Source

Place data back onto the input stream, where it will be the next data consumed by subsequent Iters.

safeCtlI :: (CtlCmd carg cres, Monad m) => carg -> Iter t m (CtlRes cres)Source

Issue a control request. Returns CtlUnsupp if the request type is unsupported. Otherwise, returns CtlDone with the result if the request succeeds, or return CtlFail if the request type is supported but attempting to execute the request caused an exception.

ctlI :: (CtlCmd carg cres, ChunkData t, Monad m) => carg -> Iter t m cresSource

Issue a control request and return the result. Throws an exception of type IterCUnsupp if the operation type was not supported by an enclosing enumerator.

Internal functions

onDone :: Monad m => (IterR t m a -> IterR t m b) -> Iter t m a -> Iter t m bSource

Run an Iter until it enters the Done or Fail state, then use a function to transform the IterR.

fmapI :: Monad m => (a -> b) -> Iter t m a -> Iter t m bSource

fmapI is like liftM, but differs in one important respect: it preserves the failed result of an enumerator (and in fact applies the function to the non-failed target Iter state). By contrast, liftM, which is equivalent to liftM f i = i >>= return . f, transforms the Maybe a component of all Fail states to Nothing because of its use of >>=.

onDoneR :: Monad m => (IterR t m a -> IterR t m b) -> IterR t m a -> IterR t m bSource

The equivalent of onDone for IterRs.

stepRSource

Arguments

:: Monad m 
=> IterR t m a

The Iter that needs to be stepped

-> (IterR t m a -> IterR t m b)

Function to pass the Iter to after stepping it.

-> IterR t m b

Fallback if the Iter can no longer be stepped

-> IterR t m b 

Step an active IterR (i.e., one in the IterF, IterM, or IterC state) to its next state, and pass the result through a function.

stepR'Source

Arguments

:: IterR t m1 a

The IterR that needs to be stepped.

-> (IterR t m1 a -> IterR t m2 b)

Transformation function if the IterR is in the IterF or IterC state.

-> IterR t m2 b

Fallback if the IterR is no longer active.

-> IterR t m2 b 

A variant of stepR that only works for the IterF and IterC states, not the IterM state. (Because of this additional restriction, the input and output Monad types m1 and m2 do not need to be the same.)

runR :: (ChunkData t1, ChunkData t2, Monad m) => IterR t1 m a -> IterR t2 m aSource

The equivalent for runI for IterRs.

fmapR :: (a -> b) -> IterR t m1 a -> IterR t m2 bSource

Maps the result of an IterR like fmap, but only if the IterR is no longer active. It is an error to call this function on an IterR in the IterF, IterM, or IterC state. Because of this restriction, fmapR does not require the input and output Monad types (m1 and m2) to be the same.

reRunIter :: (ChunkData t, Monad m) => IterR t m a -> Iter t m aSource

Turn an IterR back into an Iter.

runIterR :: (ChunkData t, Monad m) => IterR t m a -> Chunk t -> IterR t m aSource

Feed more input to an Iter that has already been run (and hence is already an IterR). In the event that the IterR is requesting more input (i.e., is in the IterF state), this is straight forward. However, if the Iter is in some other state such as IterM, this function needs to save the input until such time as the IterR is stepped to a new state (e.g., with stepR or reRunIter).

getResid :: ChunkData t => IterR t m a -> Chunk tSource

Get the residual data for an IterR that is in no longer active or that is in the IterC state. (It is an error to call this function on an IterR in the IterF or IterM state.)

setResid :: IterR t1 m1 a -> Chunk t2 -> IterR t2 m2 aSource

Set residual data for an IterR that is not active. (It is an error to call this on an IterR in the Done, IterM, or IterC states.)