streamly-core-0.2.0: Streaming, parsers, arrays and more
Copyright(c) 2023 Composewell Technologies
LicenseBSD-3-Clause
Maintainerstreamly@composewell.com
Stabilitypre-release
PortabilityGHC
Safe HaskellSafe-Inferred
LanguageHaskell2010

Streamly.Data.ParserK

Description

See the general notes about parsing in the Streamly.Data.Parser module. This module implements a using Continuation Passing Style (CPS) wrapper over the Streamly.Data.Parser module. It is as fast or faster than attoparsec.

Parser vs ParserK

ParserK is preferred over Parser when extensive applicative, alternative and monadic composition is required, or when recursive or dynamic composition of parsers is required. The Parser type fuses statically and creates efficient loops whereas ParserK uses function call based composition and has comparatively larger runtime overhead but it is better suited to the specific use cases mentioned above. ParserK also allows to efficient parse a stream of arrays, it can also break the input stream into a parse result and remaining stream so that the stream can be parsed independently in segments.

Using ParserK

All the parsers from the Streamly.Data.Parser module can be adapted to ParserK using the adaptC, adapt, and adaptCG combinators.

parseChunks runs a parser on a stream of unboxed arrays, this is the preferred and most efficient way to parse chunked input. The more general parseBreakChunks function returns the remaining stream as well along with the parse result. There are parseChunksGeneric, parseBreakChunksGeneric as well to run parsers on boxed arrays. parse, parseBreak run parsers on a stream of individual elements instead of stream of arrays.

Monadic Composition

Monad composition can be used for lookbehind parsers, we can dynamically compose new parsers based on the results of the previously parsed values.

If we have to parse "a9" or "9a" but not "99" or "aa" we can use the following non-monadic, backtracking parser:

>>> digits p1 p2 = ((:) <$> p1 <*> ((:) <$> p2 <*> pure []))
>>> :{
backtracking :: Monad m => ParserK Char m String
backtracking = ParserK.adapt $
    digits (Parser.satisfy isDigit) (Parser.satisfy isAlpha)
    <|>
    digits (Parser.satisfy isAlpha) (Parser.satisfy isDigit)
:}

We know that if the first parse resulted in a digit at the first place then the second parse is going to fail. However, we waste that information and parse the first character again in the second parse only to know that it is not an alphabetic char. By using lookbehind in a Monad composition we can avoid redundant work:

>>> data DigitOrAlpha = Digit Char | Alpha Char
>>> :{
lookbehind :: Monad m => ParserK Char m String
lookbehind = do
    x1 <- ParserK.adapt $
             Digit <$> Parser.satisfy isDigit
         <|> Alpha <$> Parser.satisfy isAlpha
    -- Note: the parse depends on what we parsed already
    x2 <- ParserK.adapt $
          case x1 of
             Digit _ -> Parser.satisfy isAlpha
             Alpha _ -> Parser.satisfy isDigit
    return $ case x1 of
        Digit x -> [x,x2]
        Alpha x -> [x,x2]
:}

Experimental APIs

Please refer to Streamly.Internal.Data.ParserK for functions that have not yet been released.

Synopsis

Setup

To execute the code examples provided in this module in ghci, please run the following commands first.

>>> :m
>>> import Control.Applicative ((<|>))
>>> import Data.Char (isDigit, isAlpha)
>>> import Streamly.Data.Parser (Parser)
>>> import Streamly.Data.ParserK (ParserK)
>>> import qualified Streamly.Data.Parser as Parser
>>> import qualified Streamly.Data.ParserK as ParserK

For APIs that have not been released yet.

>>> import qualified Streamly.Internal.Data.ParserK as ParserK

Parser Type

data ParserK a m b Source #

A continuation passing style parser representation. A continuation of Steps, each step passes a state and a parse result to the next Step. The resulting Step may carry a continuation that consumes input a and results in another Step. Essentially, the continuation may either consume input without a result or return a result with no further input to be consumed.

Instances

Instances details
Monad m => MonadFail (ParserK a m) Source # 
Instance details

Defined in Streamly.Internal.Data.ParserK.Type

Methods

fail :: String -> ParserK a m a0 #

MonadIO m => MonadIO (ParserK a m) Source # 
Instance details

Defined in Streamly.Internal.Data.ParserK.Type

Methods

liftIO :: IO a0 -> ParserK a m a0 #

Monad m => Alternative (ParserK a m) Source #

p1 <|> p2 passes the input to parser p1, if it succeeds, the result is returned. However, if p1 fails, the parser driver backtracks and tries the same input on the alternative parser p2, returning the result if it succeeds.

Instance details

Defined in Streamly.Internal.Data.ParserK.Type

Methods

empty :: ParserK a m a0 #

(<|>) :: ParserK a m a0 -> ParserK a m a0 -> ParserK a m a0 #

some :: ParserK a m a0 -> ParserK a m [a0] #

many :: ParserK a m a0 -> ParserK a m [a0] #

Monad m => Applicative (ParserK a m) Source #

f <$> p1 <*> p2 applies parsers p1 and p2 sequentially to an input stream. The first parser runs and processes the input, the remaining input is then passed to the second parser. If both parsers succeed, their outputs are applied to the function f. If either parser fails, the operation fails.

Instance details

Defined in Streamly.Internal.Data.ParserK.Type

Methods

pure :: a0 -> ParserK a m a0 #

(<*>) :: ParserK a m (a0 -> b) -> ParserK a m a0 -> ParserK a m b #

liftA2 :: (a0 -> b -> c) -> ParserK a m a0 -> ParserK a m b -> ParserK a m c #

(*>) :: ParserK a m a0 -> ParserK a m b -> ParserK a m b #

(<*) :: ParserK a m a0 -> ParserK a m b -> ParserK a m a0 #

Functor m => Functor (ParserK a m) Source #

Map a function on the result i.e. on b in Parser a m b.

Instance details

Defined in Streamly.Internal.Data.ParserK.Type

Methods

fmap :: (a0 -> b) -> ParserK a m a0 -> ParserK a m b #

(<$) :: a0 -> ParserK a m b -> ParserK a m a0 #

Monad m => Monad (ParserK a m) Source #

Monad composition can be used for lookbehind parsers, we can dynamically compose new parsers based on the results of the previously parsed values.

Instance details

Defined in Streamly.Internal.Data.ParserK.Type

Methods

(>>=) :: ParserK a m a0 -> (a0 -> ParserK a m b) -> ParserK a m b #

(>>) :: ParserK a m a0 -> ParserK a m b -> ParserK a m b #

return :: a0 -> ParserK a m a0 #

Monad m => MonadPlus (ParserK a m) Source #

mzero is same as empty, it aborts the parser. mplus is same as <|>, it selects the first succeeding parser.

Instance details

Defined in Streamly.Internal.Data.ParserK.Type

Methods

mzero :: ParserK a m a0 #

mplus :: ParserK a m a0 -> ParserK a m a0 -> ParserK a m a0 #

Parsers

Conversions

adapt :: Monad m => Parser a m b -> ParserK a m b Source #

Convert a Parser to ParserK.

Pre-release

adaptC :: (Monad m, Unbox a) => Parser a m b -> ParserK (Array a) m b Source #

Convert an element Parser to a chunked ParserK. A chunked parser is more efficient than an element parser.

Pre-release

adaptCG :: Monad m => Parser a m b -> ParserK (Array a) m b Source #

A generic adaptC. Similar to adaptC but is not constrained to Unbox types.

Pre-release

Without Input

fromPure :: b -> ParserK a m b Source #

A parser that always yields a pure value without consuming any input.

Pre-release

fromEffect :: Monad m => m b -> ParserK a m b Source #

See fromEffect.

Pre-release

die :: String -> ParserK a m b Source #

A parser that always fails with an error message without consuming any input.

Pre-release

Deprecated

fromFold :: (MonadIO m, Unbox a) => Fold m a b -> ParserK (Array a) m b Source #

Deprecated: Please use "ParserK.adaptC . Parser.fromFold" instead.

fromParser :: (MonadIO m, Unbox a) => Parser a m b -> ParserK (Array a) m b Source #

Deprecated: Please use "adaptC" instead.