Safe Haskell | Safe |
---|---|
Language | Haskell98 |
Parsec combinators are not composable by default. The idea is that this increases efficiency
by eliminating unnecessary backtracking. The practical problem with this approach is that it
is difficult to determine exactly where backtracking really is necessary and insert the needed
try
combinators. Any mistakes are almost always subtle and difficult to diagnose and fix.
Compute power is cheap and programmers are expensive, so it only makes sense to make it easy on the programmer first and on the computer second. This module is mostly full of composable drop-in replacements for Parsec combinators. There's also some renaming, to make the API more idiomatic. Also, some additional combinators are exported.
When migrating to this API, it is recommended to import Parsec modules qualified so that your code will be composable by default. Where efficiency really is needed, you can carefully but easily fall back to non-composable Parsec combinators by namespacing the appropriate combinators.
One place where we could not provide a replacement was the <|>
operator. This is exported by
the Alternative
typeclass, which we really don't want to mess with. The composable alternative
is spelled <||>
.
The re-named parsers are option
, optional
, optional_
, many_
, and many1_
. It may not
be easiest to switch old habits, but these names are more idiomatic and also reduce the
amount of typing necessary. Also, we've altered the semantics of manyTill
to make room for the
new manyThru
combinator. This gives the user an easy choice whether to consume the terminating
element. Finally, we've changed the type of notFollowedBy
to allow writing
x `notfollowedBy` y
in place of x <* notFollowedBy y
.
Below are some selected examples where this library is more intuitive:
- In Parsec,
string "aaa" <|> string "aa"
, will fail on input"aa"
. Using this module's<||>
will succeed. - In Parsec,
(char 'a' `sepBy` char ' ') *> optional (char ' ') *> eof
will fail on input"a a "
. Using this module'ssepBy
will succeed. Similar results hold forsepBy1
,chainl
,chainl1
,chainr
, andchainr1
. - In Parsec,
anyChar `manyTill` string "-->" *> eof
will fail on input"part1 -- part2-->"
. Using this module'smanyThru
will succeed with the same semantics. This module'smanyTill
will not consume the"-->"
. - In Parsec,
many (char
will fail on inputa
<* char ',') <* chara
<* eofa,a
. Using this module'smany
will succeed. Similar results hold formany1
,many_
,many1_
,endBy
andendby1
.
While we're at it, we've also re-exported applicative parsing functions and defined some of our
own combinators that have been found useful. Applicative parsing is recommended over monadic
parsing where it will suffice, so we'd rather eliminate the extra Control.Applicative
import.
Among the additional combinators defined here are dispatch
, atLeast
, atMost
, manyNM
,
chomp
, and between2
.
- (<$>) :: Functor f => (a -> b) -> f a -> f b
- (<$$>) :: Functor f => f a -> (a -> b) -> f b
- (<*>) :: Applicative f => forall a b. f (a -> b) -> f a -> f b
- (*>) :: Applicative f => forall a b. f a -> f b -> f b
- (<*) :: Applicative f => forall a b. f a -> f b -> f a
- (<**>) :: Applicative f => f a -> f (a -> b) -> f b
- (<$) :: Functor f => forall a b. a -> f b -> f a
- pure :: Applicative f => forall a. a -> f a
- void :: Functor f => f a -> f ()
- (<||>) :: Stream s m t => ParsecT s u m a -> ParsecT s u m a -> ParsecT s u m a
- choice :: Stream s m t => [ParsecT s u m a] -> ParsecT s u m a
- dispatch :: Stream s m t => [(ParsecT s u m a, ParsecT s u m b)] -> ParsecT s u m b
- longestOf :: Stream s m t => [ParsecT s u m a] -> ParsecT s u m a
- option :: Stream s m t => a -> ParsecT s u m a -> ParsecT s u m a
- optional :: Stream s m t => ParsecT s u m a -> ParsecT s u m (Maybe a)
- optional_ :: Stream s m t => ParsecT s u m a -> ParsecT s u m ()
- many :: Stream s m t => ParsecT s u m a -> ParsecT s u m [a]
- many1 :: Stream s m t => ParsecT s u m a -> ParsecT s u m [a]
- many_ :: Stream s m t => ParsecT s u m a -> ParsecT s u m ()
- many1_ :: Stream s m t => ParsecT s u m a -> ParsecT s u m ()
- count :: Stream s m t => Int -> ParsecT s u m a -> ParsecT s u m [a]
- atLeast :: Stream s m t => Int -> ParsecT s u m a -> ParsecT s u m [a]
- atMost :: Stream s m t => Int -> ParsecT s u m a -> ParsecT s u m [a]
- manyNM :: Stream s m t => Int -> Int -> ParsecT s u m a -> ParsecT s u m [a]
- manyOf :: Stream s m t => [ParsecT s u m a] -> ParsecT s u m [a]
- manyOf_ :: Stream s m t => [ParsecT s u m a] -> ParsecT s u m ()
- manyTill :: Stream s m t => ParsecT s u m a -> ParsecT s u m end -> ParsecT s u m [a]
- manyThru :: Stream s m t => ParsecT s u m a -> ParsecT s u m end -> ParsecT s u m [a]
- chomp :: Stream s m t => ParsecT s u m a -> ParsecT s u m trash -> ParsecT s u m a
- between :: Stream s m t => ParsecT s u m open -> ParsecT s u m close -> ParsecT s u m a -> ParsecT s u m a
- between2 :: Stream s m t => ParsecT s u m around -> ParsecT s u m a -> ParsecT s u m a
- sepBy :: Stream s m t => ParsecT s u m a -> ParsecT s u m sep -> ParsecT s u m [a]
- sepBy1 :: Stream s m t => ParsecT s u m a -> ParsecT s u m sep -> ParsecT s u m [a]
- sepEndBy :: Stream s m t => ParsecT s u m a -> ParsecT s u m sep -> ParsecT s u m [a]
- sepEndBy1 :: Stream s m t => ParsecT s u m a -> ParsecT s u m sep -> ParsecT s u m [a]
- endBy :: Stream s m t => ParsecT s u m a -> ParsecT s u m sep -> ParsecT s u m [a]
- endBy1 :: Stream s m t => ParsecT s u m a -> ParsecT s u m sep -> ParsecT s u m [a]
- sepAroundBy :: Stream s m t => ParsecT s u m a -> ParsecT s u m sep -> ParsecT s u m [a]
- sepAroundBy1 :: Stream s m t => ParsecT s u m a -> ParsecT s u m sep -> ParsecT s u m [a]
- chainl :: Stream s m t => ParsecT s u m a -> ParsecT s u m (a -> a -> a) -> a -> ParsecT s u m a
- chainl1 :: Stream s m t => ParsecT s u m a -> ParsecT s u m (a -> a -> a) -> ParsecT s u m a
- chainr :: Stream s m t => ParsecT s u m a -> ParsecT s u m (a -> a -> a) -> a -> ParsecT s u m a
- chainr1 :: Stream s m t => ParsecT s u m a -> ParsecT s u m (a -> a -> a) -> ParsecT s u m a
- lookAhead :: Stream s m t => ParsecT s u m a -> ParsecT s u m a
- lookAhead_ :: Stream s m t => ParsecT s u m a -> ParsecT s u m ()
- notFollowedBy :: (Stream s m t, Show trash) => ParsecT s u m a -> ParsecT s u m trash -> ParsecT s u m a
- atEndOfInput :: (Stream s m t, Show t) => ParsecT s u m Bool
- endOfInput :: (Stream s m t, Show t) => ParsecT s u m ()
- allInput :: (Stream s m t, Show t) => ParsecT s u m a -> ParsecT s u m a
- withRemainingInput :: Stream s m t => ParsecT s u m a -> ParsecT s u m (a, s)
- (<?>) :: ParsecT s u m a -> String -> ParsecT s u m a
- expect :: Stream s m t => String -> ParsecT s u m a -> ParsecT s u m a
- withPosition :: Stream s m t => ParsecT s u m a -> ParsecT s u m (a, SourcePos)
- withPositionEnd :: Stream s m t => ParsecT s u m a -> ParsecT s u m (a, SourcePos)
- withPositions :: Stream s m t => ParsecT s u m a -> ParsecT s u m (SourcePos, a, SourcePos)
- try :: ParsecT s u m a -> ParsecT s u m a
- (<|>) :: ParsecT s u m a -> ParsecT s u m a -> ParsecT s u m a
- unexpected :: Stream s m t => String -> ParsecT s u m a
Applicative Parsing
(<$>) :: Functor f => (a -> b) -> f a -> f b infixl 4
An infix synonym for fmap
.
Examples
Convert from a
to a Maybe
Int
using Maybe
String
show
:
>>>
show <$> Nothing
Nothing>>>
show <$> Just 3
Just "3"
Convert from an
to an Either
Int
Int
Either
Int
String
using show
:
>>>
show <$> Left 17
Left 17>>>
show <$> Right 17
Right "17"
Double each element of a list:
>>>
(*2) <$> [1,2,3]
[2,4,6]
Apply even
to the second element of a pair:
>>>
even <$> (2,2)
(2,True)
(<$$>) :: Functor f => f a -> (a -> b) -> f b infixl 4 Source
Flipped <$>
.
Great for parsing infixes, e.g. addExpr = expr <$$> (+) <*> expr
.
(<*>) :: Applicative f => forall a b. f (a -> b) -> f a -> f b
Sequential application.
(*>) :: Applicative f => forall a b. f a -> f b -> f b
Sequence actions, discarding the value of the first argument.
(<*) :: Applicative f => forall a b. f a -> f b -> f a
Sequence actions, discarding the value of the second argument.
(<**>) :: Applicative f => f a -> f (a -> b) -> f b infixl 4
A variant of <*>
with the arguments reversed.
pure :: Applicative f => forall a. a -> f a
Lift a value.
void :: Functor f => f a -> f ()
discards or ignores the result of evaluation, such
as the return value of an void
valueIO
action.
Examples
Replace the contents of a
with unit:Maybe
Int
>>>
void Nothing
Nothing>>>
void (Just 3)
Just ()
Replace the contents of an
with unit,
resulting in an Either
Int
Int
:Either
Int
'()'
>>>
void (Left 8675309)
Left 8675309>>>
void (Right 8675309)
Right ()
Replace every element of a list with unit:
>>>
void [1,2,3]
[(),(),()]
Replace the second element of a pair with unit:
>>>
void (1,2)
(1,())
Discard the result of an IO
action:
>>>
mapM print [1,2]
1 2 [(),()]>>>
void $ mapM print [1,2]
1 2
Choices
(<||>) :: Stream s m t => ParsecT s u m a -> ParsecT s u m a -> ParsecT s u m a infixl 3 Source
p <||> q
tries to parse p
, but if it fails, parses q
.
Unlike the Alternative
instance for ParsecT
, backtracking will occur if p
fails.
That is, a parser such as string "flange" <||> string "fly"
will succeed on
the input "fly"
, whereas its Parsec counterpart will unintuitively fail.
choice :: Stream s m t => [ParsecT s u m a] -> ParsecT s u m a Source
choice ps
tries to apply the parsers in the list ps
in order,
until one of them succeeds. Returns the value of the succeeding
parser. Unlike the Parsec version, this one ensures that parsers
do not consume input if they fail.
dispatch :: Stream s m t => [(ParsecT s u m a, ParsecT s u m b)] -> ParsecT s u m b Source
Given a map of parsers to parsers, attempt each key parser until one succeeds, then perform the value parser. Return the result of the value parser.
longestOf :: Stream s m t => [ParsecT s u m a] -> ParsecT s u m a Source
Attempt all of the passed parsers under the current conditions and return the value of the parser which makes it furthest into the input stream (and updates the parser's internals as if that were the only parser parsed).
longestOf [string "do", string "don't"]
Zero or One
option :: Stream s m t => a -> ParsecT s u m a -> ParsecT s u m a Source
option x p
tries to apply parser p
. If p
fails, no input is consumed and x
is returned.
optional :: Stream s m t => ParsecT s u m a -> ParsecT s u m (Maybe a) Source
optional p
tries to parse p
, but does not fail or consume input of p
fails.
This is like optionMaybe
, but is easier to type. See optional_
.
optional_ :: Stream s m t => ParsecT s u m a -> ParsecT s u m () Source
optional_ p
tries to parse p
, but does not fail or consume input if p
fails.
This is like Parsec's Text.Parsec.Combinator.optional', but the use of underscore is more idiomatic for actions whose results are ignored.
Many
many :: Stream s m t => ParsecT s u m a -> ParsecT s u m [a] Source
many p
applies the parser p
zero or more times and accumulates the result.
many1 :: Stream s m t => ParsecT s u m a -> ParsecT s u m [a] Source
many p
applies the parser p
one or more times and accumulates the result.
many_ :: Stream s m t => ParsecT s u m a -> ParsecT s u m () Source
many1_ p
applies the parser p
zero or more times, skipping
its result.
many1_ :: Stream s m t => ParsecT s u m a -> ParsecT s u m () Source
many1_ p
applies the parser p
one or more times, skipping
its result.
count :: Stream s m t => Int -> ParsecT s u m a -> ParsecT s u m [a]
count n p
parses n
occurrences of p
. If n
is smaller or
equal to zero, the parser equals to return []
. Returns a list of
n
values returned by p
.
atLeast :: Stream s m t => Int -> ParsecT s u m a -> ParsecT s u m [a] Source
atLeast n p
applies the parser p
n
or more times.
atMost :: Stream s m t => Int -> ParsecT s u m a -> ParsecT s u m [a] Source
atMost n p
applies the parser p
up to n
times.
manyNM :: Stream s m t => Int -> Int -> ParsecT s u m a -> ParsecT s u m [a] Source
manyNM n m p
applies the parser p
n
or more times up to m
times.
manyOf :: Stream s m t => [ParsecT s u m a] -> ParsecT s u m [a] Source
Parse zero or more of any mix of the passed parsers.
manyOf_ :: Stream s m t => [ParsecT s u m a] -> ParsecT s u m () Source
As manyOf
, but ignoring the results.
Common Structures
Terminate
manyTill :: Stream s m t => ParsecT s u m a -> ParsecT s u m end -> ParsecT s u m [a] Source
manyTill p end
applies parser p
zero or more times until
parser end
succeeds. Returns the list of values returned by p
.
The end
parser does not consume input, c.f. manyThru
.
This parser can be used to scan comments:
simpleComment = do { string "//" ; anyChar `manyTill` char '\n' }
Note that despite the overlapping parsers anyChar
and char '\n'
,
there is never a need to add a try
: the end
parser does not consume
input on failure.
manyThru :: Stream s m t => ParsecT s u m a -> ParsecT s u m end -> ParsecT s u m [a] Source
manyThru p end
applies parser p
zero or more times until
parser end
succeeds. Returns the list of values returned by p
.
The end
parser does consume input, c.f. manyTill
, but is not
included in the result.
This parser can be used to scan comments:
simpleComment = do { string "<!--" ; anyChar `manyThru` string "-->" }
Note that despite the overlapping parsers anyChar
and string "-->"
,
there is no need to add a try
: the end
parser does not consume input
on failure.
chomp :: Stream s m t => ParsecT s u m a -> ParsecT s u m trash -> ParsecT s u m a Source
chomp p x
will parse p
, then, provided x
succeeds, discard a
subsequent parse of x
.
This combinator will only fail when p
fails, not when x
does.
Surround
between :: Stream s m t => ParsecT s u m open -> ParsecT s u m close -> ParsecT s u m a -> ParsecT s u m a
between open close p
parses open
, followed by p
and close
.
Returns the value returned by p
.
braces = between (symbol "{") (symbol "}")
between2 :: Stream s m t => ParsecT s u m around -> ParsecT s u m a -> ParsecT s u m a Source
between2 p q
is equivalent to between
p p q
double_quoted = between2 (char '"')
Intercalate
sepBy :: Stream s m t => ParsecT s u m a -> ParsecT s u m sep -> ParsecT s u m [a] Source
sepBy p sep
parses zero or more occurrences of p
, separated
by sep
. Returns a list of values returned by p
.
commaSep p = p `sepBy` (symbol ",")
sepBy1 :: Stream s m t => ParsecT s u m a -> ParsecT s u m sep -> ParsecT s u m [a] Source
sepBy1 p sep
parses one or more occurrences of p
, separated
by sep
. Returns a list of values returned by p
.
sepEndBy :: Stream s m t => ParsecT s u m a -> ParsecT s u m sep -> ParsecT s u m [a] Source
sepEndBy p sep
parses zero or more occurrences of p
,
separated and optionally ended by sep
.
haskellStatements = haskellStatement `sepEndBy` semi
sepEndBy1 :: Stream s m t => ParsecT s u m a -> ParsecT s u m sep -> ParsecT s u m [a] Source
sepEndBy1 p sep
parses one or more occurrences of p
,
separated and optionally ended by sep
.
endBy :: Stream s m t => ParsecT s u m a -> ParsecT s u m sep -> ParsecT s u m [a] Source
endBy p sep
parses zero or more occurrences of p
, seperated
and ended by sep
.
cStatements = cStatement `endBy` semi
endBy1 :: Stream s m t => ParsecT s u m a -> ParsecT s u m sep -> ParsecT s u m [a] Source
endBy1 p sep
parses one or more occurrences of p
, seperated
and ended by sep
.
sepAroundBy :: Stream s m t => ParsecT s u m a -> ParsecT s u m sep -> ParsecT s u m [a] Source
sepAroundBy p sep
parses zero or more occurrences of p
,
separated and optionally starting with and ended by sep
.
sepAroundBy1 :: Stream s m t => ParsecT s u m a -> ParsecT s u m sep -> ParsecT s u m [a] Source
sepAroundBy1 p sep
parses one or more occurrences of p
,
separated and optionally starting with and ended by sep
.
Chaining
chainl :: Stream s m t => ParsecT s u m a -> ParsecT s u m (a -> a -> a) -> a -> ParsecT s u m a Source
chainl p op z
parses zero or more occurrences of p
,
separated by op
. Returns a value obtained by a left associative
application of all functions returned by op
to the values returned
by p
. If there are zero occurrences of p
, the value z
is
returned. C.f. chainr
.
chainl1 :: Stream s m t => ParsecT s u m a -> ParsecT s u m (a -> a -> a) -> ParsecT s u m a Source
chainl1 p op
parses one or more occurrences of p
,
separated by op
Returns a value obtained by a left associative
application of all functions returned by op
to the values returned
by p
. This parser can for example be used to eliminate left
recursion which typically occurs in expression grammars.
expr = term `chainl1` mulop term = factor `chainl1` addop factor = parens expr <|> integer mulop = do{ symbol "*"; return (*) } <|> do{ symbol "/"; return (div) } addop = do{ symbol "+"; return (+) } <|> do{ symbol "-"; return (-) }
C.f. chainr1
.
chainr :: Stream s m t => ParsecT s u m a -> ParsecT s u m (a -> a -> a) -> a -> ParsecT s u m a Source
chainr p op z
parses zero or more occurrences of p
,
separated by op
Returns a value obtained by a right associative
application of all functions returned by op
to the values returned
by p
. If there are no occurrences of p
, the value z
is
returned. C.f. chainl
.
chainr1 :: Stream s m t => ParsecT s u m a -> ParsecT s u m (a -> a -> a) -> ParsecT s u m a Source
chainr1 p op
parses one or more occurrences of p
,
separated by op
Returns a value obtained by a right associative
application of all functions returned by op
to the values returned
by p
. C.f. chainl1
.
Lookahead
lookAhead :: Stream s m t => ParsecT s u m a -> ParsecT s u m a Source
lookAhead p
parses p
without consuming any input, even if p
fails.
lookAhead_ :: Stream s m t => ParsecT s u m a -> ParsecT s u m () Source
As lookAhead
, but throw away result.
notFollowedBy :: (Stream s m t, Show trash) => ParsecT s u m a -> ParsecT s u m trash -> ParsecT s u m a Source
notFollowedBy p q
parses p
, but only when q
will fail immediately
after parsing p
. Parsing q
never consumes input, and if this
combinator fails, no input is consumed.
This combinator can be used to implement the 'longest match' rule.
For example, when recognizing keywords (for example let
), we want
to make sure that a keyword is not followed by a legal identifier
character, in which case the keyword is actually an identifier
(for example lets
). We can program this behavior as follows:
keywordLet = string "let" `notFollowedBy` alphaNum
atEndOfInput :: (Stream s m t, Show t) => ParsecT s u m Bool Source
endOfInput :: (Stream s m t, Show t) => ParsecT s u m () Source
Succeed only when at the end of the input stream.
Input Stream
allInput :: (Stream s m t, Show t) => ParsecT s u m a -> ParsecT s u m a Source
Uses the passed parser, but succeeds only if it consumes all of the input.
withRemainingInput :: Stream s m t => ParsecT s u m a -> ParsecT s u m (a, s) Source
Parse using the passed parser, but also return the input that was not consumed.
Additional Data
(<?>) :: ParsecT s u m a -> String -> ParsecT s u m a infix 0
The parser p <?> msg
behaves as parser p
, but whenever the
parser p
fails without consuming any input, it replaces expect
error messages with the expect error message msg
.
This is normally used at the end of a set alternatives where we want
to return an error message in terms of a higher level construct
rather than returning all possible characters. For example, if the
expr
parser from the try
example would fail, the error
message is: '...: expecting expression'. Without the (<?>)
combinator, the message would be like '...: expecting "let" or
letter', which is less friendly.
withPosition :: Stream s m t => ParsecT s u m a -> ParsecT s u m (a, SourcePos) Source
Annotate the return value of the passed parser with the position just before parsing.
withPositionEnd :: Stream s m t => ParsecT s u m a -> ParsecT s u m (a, SourcePos) Source
Annotate the return value of the passed parser with the position just after parsing.
withPositions :: Stream s m t => ParsecT s u m a -> ParsecT s u m (SourcePos, a, SourcePos) Source
Annotate the return value of the passed parser with the position just before and after parsing respectively.
Re-exports
try :: ParsecT s u m a -> ParsecT s u m a
The parser try p
behaves like parser p
, except that it
pretends that it hasn't consumed any input when an error occurs.
This combinator is used whenever arbitrary look ahead is needed.
Since it pretends that it hasn't consumed any input when p
fails,
the (<|>
) combinator will try its second alternative even when the
first parser failed while consuming input.
The try
combinator can for example be used to distinguish
identifiers and reserved words. Both reserved words and identifiers
are a sequence of letters. Whenever we expect a certain reserved
word where we can also expect an identifier we have to use the try
combinator. Suppose we write:
expr = letExpr <|> identifier <?> "expression" letExpr = do{ string "let"; ... } identifier = many1 letter
If the user writes "lexical", the parser fails with: unexpected
'x', expecting 't' in "let"
. Indeed, since the (<|>
) combinator
only tries alternatives when the first alternative hasn't consumed
input, the identifier
parser is never tried (because the prefix
"le" of the string "let"
parser is already consumed). The
right behaviour can be obtained by adding the try
combinator:
expr = letExpr <|> identifier <?> "expression" letExpr = do{ try (string "let"); ... } identifier = many1 letter
(<|>) :: ParsecT s u m a -> ParsecT s u m a -> ParsecT s u m a infixr 1
This combinator implements choice. The parser p <|> q
first
applies p
. If it succeeds, the value of p
is returned. If p
fails without consuming any input, parser q
is tried. This
combinator is defined equal to the mplus
member of the MonadPlus
class and the (<|>
) member of Alternative
.
The parser is called predictive since q
is only tried when
parser p
didn't consume any input (i.e.. the look ahead is 1).
This non-backtracking behaviour allows for both an efficient
implementation of the parser combinators and the generation of good
error messages.
unexpected :: Stream s m t => String -> ParsecT s u m a
The parser unexpected msg
always fails with an unexpected error
message msg
without consuming any input.
The parsers fail
, (<?>
) and unexpected
are the three parsers
used to generate error messages. Of these, only (<?>
) is commonly
used. For an example of the use of unexpected
, see the definition
of notFollowedBy
.