Safe Haskell | Safe-Inferred |
---|---|
Language | Haskell98 |
A parser-combinator library.
The primary goal in writing Parcom was to facilitate parsing Unicode string
data from various source streams, including raw ByteStrings. While Attoparsec
can parse ByteStrings, it sacrifices some convenience for performance, and
using it to parse textual data is not as comfortable as I would like; Parsec
handles textual data much better, but it needs the input to be converted to
Unicode for this to work nicely. Nonetheless, Parcom's interface is heavily
inspired by both Parsec and Attoparsec.
Parcom supports String, ByteString (lazy and strict) and Text (lazy and
strict) as its input format out of the box. By implementing one or more of
the typeclasses in Stream, you can extend Parcom to work on other input types
as well.
Documentation
module Text.Parcom.Prim
module Text.Parcom.Combinators
module Text.Parcom.Core
Getting Started
Parcom being a parser combinator library, the usual approach is to use predefined atomic parsers (defined in Text.Parcom.Prim and re-exported here for convenience) and combine them using predefined combinators (defined in Text.Parcom.Combinators). Anyone with prior exposure to Parsec or Attoparsec should be familiar with the concept. Here's an example that parses a value, which can be a positive integer literal or NULL:
    import Data.Char (isSpace)
    import Text.Parcom

    myParser :: Parcom String Char (Maybe Int)
    myParser = intLiteral <|> nullLiteral <?> "value (integer or NULL)"

    -- A positive integer literal: a non-zero digit followed by any digits.
    intLiteral :: Parcom String Char (Maybe Int)
    intLiteral = do
        x <- oneOf ['1'..'9']
        xs <- many (oneOf ['0'..'9'])
        return $ Just $ read (x:xs)

    -- The keyword NULL, not directly followed by a non-space character.
    nullLiteral :: Parcom String Char (Maybe Int)
    nullLiteral = do
        tokens "NULL"
        notFollowedBy (satisfy (not . isSpace))
        return Nothing
Such a parser can then be run against some input using parse or parseT, the
monadic equivalent.
    main = do
        src <- getContents
        let parseResult = parse myParser "<STDIN>" src
        case parseResult of
            Left err -> do
                putStrLn "Sorry, there has been an error, namely:"
                print err
            Right (Just i) -> putStrLn $ "Found an integer value: " ++ show i
            Right Nothing -> putStrLn "Found NULL"
Backtracking
As you build more complex parsers, you may encounter situations where a
parser fails after having consumed some input already. Combining such a
parser with other alternatives will yield undesired results: the parser
fails, but it will not push the input it has already consumed back onto the
input stream. To fix this, use the try primitive, which modifies a parser
such that when it fails, it undoes any input consumption it may have caused.
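As a minimal sketch of the situation described above, consider two alternatives that share the prefix NULL. It uses only names that appear elsewhere on this page (tokens, try, `<|>`, and the `Parcom String Char` type); the `Terminator` type and the parser names are hypothetical, introduced purely for illustration, and the comments restate the behaviour described in this section rather than results checked against a particular Parcom version.

```haskell
import Text.Parcom

-- Hypothetical result type, used only in this illustration.
data Terminator = Semicolon | Comma
    deriving (Show, Eq)

-- Matches "NULL;". On the input "NULL," this parser consumes "NULL"
-- and only then fails, because ';' is missing.
nullSemicolon :: Parcom String Char Terminator
nullSemicolon = do
    tokens "NULL"
    tokens ";"
    return Semicolon

-- Matches "NULL,".
nullComma :: Parcom String Char Terminator
nullComma = do
    tokens "NULL"
    tokens ","
    return Comma

-- Without try: after nullSemicolon fails on "NULL,", the consumed
-- "NULL" is not pushed back, so nullComma starts at "," and fails too.
withoutTry :: Parcom String Char Terminator
withoutTry = nullSemicolon <|> nullComma

-- With try: a failing nullSemicolon undoes its input consumption,
-- so nullComma sees the whole input again.
withTry :: Parcom String Char Terminator
withTry = try nullSemicolon <|> nullComma
```

Running `parse withTry "<example>" "NULL,"` is expected to take the second branch, whereas `withoutTry` on the same input should report an error. Wrapping only the first alternative is enough here, since nothing follows the last alternative that would need the input restored.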
Input types other than String
To support input from Text or ByteStrings, import one of the following
modules:
- Text.Parcom.Text or Text.Parcom.Text.Strict for strict Text
- Text.Parcom.Text.Lazy for lazy Text
- Text.Parcom.ByteString or Text.Parcom.ByteString.Strict for strict ByteString
- Text.Parcom.ByteString.Lazy for lazy ByteString
- Text.Parcom.Text.Lazy for lazy