parser-regex: Regex based parsers

[ bsd3, library, parsing ] [ Propose Tags ]

Regex based parsers.


[Skip to Readme]

Downloads

Maintainer's Corner

Package maintainers

For package maintainers and hackage trustees

Candidates

  • No Candidates
Versions [RSS] 0.1.0.0
Change log CHANGELOG.md
Dependencies base (>=4.15 && <5.0), bytestring (>=0.10.12 && <0.13), containers (>=0.6.4 && <0.8), deepseq (>=1.4.5 && <1.6), ghc-bignum (>=1.1 && <1.4), primitive (>=0.7.3 && <0.10), text (>=2.0.1 && <2.2), transformers (>=0.5.6 && <0.7) [details]
License BSD-3-Clause
Author Soumik Sarkar
Maintainer soumiksarkar.3120@gmail.com
Category Parsing
Home page https://github.com/meooow25/parser-regex
Bug tracker https://github.com/meooow25/parser-regex/issues
Source repo head: git clone https://github.com/meooow25/parser-regex.git
Uploaded by meooow at 2024-03-04T16:19:42Z
Distributions
Downloads 22 total (6 in the last 30 days)
Rating (no votes yet) [estimated by Bayesian average]
Your Rating
  • λ
  • λ
  • λ
Status Docs available [build log]
Last success reported on 2024-03-04 [all 1 reports]

Readme for parser-regex-0.1.0.0

[back to package description]

parser-regex

Hackage Haskell-CI

Regex based parsers

Features

  • Parsers based on regular expressions, capable of parsing regular languages. There are no extra features that would make parsing non-regular languages possible.
  • Regexes are composed using combinators.
  • Resumable parsing of sequences of any type containing values of any type.
  • Special support for Text and String in the form of convenient combinators and operations like find and replace.
  • Parsing runtime is linear in the length of the sequence being parsed. No exponential backtracking.

Example

{-# LANGUAGE OverloadedStrings #-}
import Control.Applicative (optional)
import Data.Text (Text)

import Regex.Text (REText)
import qualified Regex.Text as R
import qualified Data.CharSet as CS

data URI = URI
  { scheme    :: Maybe Text
  , authority :: Maybe Text
  , path      :: Text
  , query     :: Maybe Text
  , fragment  :: Maybe Text
  } deriving Show

-- ^(([^:/?#]+):)?(//([^/?#]*))?([^?#]*)(\?([^#]*))?(#(.*))?
-- A non-validating regex to extract parts of a URI, from RFC 3986
-- Translated:
uriRE :: REText URI
uriRE = URI
  <$> optional (R.someTextOf (CS.not ":/?#") <* R.char ':')
  <*> optional (R.text "//" *> R.manyTextOf (CS.not "/?#"))
  <*> R.manyTextOf (CS.not "?#")
  <*> optional (R.char '?' *> R.manyTextOf (CS.not "#"))
  <*> optional (R.char '#' *> R.manyText)
>>> R.reParse uriRE "https://github.com/meooow25/parser-regex?tab=readme-ov-file#parser-regex"
Just (URI { scheme = Just "https"
          , authority = Just "github.com"
          , path = "/meooow25/parser-regex"
          , query = Just "tab=readme-ov-file"
          , fragment = Just "parser-regex" })

Documentation

Please find the documentation on Hackage: parser-regex

Already familiar with regex patterns? See the Regex pattern cheat sheet.

Alternatives

regex-applicative

regex-applicative is the primary inspiration for this library, and provides a similar set of features. parser-regex attempts to be a more fully-featured library built on the ideas of regex-applicative.

Traditional regex libraries

Other alternatives are more traditional regex libraries that use regex patterns, like regex-tdfa and regex-pcre/ regex-pcre-builtin.

Reasons to use parser-regex over traditional regex libraries:

  • You prefer parser combinators over regex patterns
  • You need more powerful parsing capabilities than just submatch extraction
  • You need to parse a sequence type that is not supported by these regex libraries

Reasons to use traditional regex libraries over parser-regex:

  • The terseness of regex patterns is better suited for your use case
  • You need something very fast, and adversarial input is not a concern. Use regex-pcre/regex-pcre-builtin.

For a more detailed comparison of regex libraries, see here.

Contributing

Questions, bug reports, documentation improvements, code contributions welcome! Please open an issue as the first step.