little-earley: Simple implementation of Earley parsing

[ library, mit, parsing ] [ Propose Tags ]

A little Earley parser.

Also some utilities for visualizing parse trees.

See README.


[Skip to Readme]
Versions [RSS] [faq] 0.1.0.0, 0.2.0.0
Change log CHANGELOG.md
Dependencies base (>=4.14.1.0 && <4.17), containers, mtl [details]
License MIT
Copyright 2021 Li-yao Xia
Author Li-yao Xia
Maintainer lysxia@gmail.com
Revised Revision 1 made by lyxia at 2021-11-13T18:24:59Z
Category Parsing
Bug tracker https://gitlab.com/lysxia/little-earley/-/issues
Source repo head: git clone https://gitlab.com/lysxia/little-earley
Uploaded by lyxia at 2021-05-29T17:42:37Z
Distributions NixOS:0.2.0.0
Downloads 119 total (38 in the last 30 days)
Rating (no votes yet) [estimated by Bayesian average]
Your Rating
  • λ
  • λ
  • λ
Status Hackage Matrix CI
Docs available [build log]
Last success reported on 2021-05-29 [all 1 reports]

Modules

[Index] [Quick Jump]

Downloads

Note: This package has metadata revisions in the cabal description newer than included in the tarball. To unpack the package including the revisions, use 'cabal get'.

Maintainer's Corner

For package maintainers and hackage trustees

Candidates


Readme for little-earley-0.2.0.0

[back to package description]

A little Earley parser

Parser for context-free grammars.

Example

The following module defines a grammar for arithmetic expressions (like 1+2*3). It defines:

  • A type of non-terminal symbols Ns.
  • A type of terminal symbols Ts.
  • The production rules of the grammar arithRules.
  • A matching function, which interprets terminal symbols Ts as sets of lexemes, of type Char here.
  • A grammar arithG packaging up all of the above.
import Data.Char (isDigit)
import Little.Earley

data Ns = SUM | PRODUCT | FACTOR | NUMBER deriving (Eq, Ord, Enum, Bounded, Show)
data Ts = Digit | OneOf [Char] deriving (Eq, Ord, Show)

arithRules :: Ns -> [Rule Ns Ts]
arithRules n = case n of
  SUM ->
    [ [ N PRODUCT ]
    , [ N SUM, T (OneOf ['+', '-']), N PRODUCT ] ]
  PRODUCT ->
    [ [ N FACTOR ]
    , [ N PRODUCT, T (OneOf ['*', '/']), N FACTOR ] ]
  FACTOR ->
    [ [ N NUMBER ]
    , [ T (OneOf ['(']), N SUM, T (OneOf [')']) ] ]
  NUMBER ->
    [ [ T Digit ]
    , [ T Digit, N NUMBER ] ]

matchTs :: Ts -> Char -> Bool
matchTs Digit = isDigit
matchTs (OneOf s) = (`elem` s)

arithG :: Grammar Ns Ts Char
arithG = mkGrammar arithRules matchTs

Load that file in GHCi:

ghci -package little-earley example.hs

Parse a string using that grammar:

> pparse arithG SUM "1+2*3"

Output:

     +-----+--SUM #1---+
     |     |           |
  SUM #0   |      +PRODUCT #1-+
     |     |      |     |     |
PRODUCT #0 | PRODUCT #0 |     |
     |     |      |     |     |
 FACTOR #0 |  FACTOR #0 | FACTOR #0
     |     |      |     |     |
 NUMBER #0 |  NUMBER #0 | NUMBER #0
     |     |      |     |     |
-----------------------------------
     1     +      2     *     3

The module Little.Earley.Examples contains more examples of grammars using this library.

  • This Earley parser implementation is based on Loup Vaillant's tutorial.
  • See also the Earley library also in Haskell, with much fancier types to encode arbitrary semantic actions instead of clumsy trees.