inchworm: Simple parser combinators for lexical analysis.


Parser combinator framework specialized to lexical analysis. Tokens are specified via simple fold functions, and source location handling is built in. Comes with matchers for standard lexemes like integers, comments, and Haskell-style strings with escape handling. No dependencies other than the Haskell base library. If you want to parse expressions instead of tokens then try the parsec or attoparsec packages, which have more general-purpose combinators.



Modules


  • Text
    • Lexer
      • Text.Lexer.Inchworm
        • Text.Lexer.Inchworm.Char
        • Text.Lexer.Inchworm.Scanner
        • Text.Lexer.Inchworm.Source

Downloads

Note: This package has metadata revisions in the cabal description newer than included in the tarball. To unpack the package including the revisions, use 'cabal get'.


Versions: 1.0.0.1, 1.0.1.1, 1.0.2.1, 1.0.2.2, 1.0.2.3, 1.0.2.4, 1.1.1.1, 1.1.1.2
Change log: Changelog.md
Dependencies: base (>=4.8 && <4.18)
License: MIT
Author: The Inchworm Development Team
Maintainer: Ben Lippmeier <benl@ouroborus.net>
Revised: Revision 2 made by BenLippmeier at 2024-05-02T10:34:03Z
Category: Parsing
Home page: https://github.com/discus-lang/inchworm
Source repo: head: git clone https://github.com/discus-lang/inchworm.git
Uploaded: by BenLippmeier at 2019-01-02T03:42:31Z
Reverse Dependencies: 1 direct, 12 indirect
Downloads: 4630 total (4 in the last 30 days)
Status: Docs not available
Last success reported on 2019-01-02

Readme for inchworm-1.1.1.2


Inchworm

Inchworm is a simple parser combinator framework specialized to lexical analysis. Tokens are specified via simple fold functions, and source location handling is built in.

Matchers for standard tokens like comments and strings are in the Text.Lexer.Inchworm.Char module.

No dependencies other than the Haskell base library.

If you want to parse expressions instead of performing lexical analysis then try the parsec or attoparsec packages, which have more general-purpose combinators.

Minimal example

The following code demonstrates how to perform lexical analysis of a simple LISP-like language. We use two separate name classes: one for variables, which start with a lower-case letter, and one for constructors, which start with an upper-case letter.
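Each name class can be described by a predicate over a character's position and the character itself, which is the shape of the functions passed to munchWord in the listing further down. The sketch below illustrates that idea using only the base library. The names takeIndexed, isVarChar and isConChar are hypothetical and are not part of inchworm, and the code makes no claim about how the library implements its matchers.

```haskell
import qualified Data.Char as Char

-- | Hypothetical stand-in for an index-aware matcher (NOT inchworm's code):
--   split off the longest prefix whose characters all satisfy the
--   predicate at their position, and return it with the remaining input.
takeIndexed :: (Int -> Char -> Bool) -> String -> (String, String)
takeIndexed p = go 0
  where
    go ix (c : cs)
      | p ix c = let (acc, rest) = go (ix + 1) cs in (c : acc, rest)
    -- Reached on empty input or when the predicate rejects a character.
    go _ cs = ([], cs)

-- | Variable names: a lower-case letter followed by letters.
isVarChar :: Int -> Char -> Bool
isVarChar ix c = if ix == 0 then Char.isLower c else Char.isAlpha c

-- | Constructor names: an upper-case letter followed by letters.
isConChar :: Int -> Char -> Bool
isConChar ix c = if ix == 0 then Char.isUpper c else Char.isAlpha c
```

With these definitions, takeIndexed isVarChar "some (Lispy" evaluates to ("some", " (Lispy"), while takeIndexed isConChar on the same input matches nothing and returns the input unchanged.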

Integers are scanned using the scanInteger function from the Text.Lexer.Inchworm.Char module.

The result of scanStringIO contains the list of leftover input characters that could not be parsed. In a real lexer you should check that this is empty to ensure there has not been a lexical error.

import Text.Lexer.Inchworm.Char
import qualified Data.Char as Char

-- | A source token.
data Token 
        = KBra | KKet | KVar String | KCon String | KInt Integer
        deriving Show

-- | A thing with attached location information.
data Located a
        = Located FilePath (Range Location) a
        deriving Show

-- | Scanner for a lispy language.
scanner :: FilePath
        -> Scanner IO Location [Char] (Located Token)
scanner fileName
 = skip Char.isSpace
 $ alts [ fmap (stamp id)   $ accept '(' KBra
        , fmap (stamp id)   $ accept ')' KKet
        , fmap (stamp KInt) $ scanInteger 
        , fmap (stamp KVar)
          $ munchWord (\ix c -> if ix == 0 then Char.isLower c
                                           else Char.isAlpha c) 
        , fmap (stamp KCon) 
          $ munchWord (\ix c -> if ix == 0 then Char.isUpper c
                                           else Char.isAlpha c)
        ]
 where  -- Stamp a token with source location information.
        stamp k (range, t) 
          = Located fileName range (k t)

main :: IO ()
main 
 = do   let fileName = "Source.lispy"
        let source   = "(some (Lispy like) 26 Program 93 (for you))"
        toks <- scanStringIO source (scanner fileName)
        print toks
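
The check for leftover input mentioned before the listing could be sketched as follows. This assumes the scan result is a triple of the tokens, the final source location, and the unconsumed input; confirm that shape against the documentation for the version you use. The helper checkScan is hypothetical and is not part of inchworm.

```haskell
-- | Hypothetical helper, not part of inchworm.
--   ASSUMPTION: a scan result has the shape
--   (tokens, final location, leftover input).
--   Returns the tokens when all input was consumed, otherwise the
--   location where scanning stopped and a short excerpt of the
--   remaining input for use in an error message.
checkScan :: ([tok], loc, String) -> Either (loc, String) [tok]
checkScan (toks, loc, rest)
  | null rest = Right toks
  | otherwise = Left (loc, take 20 rest)
```

Applied to the scan result in main, a Left value would indicate a lexical error at the reported location, and a Right value would carry the complete token list.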