hlex-0.1.0: Simple Lexer Creation
Copyright(c) Sebastian Tee 2023
LicenseMIT
Safe HaskellSafe-Inferred
LanguageHaskell2010

Hlex

Description

Tools needed to create a Lexer from a lexical Grammar.

Synopsis

Example

Here is an example module for a simple language.

  module ExampleLang
       ( MyToken(..) -- Export the language's tokens and the lexer
       , myLexer
       ) where

  import Hlex

  data MyToken = Ident String -- String identifier token
               | Number Float -- Number token and numeric value
               | Assign       -- Assignment operator token
               deriving(Show)

  myGrammar :: Grammar MyToken
  myGrammar = [ JustToken "=" Assign                                       -- "=" Operator becomes the assign token
              , Tokenize "[a-zA-Z]+" (match -> Ident match)                -- Identifier token with string
              , Tokenize "[0-9]+(\.[0-9]+)?" (match -> Number (read match) -- Number token with the parsed numeric value stored as a Float
              , Skip "[ \n\r\t]+"                                          -- Skip whitespace
              ]

  myLexer :: Lexer MyToken
  myLexer = hlex myGrammar -- hlex turns a Grammar into a Lexer

Here is the lexer being used on a simple program.

>>> lexer "x = 1.2"
Right [Ident "x", Assign, Number 1.2]

The lexer uses Either. Right means the lexer successfully parsed the program to a list of MyTokens. If Left was returned it would be a LexException.

Types

type Grammar token = [TokenSyntax token] Source #

Lexical grammar made up of TokenSyntax rules.

The order is important. The Lexer will apply each TokenSyntax rule in the order listed.

data TokenSyntax token Source #

These are the individual rules that make up a Grammar.

Takes a POSIX regular expression then converts it to a token or skips it.

Constructors

Skip

Skips over any matches.

Fields

Tokenize

Takes a function that converts the matched string to a token.

Fields

  • String

    Regular expression.

  • (String -> token)

    Function that converts the matched string into a token.

JustToken

Converts any regular expression matches to a given token.

Fields

  • String

    Regular expression.

  • token

    Given token.

type Lexer token = String -> Either LexException [token] Source #

Converts a string into a list of tokens. If the string does not follow the Lexer's Grammar a LexException will be returned.

Exceptions

data LexException Source #

Exception thrown when a Lexer is unable to lex a string.

Constructors

LexException 

Fields

  • Int

    The line number where the string that couldn't be lexed is located.

  • Int

    The column where the string that couldn't be lexed is located.

  • String

    The String that couldn't be lexed.

Instances

Instances details
Read LexException Source # 
Instance details

Defined in Hlex

Show LexException Source # 
Instance details

Defined in Hlex

Eq LexException Source # 
Instance details

Defined in Hlex

Functions

hlex :: Grammar token -> Lexer token Source #

Takes a given Grammar and turns it into a Lexer.