FormalGrammars-0.2.1.0: (Context-free) grammars in formal language theory

Safe HaskellNone
LanguageHaskell2010

FormalLanguage.CFG.Grammar.Types

Description

The data types that define a CFG.

Synopsis

Documentation

data IOP Source

Constructors

IPlus 
IMinus 
IEq 
INone 
ISymbol 

Instances

data Index Source

Encode the index of the syntactic or terminal variable.

In case of grammar-based indexing, keep indexRange empty. The indexStep keeps track of any +k / -k given in the production rules.

We allow indexing terminals now, too. When glueing together terminals, one might want to be able to differentiate between terminals.

newtype SymbolName Source

Newtype wrapper for symbol names.

Constructors

SymbolName 

Fields

_getSteName :: String
 

newtype Tape Source

The tape, a terminal operates on. Terminals on different tapes could still have the same SymbolName but different type and input!

Constructors

Tape 

Fields

_getTape :: Int
 

data SynTermEps Source

Symbols, potentially with an index or more than one.

Constructors

SynVar

Syntactic variables.

SynTerm 

Fields

_name :: SymbolName
 
_index :: [Index]
 
Term

Regular old terminal symbol -- reads stuff from the input.

Fields

_name :: SymbolName
 
_index :: [Index]
 
Deletion

This sym denotes the case, where we have an Deletion terminal, i.e. something is matched to nothing. This is actually just a regular terminal symbol, we just treat it differently.

Epsilon

Finally, a real epsilon. Again, these are somewhat regular terminal symbols, but it is important to be able to recognize these, when trying to create outside variants of our algorithms.

newtype Symbol Source

The length of the list encodes the dimension of the symbol. Forms a monoid over dimensional concatenation.

Constructors

Symbol 

data Rule Source

Production rules for at-most CFGs.

Constructors

Rule 

Fields

_lhs :: Symbol

the left-hand side of the rule

_attr :: [AttributeFunction]

the attribute for this rule

_rhs :: [Symbol]

the right-hand side with a collection of terminals and syntactic variables

data Grammar Source

Complete descrition of a grammar. In principle it would be enough to hold _rules and the _start symbol name. We also store dimensionless names for syntactiv variables, and terminals. This makes certain checks easier or possible.

We store all single-tape symbol names dimensionless. This means that, for terminals, symbols with the same name have the same tape. This is slightly inconvenient for special applications (say Protein-DNA alignment) but one can easily rename terminals.

TODO better way to handle indexed symbols?

Constructors

Grammar 

Fields

_synvars :: Map SymbolName SynTermEps

regular syntactic variables, without dimension

_synterms :: Map SymbolName SynTermEps

Terminal synvars are somewhat weird. They are used in Outside grammars, and hold previously calculated inside values.

_termvars :: Map SymbolName SynTermEps

regular terminal symbols

_outside :: DerivedGrammar

Is this an automatically derived outside grammar

_rules :: Set Rule

set of production rules

_start :: Symbol

start symbol

_params :: Map IndexName Index

any global variables

_indices :: Map IndexName Index

active indices

_grammarName :: String

grammar name

_write :: Bool

some grammar file requested this grammar to be expanded into code -- TODO remove, we have an emission queue