Safe Haskell | Safe |
---|---|
Language | Haskell2010 |
Kleene algebra.
This package provides means to work with kleene algebra,
at the moment specifically concentrating on regular expressions over Char
.
Implements ideas from Regular-expression derivatives re-examined by Scott Owens, John Reppy and Aaron Turon https://doi.org/10.1017/S0956796808007090.
>>>
:set -XOverloadedStrings
>>>
import Algebra.Lattice
>>>
import Algebra.PartialOrd
>>>
import Data.Semigroup
>>>
import Kleene.Internal.Pretty (putPretty)
Kleene.RE module provides RE
type. Kleene.Classes module provides various
classes to work with the type. All of that is re-exported from Kleene module.
First let's construct a regular expression value:
>>>
let re = star "abc" <> "def" <> ("x" \/ "yz") :: RE Char
>>>
putPretty re
^(abc)*def(x|yz)$
We can convert it to DFA
(there are 8 states)
>>>
putPretty $ fromTM re
0 -> \x -> if | x <= '`' -> 8 | x <= 'a' -> 5 | x <= 'c' -> 8 | x <= 'd' -> 3 | otherwise -> 8 1 -> \x -> if | x <= 'w' -> 8 | x <= 'x' -> 6 | x <= 'y' -> 7 | otherwise -> 8 2 -> ... ...
And we can convert back from DFA
to RE
:
>>>
let re' = toKleene (fromTM re) :: RE Char
>>>
putPretty re'
^(a(bca)*bcdefx|defx|(a(bca)*bcdefy|defy)z)$
As you see, we don't get what we started with. Yet, these
regular expressions are equivalent
;
>>>
equivalent re re'
True
or using Equiv
wrapper
>>>
Equiv re == Equiv re'
True
(The paper doesn't outline decision procedure for the equivalence, though it's right there - seems to be fast enough at least for toy examples like here).
We can use regular expressions to generate word examples in the language:
>>>
import Data.Foldable
>>>
import qualified Test.QuickCheck as QC
>>>
import Kleene.RE (generate)
>>>
traverse_ print $ take 5 $ generate (curry QC.choose) 42 re
"abcabcabcabcabcabcdefyz" "abcabcabcabcdefyz" "abcabcabcabcabcabcabcabcabcdefx" "abcabcdefx" "abcabcabcabcabcabcdefyz"
In addition to the "normal" regular expressions, there are extended regular expressions.
Regular expressions which we can complement
, and therefore intersect:
>>>
let ere = star "aa" /\ star "aaa" :: ERE Char
>>>
putPretty ere
^~(~((aa)*)|~((aaa)*))$
We can convert ERE
to RE
via DFA
:
>>>
let re'' = toKleene (fromTM ere) :: RE Char
>>>
putPretty re''
^(a(aaaaaa)*aaaaa)?$
Machine works own ways, we don't (always) get as pretty results as we'd like:
>>>
equivalent re'' (star "aaaaaa")
True
Another feature of the library is an Applciative
Functor
,
>>>
import Control.Applicative
>>>
import qualified Kleene.Functor as F
>>>
let f = (,) <$> many (F.char 'x') <* F.few F.anyChar <*> many (F.char 'z')
>>>
putPretty f
^x*[^]*z*$
By relying on http://hackage.haskell.org/package/regex-applicative library, we can match and capture with regular expression.
>>>
F.match f "xyyzzz"
Just ("x","zzz")
Where with RE
we can only get True
or False
:
>>>
match (F.toRE f) "xyyzzz"
True
Which in this case is not even interesting because:
>>>
equivalent (F.toRE f) everything
True
Converting from RE
to K
is also possible, which may be handy:
>>>
let g = (,) <$> F.few F.anyChar <*> F.fromRE re''
>>>
putPretty g
^[^]*(a(aaaaaa)*aaaaa)?$
>>>
F.match g (replicate 20 'a')
Just ("aa","aaaaaaaaaaaaaaaaaa")
We got longest divisible by 6 prefix of as. That's because fromRE
uses many
for star
.
- data RE c
- data ERE c
- newtype Equiv r c = Equiv (r c)
- data DFA c = DFA {
- dfaTransition :: !(IntMap (SF c Int))
- dfaAcceptable :: !IntSet
- dfaBlackholes :: !IntSet
- fromTM :: forall k c. (Ord k, Ord c, TransitionMap c k) => k -> DFA c
- fromTMEquiv :: forall k c. (Ord k, Ord c, TransitionMap c k, Equivalent c k) => k -> DFA c
- toKleene :: forall k c. (Ord c, Enum c, Bounded c, FiniteKleene c k) => DFA c -> k
- class (BoundedJoinSemiLattice k, Semigroup k, Monoid k) => Kleene c k | k -> c where
- class Derivate c k | k -> c where
- class Match c k | k -> c where
- class Derivate c k => TransitionMap c k | k -> c where
- class Complement c k | k -> c where
- data K c a
Regular expressions
Regular expression
Constructors are exposed, but you should use
smart constructors in this module to construct RE
.
The Eq
and Ord
instances are structural.
The Kleene
etc constructors do "weak normalisation", so for values
constructed using those operations Eq
witnesses "weak equivalence".
See equivalent
for regular-expression equivalence.
Structure is exposed in Kleene.RE module but consider constructors as
half-internal. There are soft-invariants, but violating them shouldn't
break anything in the package. (e.g. transitionMap
will eventually
terminate, but may create more redundant states if starting regexp is not
"weakly normalised").
(Ord c, Enum c, Bounded c) => TransitionMap c (RE c) Source # | |
(Ord c, Enum c, Bounded c) => Equivalent c (RE c) Source # | |
(Ord c, Enum c, Bounded c) => Match c (RE c) Source # | |
(Ord c, Enum c, Bounded c) => Derivate c (RE c) Source # | |
(Ord c, Enum c, Bounded c) => FiniteKleene c (RE c) Source # | |
(Ord c, Enum c, Bounded c) => Kleene c (RE c) Source # | |
Eq c => Eq (RE c) Source # | |
Ord c => Ord (RE c) Source # | |
Show c => Show (RE c) Source # | |
(~) * c Char => IsString (RE c) Source # | |
Eq c => Semigroup (RE c) Source # | |
Eq c => Monoid (RE c) Source # | |
(Ord c, Enum c, Bounded c, Arbitrary c) => Arbitrary (RE c) Source # | |
CoArbitrary c => CoArbitrary (RE c) Source # | |
(Ord c, Enum c, Bounded c) => JoinSemiLattice (RE c) Source # | |
(Ord c, Enum c, Bounded c) => BoundedJoinSemiLattice (RE c) Source # | |
(~) * c Char => Pretty (RE c) Source # | |
Extended regular expression
It's both, Kleene and Boolean algebra. (If we add only intersections, it wouldn't be Boolean).
Note: we don't have special constructor for intersections. We use de Morgan formula \(a \land b = \neg (\neg a \lor \neg b)\).
>>>
putPretty $ asEREChar $ "a" /\ "b"
^~(~a|~b)$
There is no generator, as intersections
makes it hard.
Equivalance (and partial order)
Regular-expressions for which ==
is equivalent
.
>>>
let re1 = star "a" <> "a" :: RE Char
>>>
let re2 = "a" <> star "a" :: RE Char
>>>
re1 == re2
False
>>>
Equiv re1 == Equiv re2
True
Equiv
is also a PartialOrd
(but not Ord
!)
>>>
Equiv "a" `leq` Equiv (star "a" :: RE Char)
True
Not all regular expessions are comparable
:
>>>
let reA = Equiv "a" :: Equiv RE Char
>>>
let reB = Equiv "b" :: Equiv RE Char
>>>
(leq reA reB, leq reB reA)
(False,False)
Equiv (r c) |
Complement c (r c) => Complement c (Equiv r c) Source # | |
Equivalent c (r c) => Equivalent c (Equiv r c) Source # | |
Match c (r c) => Match c (Equiv r c) Source # | |
Derivate c (r c) => Derivate c (Equiv r c) Source # | |
Kleene c (r c) => Kleene c (Equiv r c) Source # | |
Equivalent c (r c) => Eq (Equiv r c) Source # | |
Show (r c) => Show (Equiv r c) Source # | |
Semigroup (r c) => Semigroup (Equiv r c) Source # | |
Monoid (r c) => Monoid (Equiv r c) Source # | |
JoinSemiLattice (r c) => JoinSemiLattice (Equiv r c) Source # | |
BoundedJoinSemiLattice (r c) => BoundedJoinSemiLattice (Equiv r c) Source # | |
(JoinSemiLattice (r c), Equivalent c (r c)) => PartialOrd (Equiv r c) Source # | \(a \preceq b := a \lor b = b \) |
Pretty (r c) => Pretty (Equiv r c) Source # | |
Deterministic finite automaton
Deterministic finite automaton.
A deterministic finite automaton (DFA) over an alphabet \(\Sigma\) (type
variable c
) is 4-tuple \(Q\), \(q_0\) , \(F\), \(\delta\), where
- \(Q\) is a finite set of states (subset of
Int
), - \(q_0 \in Q\) is the distinguised start state (
0
), - \(F \subset Q\) is a set of final (or accepting) states (
dfaAcceptable
), and - \(\delta : Q \times \Sigma \to Q\) is a function called the state
transition function (
dfaTransition
).
DFA | |
|
Complement c (DFA c) Source # | Complement DFA. Complement of
|
Ord c => Match c (DFA c) Source # | Run Because we have analysed a language, in some cases we can determine an input
without traversing all of the input.
That's not the cases with
Holds:
all (match (fromRE r)) $ take 10 $ RE.generate (curry QC.choose) 42 (r :: RE.RE Char) |
Show c => Show (DFA c) Source # | |
Show c => Pretty (DFA c) Source # | |
fromTM :: forall k c. (Ord k, Ord c, TransitionMap c k) => k -> DFA c Source #
Create from TransitionMap
.
See fromRE
for a specific example.
fromTMEquiv :: forall k c. (Ord k, Ord c, TransitionMap c k, Equivalent c k) => k -> DFA c Source #
Create from TransitonMap
minimising states with Equivalent
.
See fromERE
for an example.
Classes
Most operations are defined in following type-classes.
See Kleene.RE module for a specific version with examples.
class (BoundedJoinSemiLattice k, Semigroup k, Monoid k) => Kleene c k | k -> c where Source #
Empty regex. Doesn't accept anything.
Empty string. Note: different than empty
Single character
Concatenation.
Union.
Kleene star
class Match c k | k -> c where Source #
An f
can be used to match on the input.
(Ord c, Enum c) => Match c (ERE c) Source # | |
(Eq c, Enum c, Bounded c) => Match c (M c) Source # | |
(Ord c, Enum c, Bounded c) => Match c (RE c) Source # | |
Ord c => Match c (DFA c) Source # | Run Because we have analysed a language, in some cases we can determine an input
without traversing all of the input.
That's not the cases with
Holds:
all (match (fromRE r)) $ take 10 $ RE.generate (curry QC.choose) 42 (r :: RE.RE Char) |
Match c (r c) => Match c (Equiv r c) Source # | |
class Derivate c k => TransitionMap c k | k -> c where Source #
Transition map.
transitionMap :: k -> Map k (SF c k) Source #
class Complement c k | k -> c where Source #
complement :: k -> k Source #
Complement c (ERE c) Source # | |
Complement c (DFA c) Source # | Complement DFA. Complement of
|
Complement c (r c) => Complement c (Equiv r c) Source # | |
Functor
Only the type is exported so it can be referred to.
See Kleene.Functor for operations.
Applicative
Functor
regular expression.
Functor (K c) Source # | |
Applicative (K c) Source # | |
Alternative (K c) Source # | |
((~) * c Char, IsString a) => IsString (K c a) Source # | |
(~) * c Char => Pretty (K c a) Source # | Convert to non-matching JavaScript string which can be used
as an argument to
|