sbv-8.9: SMT Based Verification: Symbolic Haskell theorem prover using SMT solving.
Copyright(c) Levent Erkok
LicenseBSD3
Maintainererkokl@gmail.com
Stabilityexperimental
Safe HaskellNone
LanguageHaskell2010

Data.SBV.RegExp

Description

A collection of regular-expression related utilities. The recommended workflow is to import this module qualified as the names of the functions are specifically chosen to be common identifiers. Also, it is recommended you use the OverloadedStrings extension to allow literal strings to be used as symbolic-strings and regular-expressions when working with this module.

Synopsis

Regular expressions

data RegExp Source #

Regular expressions. Note that regular expressions themselves are concrete, but the match function from the RegExpMatchable class can check membership against a symbolic string/character. Also, we are preferring a datatype approach here, as opposed to coming up with some string-representation; there are way too many alternatives already so inventing one isn't a priority. Please get in touch if you would like a parser for this type as it might be easier to use.

Constructors

Literal String

Precisely match the given string

All

Accept every string

None

Accept no strings

Range Char Char

Accept range of characters

Conc [RegExp]

Concatenation

KStar RegExp

Kleene Star: Zero or more

KPlus RegExp

Kleene Plus: One or more

Opt RegExp

Zero or one

Loop Int Int RegExp

From n repetitions to m repetitions

Union [RegExp]

Union of regular expressions

Inter RegExp RegExp

Intersection of regular expressions

Instances

Instances details
Eq RegExp Source # 
Instance details

Defined in Data.SBV.Core.Symbolic

Methods

(==) :: RegExp -> RegExp -> Bool #

(/=) :: RegExp -> RegExp -> Bool #

Data RegExp Source # 
Instance details

Defined in Data.SBV.Core.Symbolic

Methods

gfoldl :: (forall d b. Data d => c (d -> b) -> d -> c b) -> (forall g. g -> c g) -> RegExp -> c RegExp #

gunfold :: (forall b r. Data b => c (b -> r) -> c r) -> (forall r. r -> c r) -> Constr -> c RegExp #

toConstr :: RegExp -> Constr #

dataTypeOf :: RegExp -> DataType #

dataCast1 :: Typeable t => (forall d. Data d => c (t d)) -> Maybe (c RegExp) #

dataCast2 :: Typeable t => (forall d e. (Data d, Data e) => c (t d e)) -> Maybe (c RegExp) #

gmapT :: (forall b. Data b => b -> b) -> RegExp -> RegExp #

gmapQl :: (r -> r' -> r) -> r -> (forall d. Data d => d -> r') -> RegExp -> r #

gmapQr :: forall r r'. (r' -> r -> r) -> r -> (forall d. Data d => d -> r') -> RegExp -> r #

gmapQ :: (forall d. Data d => d -> u) -> RegExp -> [u] #

gmapQi :: Int -> (forall d. Data d => d -> u) -> RegExp -> u #

gmapM :: Monad m => (forall d. Data d => d -> m d) -> RegExp -> m RegExp #

gmapMp :: MonadPlus m => (forall d. Data d => d -> m d) -> RegExp -> m RegExp #

gmapMo :: MonadPlus m => (forall d. Data d => d -> m d) -> RegExp -> m RegExp #

Num RegExp Source #

Regular expressions as a Num instance. Note that only + (union) and * (concatenation) make sense.

Instance details

Defined in Data.SBV.Core.Symbolic

Ord RegExp Source # 
Instance details

Defined in Data.SBV.Core.Symbolic

Show RegExp Source #

Show instance for RegExp. The mapping is done so the outcome matches the SMTLib string reg-exp operations

Instance details

Defined in Data.SBV.Core.Symbolic

IsString RegExp Source #

With overloaded strings, we can have direct literal regular expressions.

Instance details

Defined in Data.SBV.Core.Symbolic

Methods

fromString :: String -> RegExp #

Matching

A symbolic string or a character (SString or SChar) can be matched against a regular-expression. Note that the regular-expression itself is not a symbolic object: It's a fully concrete representation, as captured by the RegExp class. The RegExp class is an instance of the IsString class, which makes writing literal matches easier. The RegExp type also has a (somewhat degenerate) Num instance: Concatenation corresponds to multiplication, union corresponds to addition, and 0 corresponds to the empty language.

Note that since match is a method of RegExpMatchable class, both SChar and SString can be used as an argument for matching. In practice, this means you might have to disambiguate with a type-ascription if it is not deducible from context.

>>> prove $ \s -> (s :: SString) `match` "hello" .<=> s .== "hello"
Q.E.D.
>>> prove $ \s -> s `match` Loop 2 5 "xyz" .=> length s .>= 6
Q.E.D.
>>> prove $ \s -> s `match` Loop 2 5 "xyz" .=> length s .<= 15
Q.E.D.
>>> prove $ \s -> match s (Loop 2 5 "xyz") .=> length s .>= 7
Falsifiable. Counter-example:
  s0 = "xyzxyz" :: String
>>> prove $ \s -> (s :: SString) `match` "hello" .=> s `match` ("hello" + "world")
Q.E.D.
>>> prove $ \s -> sNot $ (s::SString) `match` ("so close" * 0)
Q.E.D.
>>> prove $ \c -> (c :: SChar) `match` oneOf "abcd" .=> ord c .>= ord (literal 'a') .&& ord c .<= ord (literal 'd')
Q.E.D.

class RegExpMatchable a where Source #

Matchable class. Things we can match against a RegExp. (TODO: Currently SBV does *not* optimize this call if the input is a concrete string or a character, but rather directly calls down to the solver. We might want to perform the operation on the Haskell side for performance reasons, should this become important.)

For instance, you can generate valid-looking phone numbers like this:

>>> :set -XOverloadedStrings
>>> let dig09 = Range '0' '9'
>>> let dig19 = Range '1' '9'
>>> let pre   = dig19 * Loop 2 2 dig09
>>> let post  = dig19 * Loop 3 3 dig09
>>> let phone = pre * "-" * post
>>> sat $ \s -> (s :: SString) `match` phone
Satisfiable. Model:
  s0 = "100-8000" :: String

Methods

match :: a -> RegExp -> SBool Source #

match s r checks whether s is in the language generated by r.

Instances

Instances details
RegExpMatchable SString Source #

Matching symbolic strings.

Instance details

Defined in Data.SBV.RegExp

Methods

match :: SString -> RegExp -> SBool Source #

RegExpMatchable SChar Source #

Matching a character simply means the singleton string matches the regex.

Instance details

Defined in Data.SBV.RegExp

Methods

match :: SChar -> RegExp -> SBool Source #

Constructing regular expressions

Literals

exactly :: String -> RegExp Source #

A literal regular-expression, matching the given string exactly. Note that with OverloadedStrings extension, you can simply use a Haskell string to mean the same thing, so this function is rarely needed.

>>> prove $ \(s :: SString) -> s `match` exactly "LITERAL" .<=> s .== "LITERAL"
Q.E.D.

A class of characters

oneOf :: String -> RegExp Source #

Helper to define a character class.

>>> prove $ \(c :: SChar) -> c `match` oneOf "ABCD" .<=> sAny (c .==) (map literal "ABCD")
Q.E.D.

Spaces

newline :: RegExp Source #

Recognize a newline. Also includes carriage-return and form-feed.

>>> newline
(re.union (str.to.re "\n") (str.to.re "\r") (str.to.re "\f"))
>>> prove $ \c -> c `match` newline .=> isSpace c
Q.E.D.

whiteSpaceNoNewLine :: RegExp Source #

Recognize white-space, but without a new line.

>>> whiteSpaceNoNewLine
(re.union (str.to.re "\x09") (re.union (str.to.re "\v") (str.to.re "\xa0") (str.to.re " ")))
>>> prove $ \c -> c `match` whiteSpaceNoNewLine .=> c `match` whiteSpace .&& c ./= literal '\n'
Q.E.D.

whiteSpace :: RegExp Source #

Recognize white space.

>>> prove $ \c -> c `match` whiteSpace .=> isSpace c
Q.E.D.

Separators

tab :: RegExp Source #

Recognize a tab.

>>> tab
(str.to.re "\x09")
>>> prove $ \c -> c `match` tab .=> c .== literal '\t'
Q.E.D.

punctuation :: RegExp Source #

Recognize a punctuation character. Anything that satisfies the predicate isPunctuation will be accepted. (TODO: Will need modification when we move to unicode.)

>>> prove $ \c -> c `match` punctuation .=> isPunctuation c
Q.E.D.

Letters

asciiLetter :: RegExp Source #

Recognize an alphabet letter, i.e., A..Z, a..z.

>>> asciiLetter
(re.union (re.range "a" "z") (re.range "A" "Z"))
>>> prove $ \c -> c `match` asciiLetter .<=> toUpper c `match` asciiLetter
Q.E.D.
>>> prove $ \c -> c `match` asciiLetter .<=> toLower c `match` asciiLetter
Q.E.D.

asciiLower :: RegExp Source #

Recognize an ASCII lower case letter

>>> asciiLower
(re.range "a" "z")
>>> prove $ \c -> (c :: SChar) `match` asciiLower  .=> c `match` asciiLetter
Q.E.D.
>>> prove $ \c -> c `match` asciiLower  .=> toUpper c `match` asciiUpper
Q.E.D.
>>> prove $ \c -> c `match` asciiLetter .=> toLower c `match` asciiLower
Q.E.D.

asciiUpper :: RegExp Source #

Recognize an upper case letter

>>> asciiUpper
(re.range "A" "Z")
>>> prove $ \c -> (c :: SChar) `match` asciiUpper  .=> c `match` asciiLetter
Q.E.D.
>>> prove $ \c -> c `match` asciiUpper  .=> toLower c `match` asciiLower
Q.E.D.
>>> prove $ \c -> c `match` asciiLetter .=> toUpper c `match` asciiUpper
Q.E.D.

Digits

digit :: RegExp Source #

Recognize a digit. One of 0..9.

>>> digit
(re.range "0" "9")
>>> prove $ \c -> c `match` digit .<=> let v = digitToInt c in 0 .<= v .&& v .< 10
Q.E.D.

octDigit :: RegExp Source #

Recognize an octal digit. One of 0..7.

>>> octDigit
(re.range "0" "7")
>>> prove $ \c -> c `match` octDigit .<=> let v = digitToInt c in 0 .<= v .&& v .< 8
Q.E.D.
>>> prove $ \(c :: SChar) -> c `match` octDigit .=> c `match` digit
Q.E.D.

hexDigit :: RegExp Source #

Recognize a hexadecimal digit. One of 0..9, a..f, A..F.

>>> hexDigit
(re.union (re.range "0" "9") (re.range "a" "f") (re.range "A" "F"))
>>> prove $ \c -> c `match` hexDigit .<=> let v = digitToInt c in 0 .<= v .&& v .< 16
Q.E.D.
>>> prove $ \(c :: SChar) -> c `match` digit .=> c `match` hexDigit
Q.E.D.

Numbers

decimal :: RegExp Source #

Recognize a decimal number.

>>> decimal
(re.+ (re.range "0" "9"))
>>> prove $ \s -> (s::SString) `match` decimal .=> sNot (s `match` KStar asciiLetter)
Q.E.D.

octal :: RegExp Source #

Recognize an octal number. Must have a prefix of the form 0o/0O.

>>> octal
(re.++ (re.union (str.to.re "0o") (str.to.re "0O")) (re.+ (re.range "0" "7")))
>>> prove $ \s -> s `match` octal .=> sAny (.== take 2 s) ["0o", "0O"]
Q.E.D.

hexadecimal :: RegExp Source #

Recognize a hexadecimal number. Must have a prefix of the form 0x/0X.

>>> hexadecimal
(re.++ (re.union (str.to.re "0x") (str.to.re "0X")) (re.+ (re.union (re.range "0" "9") (re.range "a" "f") (re.range "A" "F"))))
>>> prove $ \s -> s `match` hexadecimal .=> sAny (.== take 2 s) ["0x", "0X"]
Q.E.D.

floating :: RegExp Source #

Recognize a floating point number. The exponent part is optional if a fraction is present. The exponent may or may not have a sign.

>>> prove $ \s -> s `match` floating .=> length s .>= 3
Q.E.D.

Identifiers

identifier :: RegExp Source #

For the purposes of this regular expression, an identifier consists of a letter followed by zero or more letters, digits, underscores, and single quotes. The first letter must be lowercase.

>>> prove $ \s -> s `match` identifier .=> isAsciiLower (head s)
Q.E.D.
>>> prove $ \s -> s `match` identifier .=> length s .>= 1
Q.E.D.