regex-with-pcre-1.0.1.0: Toolkit for regex-base

Safe HaskellNone
LanguageHaskell2010

Text.RE.PCRE

Contents

Synopsis

Tutorial

We have a regex tutorial at http://tutorial.regex.uk.

About this Module

This module provides access to the back end through polymorphic functions that operate over all of the String/Text/ByteString types supported by the back end. The module also provides all of the specialised back-end functionality that will not be needed by most regex clientts. If you don't need this generality then you might want to consider using one of the simpler modules that have been specialised for each of these types:

The Matches and Match Operators

(*=~) :: IsRegex RE s => s -> RE -> Matches s Source #

find all the matches in the argument text; e.g., to count the number of naturals in s:

countMatches $ s *=~ [re|[0-9]+|]

(?=~) :: IsRegex RE s => s -> RE -> Match s Source #

find the first match in the argument text; e.g., to test if there is a natural number in the input text:

matched $ s ?=~ [re|[0-9]+|]

The SearchReplace Operators

(*=~/) :: IsRegex RE s => s -> SearchReplace RE s -> s Source #

search and replace all matches in the argument text; e.g., this section will convert every YYYY-MM-DD format date in its argument text into a DD/MM/YYYY date:

(*=~/ [ed|${y}([0-9]{4})-0*${m}([0-9]{2})-0*${d}([0-9]{2})///${d}/${m}/${y}|])

(?=~/) :: IsRegex RE s => s -> SearchReplace RE s -> s Source #

search and replace the first occurrence only (if any) in the input text e.g., to prefix the first string of four hex digits in the imput text, if any, with 0x:

@(?=~/ [ed|[0-9A-Fa-f]{4}///0x$0|])

The Matches Type

data Matches a :: * -> * #

the result of matching a RE against a text (with *=~), retaining the text that was matched against

Instances

Functor Matches 

Methods

fmap :: (a -> b) -> Matches a -> Matches b #

(<$) :: a -> Matches b -> Matches a #

(RegexContext regex source [MatchText source], RegexLike regex source) => RegexContext regex source (Matches source)

this instance hooks Matches into regex-base: regex consumers need not worry about any of this

Methods

match :: regex -> source -> Matches source #

matchM :: Monad m => regex -> source -> m (Matches source) #

Eq a => Eq (Matches a) 

Methods

(==) :: Matches a -> Matches a -> Bool #

(/=) :: Matches a -> Matches a -> Bool #

Show a => Show (Matches a) 

Methods

showsPrec :: Int -> Matches a -> ShowS #

show :: Matches a -> String #

showList :: [Matches a] -> ShowS #

matchesSource :: Matches a -> a #

the source text being matched

allMatches :: Matches a -> [Match a] #

all Match instances found, left to right

anyMatches :: Matches a -> Bool #

tests whether the RE matched the source text at all

countMatches :: Matches a -> Int #

count the matches

matches :: Matches a -> [a] #

list the texts that Matched

The Match Type

data Match a :: * -> * #

the result of matching a RE to a text once (with ?=~), retaining the text that was matched against

Instances

Functor Match 

Methods

fmap :: (a -> b) -> Match a -> Match b #

(<$) :: a -> Match b -> Match a #

(RegexContext regex source (AllTextSubmatches (Array Int) (source, (Int, Int))), RegexLike regex source) => RegexContext regex source (Match source)

this instance hooks Match into regex-base: regex consumers need not worry about any of this

Methods

match :: regex -> source -> Match source #

matchM :: Monad m => regex -> source -> m (Match source) #

Eq a => Eq (Match a) 

Methods

(==) :: Match a -> Match a -> Bool #

(/=) :: Match a -> Match a -> Bool #

Show a => Show (Match a) 

Methods

showsPrec :: Int -> Match a -> ShowS #

show :: Match a -> String #

showList :: [Match a] -> ShowS #

matchSource :: Match a -> a #

the whole source text

matched :: Match a -> Bool #

tests whether the RE matched the source text at all

matchedText :: Match a -> Maybe a #

yields the text matched by the RE, Nothing if no match

The Macros and Parsers

There are a number of RE macros and corresponding Haskell parsers for parsing the matched text into appropriate Haskell types. See the Macros Tables for details.

The RE Type

data RE Source #

the RE type for this back end representing a well-formed, compiled RE

regexType :: RegexType Source #

some functions in the Text.RE.TestBench need the back end to be passed dynamically as a RegexType parameters: use regexType fpr this back end

reOptions :: RE -> REOptions Source #

extract the REOptions from the RE

reSource :: RE -> String Source #

extract the RE source string from the RE

reCaptureNames :: RE -> CaptureNames Source #

extract the CaptureNames from the RE

reRegex :: RE -> Regex Source #

extract the back end compiled Regex type from the RE

Options

You can specify different compilation options by appending a to the name of an [re| ... |] or [ed| ... /// ... |] quasi quoter to select the corresponding compilation option. For example, the section,

(?=~/ [edBlockInsensitive|foo$///bar|])

will replace a foo suffix of the argument text, of any capitalisation, with a (lower case) bar. If you need to specify the options dynamically, use the [re_| ... |] and [ed_| ... /// ... |] quasi quoters, which generate functions that take an IsOption option (e.g., a SimpleReOptions value) and yields a RE or SearchReplace as apropriate. For example if you have a SimpleReOptions value in sro then

(?=~/ [ed_|foo$///bar|] sro)

will compile the foo$ RE according to the value of sro. For more on specifying RE options see Text.RE.REOptions.

data SimpleREOptions :: * #

the default API uses these simple, universal RE options, which get auto-converted into the apropriate back-end REOptions_

Constructors

MultilineSensitive

case-sensitive with ^ and $ matching the start and end of a line

MultilineInsensitive

case-insensitive with ^ and $ matsh the start and end of a line

BlockSensitive

case-sensitive with ^ and $ matching the start and end of the input text

BlockInsensitive

case-insensitive with ^ and $ matching the start and end of the input text

Instances

Bounded SimpleREOptions 
Enum SimpleREOptions 
Eq SimpleREOptions 
Ord SimpleREOptions 
Show SimpleREOptions 
Lift SimpleREOptions

we need to use this in the quasi quoters to specify SimpleREOptions selected by the quasi quoter

Methods

lift :: SimpleREOptions -> Q Exp #

IsOption SimpleREOptions Source # 

class IsOption o where Source #

a number of types can be used to encode REOptions_, each of which is made a member of this class

Minimal complete definition

makeREOptions

Methods

makeREOptions :: o -> REOptions Source #

convert the o type into an REOptions

type REOptions = REOptions_ RE CompOption ExecOption Source #

and the REOptions for this back end (see Text.RE.REOptions for details)

noPreludeREOptions :: REOptions Source #

the default REOptions but with no RE macros defined

unpackSimpleREOptions :: SimpleREOptions -> REOptions Source #

convert a universal SimpleReOptions into the REOptions used by this back end

Compiling and Escaping REs

data SearchReplace re s :: * -> * -> * #

contains a compiled RE and replacement template

Constructors

SearchReplace 

Fields

  • getSearch :: ~re

    the RE to match a string to replace

  • getTemplate :: ~s

    the replacement template with ${cap} used to identify a capture (by number or name if one was given) and $$ being used to escape a single $

Instances

Functor (SearchReplace re) 

Methods

fmap :: (a -> b) -> SearchReplace re a -> SearchReplace re b #

(<$) :: a -> SearchReplace re b -> SearchReplace re a #

(Show s, Show re) => Show (SearchReplace re s) 

Methods

showsPrec :: Int -> SearchReplace re s -> ShowS #

show :: SearchReplace re s -> String #

showList :: [SearchReplace re s] -> ShowS #

compileRegex :: (Functor m, Monad m) => String -> m RE Source #

compile a String into a RE with the default options, generating an error if the RE is not well formed

compileRegexWith :: (Functor m, Monad m) => SimpleREOptions -> String -> m RE Source #

compile a String into a RE using the given SimpleREOptions, generating an error if the RE is not well formed

compileRegexWithOptions :: (IsOption o, Functor m, Monad m) => o -> String -> m RE Source #

compile a String into a RE using the given SimpleREOptions, generating an error if the RE is not well formed

compileSearchReplace :: (Monad m, Functor m, IsRegex RE s) => String -> String -> m (SearchReplace RE s) Source #

compile a SearchReplace template generating errors if the RE or the template are not well formed, all capture references being checked

compileSearchReplaceWith :: (Monad m, Functor m, IsRegex RE s) => SimpleREOptions -> String -> String -> m (SearchReplace RE s) Source #

compile a SearchReplace template, with simple options, generating errors if the RE or the template are not well formed, all capture references being checked

compileSearchReplaceWithOptions :: (Monad m, Functor m, IsRegex RE s) => REOptions -> String -> String -> m (SearchReplace RE s) Source #

compile a SearchReplace template, with general options, generating errors if the RE or the template are not well formed, all capture references being checked

escape :: (Functor m, Monad m) => (String -> String) -> String -> m RE Source #

convert a string into a RE that matches that string, and apply it to an argument continuation function to make up the RE string to be compiled; e.g., to compile a RE that will only match the string:

maybe undefined id . escape (("^"++) . (++"$"))

escapeWith :: (Functor m, Monad m) => SimpleREOptions -> (String -> String) -> String -> m RE Source #

a variant of escape where the SimpleREOptions are specified

escapeWithOptions :: (IsOption o, Functor m, Monad m) => o -> (String -> String) -> String -> m RE Source #

a variant of escapeWith that allows an IsOption RE option to be specified

escapeREString :: String -> String #

Convert a string into a regular expression that will match that string

The Classic regex-base Match Operators

(=~) :: (RegexContext Regex s a, RegexMaker Regex CompOption ExecOption s) => s -> RE -> a Source #

the regex-base polymorphic match operator

(=~~) :: (Monad m, RegexContext Regex s a, RegexMaker Regex CompOption ExecOption s) => s -> RE -> m a Source #

the regex-base monadic, polymorphic match operator

The re Quasi Quoters

The [re|.*|] quasi quoters, with variants for specifing different options to the RE compiler (see Text.RE.REOptions), and the specialised back-end types and functions.

re :: QuasiQuoter Source #

[re| ... |], is equivalent to [reMultilineSensitive| ... |], compiling a case-sensitive, multi-line RE

reMultilineSensitive :: QuasiQuoter Source #

[reMultilineSensitive| ... |], compiles a case-sensitive, multi-line RE

reMultilineInsensitive :: QuasiQuoter Source #

[reMultilineInsensitive| ... |], compiles a case-insensitive, multi-line RE

reBlockSensitive :: QuasiQuoter Source #

[reMultilineInsensitive| ... |], compiles a case-sensitive, non-multi-line RE

reBlockInsensitive :: QuasiQuoter Source #

[reMultilineInsensitive| ... |], compiles a case-insensitive, non-multi-line RE

reMS :: QuasiQuoter Source #

[reMS| ... |] is a shorthand for [reMultilineSensitive| ... |]

reMI :: QuasiQuoter Source #

[reMI| ... |] is a shorthand for [reMultilineInsensitive| ... |]

reBS :: QuasiQuoter Source #

[reBS| ... |] is a shorthand for [reBlockSensitive| ... |]

reBI :: QuasiQuoter Source #

[reBI| ... |] is a shorthand for [reBlockInsensitive| ... |]

re_ :: QuasiQuoter Source #

[re_| ... |] compiles a RE to produce a function that takes the RE options (e.g., a SimpleREOptions value) and yields the RE compiled with those options. For example,

countMatches $ s *=~ [re_|[0-9a-f]+|] MultilineInsensitive

counts the number of hexadecimal digit strings in s, allowing upper- or lower-case hex didgits.

The Ed Quasi Quoters

The -- | the [ed| ... /// ... |] quasi quoters; for example,

[ed|${y}([0-9]{4})-0*${m}([0-9]{2})-0*${d}([0-9]{2})///${d}/${m}/${y}|])

represents a SearchReplace that will convert a YYYY-MM-DD format date into a DD/MM/YYYY format date.

The only difference betweem these quasi quoters is the RE options that are set, using the same conventions as the [re| ... |] quasi quoters.

ed :: QuasiQuoter Source #

the [ed| ... /// ... |] quasi quoters

edMultilineSensitive :: QuasiQuoter Source #

the [ed| ... /// ... |] quasi quoters

edMultilineInsensitive :: QuasiQuoter Source #

the [ed| ... /// ... |] quasi quoters

edBlockSensitive :: QuasiQuoter Source #

the [ed| ... /// ... |] quasi quoters

edBlockInsensitive :: QuasiQuoter Source #

the [ed| ... /// ... |] quasi quoters

edMS :: QuasiQuoter Source #

the [ed| ... /// ... |] quasi quoters

edMI :: QuasiQuoter Source #

the [ed| ... /// ... |] quasi quoters

edBS :: QuasiQuoter Source #

the [ed| ... /// ... |] quasi quoters

edBI :: QuasiQuoter Source #

the [ed| ... /// ... |] quasi quoters

ed_ :: QuasiQuoter Source #

the [ed| ... /// ... |] quasi quoters

The cp Quasi Quoters

cp :: QuasiQuoter #

quasi quoter for CaptureID: [cp|0|], [cp|0|], etc., indexing captures by classic positional numbers, and [cp|foo|], etc., referencing a named capture [re| ... ${foo}( ... ) ... |].

RE Macros Standard Environment

prelude :: Macros RE Source #

the standard table of Macros used to compile REs (which can be extended or replace: see Text.RE.TestBench)

preludeEnv :: MacroEnv Source #

the standard MacroEnv for this back end (see Text.RE.TestBench)

preludeTestsFailing :: [MacroID] Source #

the macros in the standard environment that are failing their tests (checked by the test suite to be empty)

preludeTable :: String Source #

a table the standard macros in markdown format

preludeSummary :: PreludeMacro -> String Source #

a summary of the macros in the standard environment for this back end in plain text

preludeSources :: String Source #

a listing of the RE text for each macro in the standard environment with all macros expanded to normal form

preludeSource :: PreludeMacro -> String Source #

the prolude source of a given macro in the standard environment

IsRegex

The IsRegex class is used to abstact over the different regex back ends and the text types they work with -- see Text.RE.Tools.IsRegex for details.

The IsRegex Instances

These module exportss merely provide the IsRegex instances.