re2-0.1: Bindings to the re2 regular expression library

LicenseMIT
Safe HaskellNone
LanguageHaskell98

Regex.RE2

Contents

Description

re2 is a regular expression library offering predictable run-time and memory consumption. This package is a binding to re2.

Supported expression syntax is documented at http://code.google.com/p/re2/wiki/Syntax.

$ ghci -XOverloadedStrings
ghci> import Regex.RE2

ghci> find "\\w+" "hello world"
Just (Match [Just "hello"])

ghci> find "\\w+$" "hello world"
Just (Match [Just "world"])

ghci> find "^\\w+$" "hello world"
Nothing

Synopsis

Compiling patterns

data Pattern Source

A pattern is a compiled regular expression plus its compilation options.

Patterns can be created by calling compile explicitly:

import Data.ByteString.Char8 (pack)

p :: Pattern
p = case compile (pack "^hello world$") of
        Right ok -> ok
        Left err -> error ("compilation error: " ++ errorMessage err)
 

Or by using the IsString instance:

import Data.String (fromString)

p :: Pattern
p = fromString "^hello world$"

Or by using the OverloadedStrings language extension:

{-# LANGUAGE OverloadedStrings #-}

p :: Pattern
p = "^hello world$"

compileWith :: Options -> ByteString -> Either Error Pattern Source

Compile a regular expression with the given options. If compilation fails, the error can be inspected with errorMessage and errorCode.

Use optionEncoding to select whether the input bytes should be interpreted as UTF-8 or Latin1. The default is UTF8.

Pattern properties

patternInput :: Pattern -> ByteString Source

The regular expression originally provided to compileWith.

patternOptions :: Pattern -> Options Source

The options originally provided to compileWith.

patternGroups :: Pattern -> Vector (Maybe ByteString) Source

The capturing groups defined within the pattern. Groups are listed from left to right, and are Nothing if the group is unnamed.

ghci> patternGroups "(\\d+)|(?P<word>\\w+)"
fromList [Nothing,Just "word"]

Options

data Options Source

Options controlling how to compile a regular expression. The fields in this value may be set using record syntax:

compileNoCase :: B.ByteString -> Either Error Pattern
compileNoCase = compileWith (defaultOptions { optionCaseSensitive = False })
 

Instances

defaultOptions :: Options Source

defaultOptions = Options
        { optionEncoding = EncodingUtf8
        , optionPosixSyntax = False
        , optionLongestMatch = False
        , optionMaxMemory = 8388608  -- 8 << 20
        , optionLiteral = False
        , optionNeverNewline = False
        , optionDotNewline = False
        , optionNeverCapture = False
        , optionCaseSensitive = True
        , optionPerlClasses = False
        , optionWordBoundary = False
        , optionOneLine = False
        }
 

optionPerlClasses :: Options -> Bool Source

Only checked in posix mode

optionWordBoundary :: Options -> Bool Source

Only checked in posix mode

optionOneLine :: Options -> Bool Source

Only checked in posix mode

Compilation errors

data Error Source

Instances

Matching

data Match Source

A successful match of the pattern against some input. Capturing groups may be retrieved with matchGroup or matchGroups.

Instances

matchGroup :: Match -> Int -> Maybe ByteString Source

The capturing group with the given index, or Nothing if the group was not set in this match.

The entire match is group 0.

matchGroups :: Match -> Vector (Maybe ByteString) Source

All of the groups in the pattern, with each group being Nothing if it was not set in this match. Groups are returned in the same order as patternGroups.

The entire match is group 0.

data Anchor Source

Constructors

AnchorStart 
AnchorBoth 

Instances

match Source

Arguments

:: Pattern 
-> ByteString 
-> Int

Start position

-> Int

End position

-> Maybe Anchor 
-> Int

How many match groups to populate

-> Maybe Match 

The most general matching function. Attempt to match the pattern to the input within the given constraints.

If the number of match groups to populate is 0, matching can be performed more efficiently.

Searching

find :: Pattern -> ByteString -> Maybe Match Source

Attempt to find the pattern somewhere within the input.

Replacing

replace Source

Arguments

:: Pattern 
-> ByteString

Input

-> ByteString

Replacement template

-> (ByteString, Bool) 

Replace the first occurance of the pattern with the given replacement template. If the template contains backslash escapes such as \1, the capture group with the given index will be inserted in their place.

Returns the new bytes, and True if a replacement occured.

replaceAll Source

Arguments

:: Pattern 
-> ByteString

Input

-> ByteString

Replacement template

-> (ByteString, Int) 

Replace every occurance of the pattern with the given replacement template. If the template contains backslash escapes such as \1, the capture group with the given index will be inserted in their place.

Returns the new bytes, and how many replacements occured.

extract Source

Arguments

:: Pattern 
-> ByteString

Input

-> ByteString

Extraction template

-> Maybe ByteString 

Attempt to find the pattern somewhere within the input, and extract it using the given template. If the template contains backslash escapes such as \1, the capture group with the given index will be inserted in their place.

Returns Nothing if the pattern was not found in the input.

Utility functions

quoteMeta :: ByteString -> ByteString Source

Escapes bytes such that the output is a regular expression which will exactly match the input.