unicode-normalization-0.1: Unicode normalization using the ICU library
Text.Unicode.Normalization
Description
This module contains functions to do Unicode normalization of CompactStrings.
Synopsis
data NormalizationMode
= NFD
| NFKD
| NFC
| NFKC
| FCD
normalizationToCInt :: NormalizationMode -> CInt
data NormalizationOption = Unicode3_2
normalize :: CompactString UTF16Native -> NormalizationMode -> [NormalizationOption] -> CompactString UTF16Native
data NormalizationCheckResult
= Normalized
| NotNormalized
| MaybeNormalized
quickCheck :: CompactString UTF16Native -> NormalizationMode -> [NormalizationOption] -> NormalizationCheckResult
isNormalized :: CompactString UTF16Native -> NormalizationMode -> [NormalizationOption] -> Bool
concatenate :: CompactString UTF16Native -> CompactString UTF16Native -> NormalizationMode -> [NormalizationOption] -> CompactString UTF16Native
data ComparisonOption
= InputIsFCD
| IgnoreCase
| CompareCodePointOrder
compare :: CompactString UTF16Native -> CompactString UTF16Native -> [ComparisonOption] -> Ordering
Documentation
data NormalizationMode
A data type for representing an ICU Normalization type. You use this to specify how you'd like ICU to normalize your string.
Constructors
NFD
NFKD
NFC
NFKC
FCD
normalizationToCInt :: NormalizationMode -> CInt
Internal function to convert a NormalizationMode to its C enum value
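As a rough illustration of what such a conversion might look like, here is a minimal sketch. The type is a local mirror of NormalizationMode, modeToCInt is a hypothetical name, and the numeric values are an assumption based on ICU's UNormalizationMode enum (UNORM_NFD = 2 through UNORM_FCD = 6), not taken from this package's source.

```haskell
import Foreign.C.Types (CInt)

-- Local mirror of the NormalizationMode type documented above.
data NormalizationMode = NFD | NFKD | NFC | NFKC | FCD
  deriving (Eq, Show, Enum, Bounded)

-- Hypothetical stand-in for normalizationToCInt. The numbers follow ICU's
-- UNormalizationMode enum; that the binding uses exactly these values is
-- an assumption.
modeToCInt :: NormalizationMode -> CInt
modeToCInt NFD  = 2
modeToCInt NFKD = 3
modeToCInt NFC  = 4
modeToCInt NFKC = 5
modeToCInt FCD  = 6
```

Spelling the cases out, rather than deriving the number from the constructor's position, keeps the mapping correct even if the constructors are later reordered.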
data NormalizationOption

Options to pass to normalize.

Currently there is only one option.

Constructors
Unicode3_2: Normalize according to Unicode 3.2
normalize :: CompactString UTF16Native -> NormalizationMode -> [NormalizationOption] -> CompactString UTF16Native

Normalizes the given string, according to the given normalization type and options.

This function is a higher-level wrapper around raw_normalize.

TODO: Move this to something like Data.CompactString.Normalization eventually.

TODO: Generalize the UErrorCode handling.

data NormalizationCheckResult
A type for the result of a quick normalization check.
Constructors
Normalized
NotNormalized
MaybeNormalized
quickCheck :: CompactString UTF16Native -> NormalizationMode -> [NormalizationOption] -> NormalizationCheckResult

Attempts to check quickly whether a string is already normalized according to a certain normalization mode.

When you get MaybeNormalized as a result, you should normalize the string and compare it to the original to know if it is normalized. You can make ICU do that by calling isNormalized.
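The fallback described above can be sketched as follows. This is a minimal illustration, not the package's implementation: the result type is a local mirror of NormalizationCheckResult, resolveCheck is a hypothetical helper, and its two function arguments stand in for quickCheck and normalize with a mode and options already applied.

```haskell
-- Local mirror of the NormalizationCheckResult type documented above.
data NormalizationCheckResult = Normalized | NotNormalized | MaybeNormalized
  deriving (Eq, Show)

-- Hypothetical helper: resolve a quick-check answer, paying for a full
-- normalization only when the quick check is inconclusive.
--   quick: placeholder for quickCheck with mode and options applied
--   norm:  placeholder for normalize with mode and options applied
resolveCheck :: Eq s => (s -> NormalizationCheckResult) -> (s -> s) -> s -> Bool
resolveCheck quick norm s =
  case quick s of
    Normalized      -> True
    NotNormalized   -> False
    MaybeNormalized -> norm s == s  -- normalized iff normalizing changes nothing
```

Only the MaybeNormalized branch pays for a full normalization pass; the other two results are returned as they are.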

isNormalized :: CompactString UTF16Native -> NormalizationMode -> [NormalizationOption] -> Bool
Tells whether a string is already normalized according to the given mode and options.
concatenate :: CompactString UTF16Native -> CompactString UTF16Native -> NormalizationMode -> [NormalizationOption] -> CompactString UTF16Native

Concatenates two normalized strings, such that the result is also normalized.

More formally: Given that string1 is normalized according to mode and options, and string2 is normalized according to mode and options, the result of concatenate string1 string2 mode options will be a concatenation of string1 and string2 and be normalized according to mode and options.
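To see why a dedicated function is needed, note that plain concatenation of two normalized strings need not be normalized: "cafe" and a lone combining acute accent (U+0301) are each in NFC, but joined they should compose to "café". The toy model below illustrates this on ordinary Haskell Strings. It is only a sketch: composePair, toyNFC and toyConcatenate are hypothetical names, the composition table covers two pairs instead of the full Unicode data, and the real function works on CompactString via ICU.

```haskell
-- Toy composition table; real NFC uses the full Unicode character database.
composePair :: Char -> Char -> Maybe Char
composePair 'e' '\x0301' = Just '\x00E9'  -- e + combining acute -> e-acute
composePair 'a' '\x0300' = Just '\x00E0'  -- a + combining grave -> a-grave
composePair _   _        = Nothing

-- Toy stand-in for NFC: compose adjacent pairs found in the table.
toyNFC :: String -> String
toyNFC (x : y : rest) =
  case composePair x y of
    Just c  -> toyNFC (c : rest)
    Nothing -> x : toyNFC (y : rest)
toyNFC s = s

-- Toy stand-in for concatenate: join, then repair the result.
-- (++) alone would leave 'e' and U+0301 uncomposed at the boundary.
toyConcatenate :: String -> String -> String
toyConcatenate a b = toyNFC (a ++ b)
```

In this model, toyConcatenate "cafe" "\x0301" yields "caf\x00E9", whereas "cafe" ++ "\x0301" keeps the five original characters and is therefore not in the toy normal form.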

data ComparisonOption
A data type to encode options to the compare function.
Constructors
InputIsFCD: Assume that both strings are FCD normalized
IgnoreCase: Do case-insensitive comparison
CompareCodePointOrder: Compare by code point order (default is code unit order)
compare :: CompactString UTF16Native -> CompactString UTF16Native -> [ComparisonOption] -> Ordering

Compares two Unicode strings for canonical equivalence.

Two Unicode strings are canonically equivalent when their NFD normalizations are equal (equivalently, when their NFC normalizations are equal).
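A binding with this signature has to turn the option list into the bit mask that the C comparison function expects, and turn the C integer result back into an Ordering. The sketch below shows one plausible way to do both. It is an assumption about the approach, not the package's code: the type is a local mirror of ComparisonOption, optionBit, optionsToMask and resultToOrdering are hypothetical names, and the flag values are assumed from ICU's headers (UNORM_INPUT_IS_FCD, U_COMPARE_IGNORE_CASE, U_COMPARE_CODE_POINT_ORDER).

```haskell
import Data.Bits ((.|.))
import Data.Word (Word32)
import Foreign.C.Types (CInt)

-- Local mirror of the ComparisonOption type documented above.
data ComparisonOption = InputIsFCD | IgnoreCase | CompareCodePointOrder
  deriving (Eq, Show)

-- Assumed flag values, following ICU's headers.
optionBit :: ComparisonOption -> Word32
optionBit InputIsFCD            = 0x20000
optionBit IgnoreCase            = 0x10000
optionBit CompareCodePointOrder = 0x8000

-- Combine a list of options into one bit mask; duplicates are harmless.
optionsToMask :: [ComparisonOption] -> Word32
optionsToMask = foldr ((.|.) . optionBit) 0

-- A C-style comparison result (negative, zero, positive) as an Ordering.
-- Here compare is the Prelude function, not the one documented above.
resultToOrdering :: CInt -> Ordering
resultToOrdering r = compare r 0
```

Because this module exports a function named compare, callers will typically want to import it qualified to avoid a clash with the Prelude.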

Produced by Haddock version 2.4.2