unicode-data-0.3.1: Access Unicode Character Database (UCD)
Copyright: (c) 2020 Composewell Technologies and Contributors
License: Apache-2.0
Maintainer: streamly@composewell.com
Stability: experimental
Safe Haskell: Safe-Inferred
Language: Haskell2010

Unicode.Char

Description

This module provides APIs to access the Unicode character database (UCD) corresponding to Unicode Standard version 14.0.0.

This module re-exports several sub-modules under it. The sub-module structure under Char is largely based on the "Property Index by Scope of Use" in Unicode® Standard Annex #44.

The Unicode.Char.* modules in turn depend on Unicode.Internal.Char.* modules which are programmatically generated from the Unicode standard's Unicode character database files. The module structure under Unicode.Internal.Char is largely based on the UCD text file names from which the properties are generated.

For the original UCD files used in this code, refer to the UCD section on the Unicode standard page. See https://www.unicode.org/reports/tr44/ for the contents and format of the Unicode database files.

Documentation

isAlpha :: Char -> Bool

Same as isLetter.

Since: 0.3.0

isLowerCase :: Char -> Bool

Returns True for lower-case characters.

It uses the character property Lowercase.

Since: 0.3.0

isLower :: Char -> Bool

Deprecated: Use isLowerCase instead. Note that the behavior of this function does not match base:Data.Char.isLower. See Unicode.Char.Case.Compat for behavior compatible with base:Data.Char.

Returns True for lower-case characters.

It uses the character property Lowercase.

Since: 0.1.0

isUpperCase :: Char -> Bool

Returns True for upper-case characters.

It uses the character property Uppercase.

Note: it does not match title-cased letters. Those are matched using: generalCategory c == TitlecaseLetter.

Since: 0.3.0
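
The note above implies that a character can be neither upper-case nor lower-case and still be cased. A minimal sketch of a three-way classification follows. The Casing type and classifyCase are hypothetical helpers, not part of unicode-data, and the sketch assumes that GeneralCategory and generalCategory are re-exported from Unicode.Char.

```haskell
import Unicode.Char
    (GeneralCategory (..), generalCategory, isLowerCase, isUpperCase)

-- | Hypothetical helper type; not provided by unicode-data.
data Casing = Upper | Title | Lower | Uncased
    deriving (Eq, Show)

-- | Classify a character by case. Title-cased letters need a separate
-- general category check because isUpperCase does not match them.
classifyCase :: Char -> Casing
classifyCase c
    | isUpperCase c = Upper
    | generalCategory c == TitlecaseLetter = Title
    | isLowerCase c = Lower
    | otherwise = Uncased
```

For instance, “ǅ” (U+01C5 Latin capital letter D with small letter Z with caron) is a title-cased letter, so classifyCase is expected to report Title for it rather than Upper.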

isUpper :: Char -> Bool

Deprecated: Use isUpperCase instead. Note that the behavior of this function does not match base:Data.Char.isUpper. See Unicode.Char.Case.Compat for behavior compatible with base:Data.Char.

Returns True for upper-case characters.

It uses the character property Uppercase.

Note: it does not match title-cased letters. Those are matched using: generalCategory c == TitlecaseLetter.

Since: 0.1.0

caseFoldMapping :: Unfold Char Char

Unfolds the full case folding of a character. The resulting stream is empty if case folding leaves the character unchanged.

It uses the character property Case_Folding.

Since: 0.3.1

lowerCaseMapping :: Unfold Char Char

Unfolds the full lower case mapping of a character. The resulting stream is empty if the mapping leaves the character unchanged.

It uses the character property Lowercase_Mapping.

Since: 0.3.1

titleCaseMapping :: Unfold Char Char

Unfolds the full title case mapping of a character. The resulting stream is empty if the mapping leaves the character unchanged.

It uses the character property Titlecase_Mapping.

Since: 0.3.1

upperCaseMapping :: Unfold Char Char

Unfolds the full upper case mapping of a character. The resulting stream is empty if the mapping leaves the character unchanged.

It uses the character property Uppercase_Mapping.

Since: 0.3.1

toCaseFoldString :: Char -> String

Convert a character to full folded case if defined, else to itself.

This function is mainly useful for performing caseless (also known as case insensitive) string comparisons.

A string x is a caseless match for a string y if and only if:

foldMap toCaseFoldString x == foldMap toCaseFoldString y

The result string may have more than one character, and may differ from applying toLowerString to the input string. For instance, “ﬓ” (U+FB13 Armenian small ligature men now) is case folded to the sequence “մ” (U+0574 Armenian small letter men) followed by “ն” (U+0576 Armenian small letter now), while “µ” (U+00B5 micro sign) is case folded to “μ” (U+03BC Greek small letter mu) instead of itself.

It uses the character property Case_Folding.

toCaseFoldString c == foldMap toCaseFoldString (toCaseFoldString c)

Since: 0.3.1
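
The caseless match defined above can be wrapped in a small helper. This is a sketch only: caselessEq is a hypothetical name, not part of unicode-data, and it performs no Unicode normalization, so canonically equivalent strings in different normalization forms may still compare unequal.

```haskell
import Unicode.Char (toCaseFoldString)

-- | Hypothetical helper: caseless string equality based on full case
-- folding. Each character is folded to a (possibly longer) string and
-- the folded strings are compared.
caselessEq :: String -> String -> Bool
caselessEq x y = foldMap toCaseFoldString x == foldMap toCaseFoldString y
```

Because the folding is full rather than simple, the two arguments may have different lengths and still match, as with a string containing “ﬓ” and one containing “մն”.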

toLowerString :: Char -> String

Convert a character to full lower case if defined, else to itself.

The result string may have more than one character. For instance, “İ” (U+0130 Latin capital letter I with dot above) maps to the sequence: “i” (U+0069 Latin small letter I) followed by “ ̇” (U+0307 combining dot above).

It uses the character property Lowercase_Mapping.

See: toLower for simple lower case conversion.

toLowerString c == foldMap toLowerString (toLowerString c)

Since: 0.3.1

toTitleString :: Char -> String

Convert a character to full title case if defined, else to itself.

The result string may have more than one character. For instance, “fl” (U+FB02 Latin small ligature FL) is converted to the sequence: “F” (U+0046 Latin capital letter F) followed by “l” (U+006C Latin small letter L).

It uses the character property Titlecase_Mapping.

See: toTitle for simple title case conversion.

Since: 0.3.1
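
As an illustration of how toTitleString combines with toLowerString, the sketch below title-cases a single word. titleCaseWord is a hypothetical helper, not part of unicode-data. It assumes the input is already one word and ignores word segmentation and locale-specific rules.

```haskell
import Unicode.Char (toLowerString, toTitleString)

-- | Hypothetical helper: full title case for the first character,
-- full lower case for the rest. The result may be longer than the
-- input because both mappings can expand a character.
titleCaseWord :: String -> String
titleCaseWord [] = []
titleCaseWord (c : cs) = toTitleString c ++ concatMap toLowerString cs
```

Given the mapping described above, a word that starts with “ﬂ” (U+FB02) comes out starting with “Fl”.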

toUpperString :: Char -> String

Convert a character to full upper case if defined, else to itself.

The result string may have more than one character. For instance, the German “ß” (U+00DF Latin small letter sharp S, also called Eszett) maps to the two-letter sequence “SS”.

It uses the character property Uppercase_Mapping.

See: toUpper for simple upper case conversion.

toUpperString c == foldMap toUpperString (toUpperString c)

Since: 0.3.1
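
To make the difference between full and simple upper case conversion concrete, the following sketch lifts both to whole strings. simpleUpper and fullUpper are hypothetical names, not part of unicode-data.

```haskell
import Unicode.Char (toUpper, toUpperString)

-- | Hypothetical helper: simple conversion, always length-preserving.
simpleUpper :: String -> String
simpleUpper = map toUpper

-- | Hypothetical helper: full conversion, which may grow the string
-- because a single character can map to several.
fullUpper :: String -> String
fullUpper = concatMap toUpperString
```

With the “ß” example above, fullUpper expands the character to “SS”, whereas simpleUpper leaves it unchanged because “ß” has no single-character upper-case mapping.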

toUpper :: Char -> Char

Convert a letter to the corresponding upper-case letter, if any. Any other character is returned unchanged.

It uses the character property Simple_Uppercase_Mapping.

See: upperCaseMapping and toUpperString for full upper case conversion.

toUpper c == Data.Char.toUpper c

Since: 0.3.0

toLower :: Char -> Char

Convert a letter to the corresponding lower-case letter, if any. Any other character is returned unchanged.

It uses the character property Simple_Lowercase_Mapping.

See: lowerCaseMapping and toLowerString for full lower case conversion.

toLower c == Data.Char.toLower c

Since: 0.3.0

toTitle :: Char -> Char

Convert a letter to the corresponding title-case or upper-case letter, if any. (Title case differs from upper case only for a small number of ligature letters.) Any other character is returned unchanged.

It uses the character property Simple_Titlecase_Mapping.

See: titleCaseMapping and toTitleString for full title case conversion.

toTitle c == Data.Char.toTitle c

Since: 0.3.0

unicodeVersion :: Version

Version of the Unicode Standard used by unicode-data.

Since: 0.3.0
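
A small sketch of how the version might be reported, for example in diagnostic output. unicodeBanner is a hypothetical name, and the sketch assumes Version is the type from Data.Version in base.

```haskell
import Data.Version (showVersion)
import Unicode.Char (unicodeVersion)

-- | Hypothetical helper: a human-readable label for the Unicode
-- version that the character tables were generated from.
unicodeBanner :: String
unicodeBanner = "Unicode " ++ showVersion unicodeVersion
```

With unicode-data-0.3.1 this should read “Unicode 14.0.0”, in line with the version given in the module description.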

Re-export from base

ord :: Char -> Int

The fromEnum method restricted to the type Char.

chr :: Int -> Char

The toEnum method restricted to the type Char.