Copyright | (c) 2020 Composewell Technologies and Contributors |
---|---|
License | Apache-2.0 |
Maintainer | streamly@composewell.com |
Stability | experimental |
Safe Haskell | None |
Language | Haskell2010 |
General character property related functions.
Synopsis
- isLetter :: Char -> Bool
- isSpace :: Char -> Bool
- isJamo :: Char -> Bool
- jamoNCount :: Int
- jamoLFirst :: Int
- jamoLIndex :: Char -> Maybe Int
- jamoLLast :: Int
- jamoVFirst :: Int
- jamoVCount :: Int
- jamoVIndex :: Char -> Maybe Int
- jamoVLast :: Int
- jamoTFirst :: Int
- jamoTCount :: Int
- jamoTIndex :: Char -> Maybe Int
- jamoTLast :: Int
- hangulFirst :: Int
- hangulLast :: Int
- isHangul :: Char -> Bool
- isHangulLV :: Char -> Bool
Character Properties
isLetter :: Char -> Bool
Returns True for alphabetic Unicode characters (lower-case, upper-case and title-case letters, plus letters of caseless scripts and modifier letters).
isLetter c == Data.Char.isLetter c
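To illustrate the equivalence above, the following sketch spells out the general categories that Data.Char.isLetter accepts in base. The names isLetterByCategory and agreesOnLetter are hypothetical helpers for this example and are not part of this module.

```haskell
import Data.Char (GeneralCategory (..), generalCategory, isLetter)

-- Hypothetical helper: classify a character as a letter by inspecting
-- its Unicode general category (Lu, Ll, Lt, Lm, Lo).
isLetterByCategory :: Char -> Bool
isLetterByCategory c = case generalCategory c of
  UppercaseLetter -> True
  LowercaseLetter -> True
  TitlecaseLetter -> True
  ModifierLetter  -> True
  OtherLetter     -> True
  _               -> False

-- Hypothetical check: does the sketch agree with Data.Char.isLetter?
agreesOnLetter :: Char -> Bool
agreesOnLetter c = isLetterByCategory c == isLetter c
```

Mapping agreesOnLetter over a range of characters is a quick way to compare the two predicates.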
isSpace :: Char -> Bool
Returns True for any Unicode space character, and the control characters \t, \n, \r, \f, \v.
isSpace c == Data.Char.isSpace c
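The description above can be sketched as a check on the five listed control characters plus the Space general category. The names isSpaceSketch and agreesOnSpace are hypothetical and only serve this example; the module's own implementation may differ.

```haskell
import Data.Char (GeneralCategory (Space), generalCategory, isSpace)

-- Hypothetical helper: the five control characters named in the
-- documentation, or any character whose general category is Space (Zs).
isSpaceSketch :: Char -> Bool
isSpaceSketch c = c `elem` "\t\n\r\f\v" || generalCategory c == Space

-- Hypothetical check: does the sketch agree with Data.Char.isSpace?
agreesOnSpace :: Char -> Bool
agreesOnSpace c = isSpaceSketch c == isSpace c
```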
Korean Hangul Characters
The Hangul script used in the Korean writing system consists of individual consonant and vowel letters (jamo) that are visually combined into square display cells to form entire syllable blocks. Hangul syllables may be encoded directly as precomposed combinations of individual jamo or as decomposed sequences of conjoining jamo. Modern Hangul syllable blocks can be expressed with either two or three jamo, either in the form consonant + vowel or in the form consonant + vowel + consonant. The leading consonant is represented as L, the vowel as V and the trailing consonant as T.
The Unicode Standard contains both a large set of precomposed modern Hangul syllables and a set of conjoining Hangul jamo, which can be used to encode archaic Korean syllable blocks as well as modern Korean syllable blocks.
Hangul characters can be composed or decomposed algorithmically instead of via mappings. These APIs are used mainly for Unicode normalization of Hangul text.
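As a rough illustration of what "algorithmically" means here, the sketch below follows the arithmetic given in the Conformance chapter of the Unicode Standard. It is self-contained and does not call this module: the constants (sBase, lBase, vBase, tBase and the counts) use the Standard's names and values, and decomposeHangul, composeHangul and indexIn are hypothetical names chosen for this example.

```haskell
import Data.Char (chr, ord)

-- Constants as given in the Unicode Standard (assumed, not read from this module).
sBase, lBase, vBase, tBase :: Int
sBase = 0xAC00  -- first precomposed syllable
lBase = 0x1100  -- first leading consonant
vBase = 0x1161  -- first vowel
tBase = 0x11A7  -- one before the first trailing consonant

lCount, vCount, tCount, nCount, sCount :: Int
lCount = 19
vCount = 21
tCount = 28
nCount = vCount * tCount
sCount = lCount * nCount

-- Split a precomposed syllable into L, V and (optionally) T jamo.
decomposeHangul :: Char -> Maybe [Char]
decomposeHangul c
  | sIndex < 0 || sIndex >= sCount = Nothing
  | tIndex == 0 = Just [l, v]                       -- LV syllable
  | otherwise = Just [l, v, chr (tBase + tIndex)]   -- LVT syllable
  where
    sIndex = ord c - sBase
    l = chr (lBase + sIndex `div` nCount)
    v = chr (vBase + (sIndex `mod` nCount) `div` tCount)
    tIndex = sIndex `mod` tCount

-- Offset of a character within a block of `count` code points.
indexIn :: Int -> Int -> Char -> Maybe Int
indexIn base count c
  | i >= 0 && i < count = Just i
  | otherwise = Nothing
  where
    i = ord c - base

-- Combine L, V and an optional T into a precomposed syllable.
composeHangul :: Char -> Char -> Maybe Char -> Maybe Char
composeHangul l v mt = do
  li <- indexIn lBase lCount l
  vi <- indexIn vBase vCount v
  ti <- case mt of
    Nothing -> Just 0
    Just t -> do
      i <- indexIn tBase tCount t
      if i == 0 then Nothing else Just i  -- offset 0 means "no T"
  Just (chr (sBase + (li * vCount + vi) * tCount + ti))
```

For example, decomposeHangul '\xD55C' yields the three jamo U+1112, U+1161 and U+11AB, and composeHangul applied to those three returns U+D55C again.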
Please refer to the following resources for more information:
- The Hangul section of the East Asia chapter of the Unicode Standard
- The Conformance chapter of the Unicode Standard
- Unicode® Standard Annex #15 - Unicode Normalization Forms
- UCD file HangulSyllableType.txt
- https://en.wikipedia.org/wiki/Hangul_Jamo_(Unicode_block)
- https://en.wikipedia.org/wiki/List_of_Hangul_jamo
Conjoining Jamo
Jamo L, V and T letters.
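jamoNCount :: Int
Number of vowel (V) and trailing (T) combinations for each leading consonant, that is, the number of precomposed syllables that share one leading jamo.
jamoNCount = jamoVCount * jamoTCount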
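A worked version of this arithmetic, using the counts given in the Unicode Standard (19 leading, 21 vowel, 28 trailing including the "no T" slot). The names below are local to the example and are assumptions; they are not read from this module.

```haskell
-- Counts as given in the Unicode Standard (assumed to match this module).
lCount, vCount, tCount :: Int
lCount = 19
vCount = 21
tCount = 28  -- includes the "no trailing consonant" slot

-- Combinations of V and T per leading consonant: 21 * 28 = 588.
nCount :: Int
nCount = vCount * tCount

-- Total precomposed syllables: 19 * 588 = 11172.
sCount :: Int
sCount = lCount * nCount

-- Code point range of the precomposed syllables: U+AC00 .. U+D7A3.
firstSyllable, lastSyllable :: Int
firstSyllable = 0xAC00
lastSyllable = firstSyllable + sCount - 1
```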
Jamo Leading (L)
jamoLFirst :: Int
First leading consonant jamo.
jamoLIndex :: Char -> Maybe Int
Given a Unicode character, if it is a leading jamo, return its index in the list of leading jamo consonants, otherwise return Nothing.
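A minimal sketch of such an index lookup, assuming the leading consonants occupy the 19 code points starting at U+1100 as in the Unicode Standard. The names lFirst, lCount and lIndexSketch are hypothetical; this is not the module's own definition.

```haskell
import Data.Char (ord)

-- Assumed values from the Unicode Standard.
lFirst, lCount :: Int
lFirst = 0x1100
lCount = 19

-- Hypothetical helper: offset of a character within the leading
-- consonant block, or Nothing if it falls outside the block.
lIndexSketch :: Char -> Maybe Int
lIndexSketch c
  | i >= 0 && i < lCount = Just i
  | otherwise = Nothing
  where
    i = ord c - lFirst
```

Under these assumptions, U+1100 maps to Just 0, U+1112 to Just 18, and any non-jamo character to Nothing.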
Jamo Vowel (V)
jamoVFirst :: Int
First vowel jamo.
jamoVCount :: Int
Total count of vowel jamo.
jamoVIndex :: Char -> Maybe Int
Given a Unicode character, if it is a vowel jamo, return its index in the list of vowel jamo, otherwise return Nothing.
Jamo Trailing (T)
jamoTFirst :: Int
The first trailing consonant jamo.
Note that jamoTFirst does not represent a valid T; it represents a missing T, i.e. an LV syllable without a T. See comments under jamoTIndex.
jamoTCount :: Int
Total count of trailing consonant jamo.
jamoTIndex :: Char -> Maybe Int
Given a Unicode character, if it is a trailing jamo consonant, return its index in the list of trailing jamo consonants, otherwise return Nothing.
Note that index 0 is not a valid index for a trailing consonant. Index 0 corresponds to an LV syllable, without a T. See "Hangul Syllable Decomposition" in the Conformance chapter of the Unicode standard for more details.
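One way to honor the note about index 0 is sketched below. It assumes the Unicode Standard's values (base U+11A7, 28 slots) and deliberately rejects offset 0, so only real trailing consonants receive an index. Whether jamoTIndex itself returns Nothing or an index for U+11A7 is not stated here, so treat this behavior as an assumption of the example; tFirst, tCount and tIndexSketch are hypothetical names.

```haskell
import Data.Char (ord)

-- Assumed values from the Unicode Standard.
tFirst, tCount :: Int
tFirst = 0x11A7  -- placeholder slot meaning "no trailing consonant"
tCount = 28      -- 27 real trailing consonants plus the empty slot

-- Hypothetical helper: index of a trailing consonant in 1 .. tCount - 1.
-- Offset 0 is rejected because it stands for an LV syllable without a T.
tIndexSketch :: Char -> Maybe Int
tIndexSketch c
  | i >= 1 && i < tCount = Just i
  | otherwise = Nothing
  where
    i = ord c - tFirst
```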
Hangul Syllables
Precomposed Hangul syllables.
hangulFirst :: Int
Codepoint of the first pre-composed Hangul character.
hangulLast :: Int
Codepoint of the last pre-composed Hangul character.
isHangulLV :: Char -> Bool
Determine if the given character is a Hangul LV syllable.
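A sketch of how such a test can be derived from the syllable arithmetic in the Unicode Standard: a precomposed syllable is an LV syllable exactly when its offset from U+AC00 is a multiple of the trailing count (28). The names sBase, sCount, tCount and isLVSketch are hypothetical and local to this example.

```haskell
import Data.Char (ord)

-- Assumed values from the Unicode Standard.
sBase, sCount, tCount :: Int
sBase = 0xAC00
sCount = 11172
tCount = 28

-- Hypothetical helper: True for precomposed syllables with no trailing
-- consonant, i.e. those whose trailing index is 0.
isLVSketch :: Char -> Bool
isLVSketch c = i >= 0 && i < sCount && i `mod` tCount == 0
  where
    i = ord c - sBase
```

Under these assumptions U+AC00 (an LV syllable) gives True, while U+AC01 and U+D55C (both LVT syllables) give False.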