Safe Haskell | Safe-Inferred |
---|---|
Language | GHC2021 |
Synopsis
- selectTokenizerByContent :: Text -> Tokenizer
- tokenize :: Tokenizer
- fromDevanagari :: Tokenizer
- fromIso :: Tokenizer
- fromHarvard :: Tokenizer
- fromIast :: Tokenizer
- type Tokenizer = Text -> Seq DevanagariToken
Documentation
selectTokenizerByContent :: Text -> Tokenizer
Select the tokenizer that matches the content of the input text.
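The selection rules themselves are not documented here. As a rough illustration of what content-based selection could look like, the sketch below classifies an input by the characters it contains. The `Script` type, `guessScript`, and the helper predicates are hypothetical names, not part of this module, and the heuristic (Devanagari code points, then letters used by ISO 15919 but not by IAST, then any other non-ASCII character, then plain ASCII) is an assumption rather than the library's actual logic.

```haskell
import Data.Text (Text)
import qualified Data.Text as T

-- Hypothetical label for the detected input encoding (not part of this module).
data Script = Devanagari | Iso15919 | Iast | HarvardKyoto
  deriving (Eq, Show)

-- Code points of the Unicode Devanagari block, U+0900 to U+097F.
isDevanagariChar :: Char -> Bool
isDevanagariChar c = c >= '\x0900' && c <= '\x097F'

-- Characters that ISO 15919 uses and IAST does not:
-- e with macron, o with macron, m with dot above, combining ring below.
isIsoOnlyChar :: Char -> Bool
isIsoOnlyChar c = c `elem` ['\x0113', '\x014D', '\x1E41', '\x0325']

-- Assumed heuristic: check the most specific evidence first,
-- and fall back to Harvard-Kyoto for pure ASCII input.
guessScript :: Text -> Script
guessScript t
  | T.any isDevanagariChar t = Devanagari
  | T.any isIsoOnlyChar t    = Iso15919
  | T.any (> '\x7F') t       = Iast
  | otherwise                = HarvardKyoto
```

A real selector would then map each detected case to `fromDevanagari`, `fromIso`, `fromIast`, or `fromHarvard`. Note that this heuristic cannot tell IAST from ISO 15919 when the input contains none of the distinguishing letters.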
tokenize :: Tokenizer
Tokenize a Text into a sequence of DevanagariToken values. The tokenizer is selected based on the content of the input and then applied to that same input.
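The select-then-apply behaviour described above amounts to passing the input twice: once to choose a tokenizer and once to run it. The sketch below shows that shape with stand-in definitions. `Token`, `selectSketch`, `tokenizeSketch`, and the two placeholder tokenizers are hypothetical and only mimic the documented types; the real `DevanagariToken` type and the real tokenizers are defined by the library.

```haskell
import Data.Char (isAscii)
import Data.Sequence (Seq)
import qualified Data.Sequence as Seq
import Data.Text (Text)
import qualified Data.Text as T

-- Stand-in for DevanagariToken; tags each character with the tokenizer that produced it.
data Token = AsciiTok Char | OtherTok Char
  deriving (Eq, Show)

-- Same shape as the documented Tokenizer type, using the stand-in token.
type Tokenizer = Text -> Seq Token

-- Placeholder tokenizers: one token per character.
asciiTokenizer, otherTokenizer :: Tokenizer
asciiTokenizer = Seq.fromList . map AsciiTok . T.unpack
otherTokenizer = Seq.fromList . map OtherTok . T.unpack

-- Stand-in for selectTokenizerByContent: inspect the input, return a tokenizer.
selectSketch :: Text -> Tokenizer
selectSketch t
  | T.all isAscii t = asciiTokenizer
  | otherwise       = otherTokenizer

-- Stand-in for tokenize: the input both selects the tokenizer and is fed to it.
tokenizeSketch :: Tokenizer
tokenizeSketch input = selectSketch input input
```

Because the selector returns an ordinary function, callers who already know the encoding of their input can skip detection and call a specific tokenizer directly.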
fromDevanagari :: Tokenizer
A tokenizer that parses a Text written in Devanagari script into a Seq of DevanagariToken values.
fromIso :: Tokenizer
A tokenizer that parses a Text containing an ISO 15919 transliteration of Devanagari script into a Seq of DevanagariToken values.
fromHarvard :: Tokenizer
A tokenizer that parses a Text containing a Harvard-Kyoto transliteration of Devanagari script into a Seq of DevanagariToken values.