Safe Haskell	None
Language	Haskell2010

Data.Text.Lazy.Manipulate

Contents

Strict vs lazy types
Unicode
Fusion
Subwords
- Removing words
- Breaking on words
Character manipulation
Line manipulation
Ellipsis
Acronyms
Ordinals
Casing
Boundary predicates

Description

Manipulate identifiers and structurally non-complex pieces of text by delimiting word boundaries via a combination of whitespace, control-characters, and case-sensitivity.

Assumptions have been made about word boundary characteristics inherent in predominantly English text. Please see the individual function documentation for further details and behaviour.

Synopsis

takeWord :: Text -> Text
dropWord :: Text -> Text
stripWord :: Text -> Maybe Text
breakWord :: Text -> (Text, Text)
splitWords :: Text -> [Text]
lowerHead :: Text -> Text
upperHead :: Text -> Text
mapHead :: (Char -> Char) -> Text -> Text
indentLines :: Int -> Text -> Text
prependLines :: Text -> Text -> Text
toEllipsis :: Int64 -> Text -> Text
toEllipsisWith :: Int64 -> Text -> Text -> Text
toAcronym :: Text -> Maybe Text
toOrdinal :: Integral a => a -> Text
toTitle :: Text -> Text
toCamel :: Text -> Text
toPascal :: Text -> Text
toSnake :: Text -> Text
toSpinal :: Text -> Text
toTrain :: Text -> Text
isBoundary :: Char -> Bool
isWordBoundary :: Char -> Bool

Strict vs lazy types

This library provides functions for manipulating both strict and lazy Text types. The strict functions are provided by the Data.Text.Manipulate module, while the lazy functions are provided by the Data.Text.Lazy.Manipulate module.
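When most of a program works with strict Text, a lazy function can still be applied by converting at the edges. The sketch below relies only on fromStrict and toStrict from Data.Text.Lazy. The names viaLazy and shout are hypothetical helpers, not part of this library, and TL.toUpper merely stands in for any lazy Text -> Text function from this module.

```haskell
import qualified Data.Text as T
import qualified Data.Text.Lazy as TL

-- Hypothetical helper (not part of this library): run a lazy Text
-- transformation over a strict Text value by converting at the edges.
viaLazy :: (TL.Text -> TL.Text) -> T.Text -> T.Text
viaLazy f = TL.toStrict . f . TL.fromStrict

-- TL.toUpper is only a stand-in for a function from this module.
shout :: T.Text -> T.Text
shout = viaLazy TL.toUpper
```

Because toStrict copies the result into a single buffer, code that stays entirely in strict Text is probably better served by importing Data.Text.Manipulate directly.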

Unicode

While this library supports Unicode in a similar fashion to the underlying text library, more explicit, Unicode-specific handling of word boundaries can be found in the text-icu library.

Fusion

Many functions in this module are subject to fusion, meaning that a pipeline of such functions will usually allocate at most one Text value.

Functions that can be fused by the compiler are documented with the phrase Subject to fusion.

Subwords

Removing words

takeWord :: Text -> Text

O(n) Returns the first word, or the original text if no word boundary is encountered. Subject to fusion.

dropWord :: Text -> Text

O(n) Return the suffix after dropping the first word. If no word boundary is encountered, the result will be empty. Subject to fusion.

stripWord :: Text -> Maybe Text

O(n) Return the suffix after removing the first word, or Nothing if no word boundary is encountered.

>>> stripWord "HTML5Spaghetti"
Just "Spaghetti"

>>> stripWord "noboundaries"
Nothing

Breaking on words

breakWord :: Text -> (Text, Text)

Break a piece of text at the first word boundary encountered, returning the first word and the remainder.

>>> breakWord "PascalCasedVariable"
("Pascal", "CasedVariable")

>>> breakWord "spinal-cased-variable"
("spinal", "cased-variable")

splitWords :: Text -> [Text]

O(n) Split into a list of words delimited by boundaries.

>>> splitWords "SupercaliFrag_ilistic"
["Supercali","Frag","ilistic"]
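To make the splitting rule concrete, here is a minimal model written against String and base only. It is a sketch, not the library's implementation. It assumes that a boundary is any non-alphanumeric character (the documentation does not spell this out), and that a word also ends just before an uppercase character. All names ending in Sketch are hypothetical. Runs of uppercase letters such as "HTML5" are not treated as a single word here, whereas the stripWord example above suggests the library does handle them.

```haskell
import Data.Char (isAlphaNum, isUpper)

-- Assumption: a boundary is any non-alphanumeric character.
isBoundarySketch :: Char -> Bool
isBoundarySketch = not . isAlphaNum

-- A word boundary is a boundary or an uppercase character.
isWordBoundarySketch :: Char -> Bool
isWordBoundarySketch c = isUpper c || isBoundarySketch c

-- Split into words: separators are dropped, while an uppercase
-- character is kept as the start of the next word.
splitWordsSketch :: String -> [String]
splitWordsSketch = go . dropWhile isBoundarySketch
  where
    go [] = []
    go (c : cs) =
      let (rest, remainder) = break isWordBoundarySketch cs
       in (c : rest) : go (dropWhile isBoundarySketch remainder)
```

Under these assumptions, splitWordsSketch "SupercaliFrag_ilistic" yields ["Supercali","Frag","ilistic"], matching the documented example.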

Character manipulation

lowerHead :: Text -> Text

Lowercase the first character of a piece of text.

>>> lowerHead "Title Cased"
"title Cased"

upperHead :: Text -> Text

Uppercase the first character of a piece of text.

>>> upperHead "snake_cased"
"Snake_cased"

mapHead :: (Char -> Char) -> Text -> Text

Apply a function to the first character of a piece of text.
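mapHead has no documented example, so the following sketch shows one plausible way to express the same idea with uncons and cons from Data.Text.Lazy. The Sketch-suffixed names are hypothetical and the library's own definition may differ, for instance in how it interacts with fusion.

```haskell
import Data.Char (toLower, toUpper)
import qualified Data.Text.Lazy as TL

-- Apply a function to the first character only; empty text is
-- returned unchanged.
mapHeadSketch :: (Char -> Char) -> TL.Text -> TL.Text
mapHeadSketch f t =
  case TL.uncons t of
    Nothing -> t
    Just (c, rest) -> TL.cons (f c) rest

-- The two specialisations documented above, expressed via the sketch.
upperHeadSketch, lowerHeadSketch :: TL.Text -> TL.Text
upperHeadSketch = mapHeadSketch toUpper
lowerHeadSketch = mapHeadSketch toLower
```

With these definitions, upperHeadSketch applied to "snake_cased" gives "Snake_cased" and lowerHeadSketch applied to "Title Cased" gives "title Cased", mirroring the upperHead and lowerHead examples.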

Line manipulation

indentLines :: Int -> Text -> Text

Indent each line by the given number of spaces.

prependLines :: Text -> Text -> Text

Prepend each line with the given separator.

Ellipsis

toEllipsis :: Int64 -> Text -> Text

O(n) Truncate text to a specific length. If the text was truncated, the ellipsis sign "..." will be appended.

See: toEllipsisWith

toEllipsisWith

Arguments

:: Int64	Length.
-> Text	Ellipsis.
-> Text
-> Text

O(n) Truncate text to a specific length. If the text was truncated, the given ellipsis sign will be appended.
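The truncation rule can be modelled in a few lines. This sketch assumes that the length counts characters and that the ellipsis is appended after the kept characters, so the result may be longer than the requested length. Neither point is confirmed by the documentation above, and ellipsisSketch is a hypothetical name.

```haskell
import Data.Int (Int64)
import qualified Data.Text.Lazy as TL

-- Keep at most n characters; append the suffix only when something
-- was actually cut off. Argument order follows toEllipsisWith.
ellipsisSketch :: Int64 -> TL.Text -> TL.Text -> TL.Text
ellipsisSketch n suffix t
  | TL.length t > n = TL.take n t <> suffix
  | otherwise = t
```

For example, ellipsisSketch 5 "..." applied to "Hello, world" gives "Hello...", while a text of five characters or fewer is returned unchanged.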

Acronyms

toAcronym :: Text -> Maybe Text

O(n) Create an ad hoc acronym from a piece of cased text.

>>> toAcronym "AmazonWebServices"
Just "AWS"

>>> toAcronym "Learn-You Some_Haskell"
Just "LYSH"

>>> toAcronym "this_is_all_lowercase"
Nothing
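The three examples are consistent with a simple rule: keep the uppercase characters and return Nothing when there are none. The sketch below implements exactly that rule on String. It is an assumption rather than the library's definition; in particular, the library may require more than one uppercase character before returning Just. The name acronymSketch is hypothetical.

```haskell
import Data.Char (isUpper)

-- Collect the uppercase characters; Nothing when none are present.
acronymSketch :: String -> Maybe String
acronymSketch s =
  case filter isUpper s of
    [] -> Nothing
    initials -> Just initials
```

Under this rule, "AmazonWebServices" gives Just "AWS", "Learn-You Some_Haskell" gives Just "LYSH", and "this_is_all_lowercase" gives Nothing.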

Ordinals

toOrdinal :: Integral a => a -> Text

Render an ordinal used to denote the position in an ordered sequence.

>>> toOrdinal (101 :: Int)
"101st"

>>> toOrdinal (12 :: Int)
"12th"
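The two examples follow the usual English rule: the suffix depends on the last digit, except that numbers ending in 11, 12 or 13 always take "th". A minimal sketch of that rule, producing String rather than Text and using hypothetical names, might look like this. How the library treats negative numbers is not documented; the sketch simply uses the absolute value to choose the suffix.

```haskell
-- Choose the English ordinal suffix for a number.
ordinalSuffixSketch :: Integral a => a -> String
ordinalSuffixSketch n
  | lastTwo >= 11 && lastTwo <= 13 = "th" -- 11th, 12th, 13th, 111th, ...
  | lastOne == 1 = "st"
  | lastOne == 2 = "nd"
  | lastOne == 3 = "rd"
  | otherwise = "th"
  where
    m = abs (toInteger n)
    lastTwo = m `mod` 100
    lastOne = m `mod` 10

-- Render the number followed by its suffix.
toOrdinalSketch :: Integral a => a -> String
toOrdinalSketch n = show (toInteger n) ++ ordinalSuffixSketch n
```

This reproduces the documented results: toOrdinalSketch (101 :: Int) is "101st" and toOrdinalSketch (12 :: Int) is "12th".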

Casing

toTitle :: Text -> Text

O(n) Convert casing to Title Cased Phrase. Subject to fusion.

toCamel :: Text -> Text

O(n) Convert casing to camelCasedPhrase. Subject to fusion.

toPascal :: Text -> Text

O(n) Convert casing to PascalCasedPhrase. Subject to fusion.

toSnake :: Text -> Text

O(n) Convert casing to snake_cased_phrase. Subject to fusion.

toSpinal :: Text -> Text

O(n) Convert casing to spinal-cased-phrase. Subject to fusion.

toTrain :: Text -> Text

O(n) Convert casing to Train-Cased-Phrase. Subject to fusion.
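The casing conversions can be understood as "split into words, normalise each word, rejoin with a separator". The sketch below models that idea on String with base only. It is not the library's implementation: it performs no fusion, it assumes a boundary is any non-alphanumeric character, and it splits runs of uppercase letters into single-letter words. All names are hypothetical.

```haskell
import Data.Char (isAlphaNum, isUpper, toLower, toUpper)
import Data.List (intercalate)

-- Split on separators (dropped) and before uppercase characters (kept).
wordsOf :: String -> [String]
wordsOf = go . dropWhile isSep
  where
    isSep = not . isAlphaNum
    isBreak c = isUpper c || isSep c
    go [] = []
    go (c : cs) =
      let (w, rest) = break isBreak cs
       in (c : w) : go (dropWhile isSep rest)

lower, capitalise :: String -> String
lower = map toLower
capitalise [] = []
capitalise (c : cs) = toUpper c : map toLower cs

snakeSketch, spinalSketch, trainSketch, pascalSketch, camelSketch :: String -> String
snakeSketch = intercalate "_" . map lower . wordsOf
spinalSketch = intercalate "-" . map lower . wordsOf
trainSketch = intercalate "-" . map capitalise . wordsOf
pascalSketch = concatMap capitalise . wordsOf
camelSketch s =
  case wordsOf s of
    [] -> []
    (w : ws) -> lower w ++ concatMap capitalise ws
```

For instance, snakeSketch "PascalCasedVariable" gives "pascal_cased_variable" and camelSketch "spinal-cased-variable" gives "spinalCasedVariable".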

Boundary predicates

isBoundary :: Char -> Bool

Returns True for any boundary character.

isWordBoundary :: Char -> Bool

Returns True for any boundary or uppercase character.