text-manipulate-0.3.0.0: Case conversion, word boundary manipulation, and textual subjugation.
Safe Haskell: None
Language: Haskell2010

Data.Text.Manipulate

Description

Manipulate identifiers and structurally non-complex pieces of text by delimiting word boundaries via a combination of whitespace, control-characters, and case-sensitivity.

Assumptions have been made about word boundary characteristics inherent in predominantly English text; please see the individual function documentation for further details and behaviour.

Strict vs lazy types

This library provides functions for manipulating both strict and lazy Text types. The strict functions are provided by the Data.Text.Manipulate module, while the lazy functions are provided by the Data.Text.Lazy.Manipulate module.
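
When only one of the two variants is at hand, the conversion functions of the text package can bridge the gap. The following is a minimal sketch that uses only text; `viaStrict` and `shout` are hypothetical names that are not part of this library, and `T.toUpper` merely stands in for any strict function.

```haskell
import qualified Data.Text as T
import qualified Data.Text.Lazy as TL

-- Hypothetical helper (not part of text-manipulate): run a strict
-- Text transformation over a lazy Text by converting at both ends.
viaStrict :: (T.Text -> T.Text) -> TL.Text -> TL.Text
viaStrict f = TL.fromStrict . f . TL.toStrict

-- Stand-in usage: T.toUpper plays the role of any strict function.
shout :: TL.Text -> TL.Text
shout = viaStrict T.toUpper
```

Converting with `TL.toStrict` materialises the whole lazy value in memory, so for large inputs the dedicated lazy module is likely the better choice.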

Unicode

While this library supports Unicode in a similar fashion to the underlying text library, more explicit Unicode handling of word boundaries can be found in the text-icu library.

Fusion

Many functions in this module are subject to fusion, meaning that a pipeline of such functions will usually allocate at most one Text value.

Functions that can be fused by the compiler are documented with the phrase Subject to fusion.

Subwords

Removing words

takeWord :: Text -> Text

O(n) Return the first word, or the original text if no word boundary is encountered. Subject to fusion.

dropWord :: Text -> Text

O(n) Return the suffix after dropping the first word. If no word boundary is encountered, the result will be empty. Subject to fusion.

stripWord :: Text -> Maybe Text

O(n) Return the suffix after removing the first word, or Nothing if no word boundary is encountered.

>>> stripWord "HTML5Spaghetti"
Just "Spaghetti"
>>> stripWord "noboundaries"
Nothing

Breaking on words

breakWord :: Text -> (Text, Text)

Break a piece of text after the first word boundary is encountered.

>>> breakWord "PascalCasedVariable"
("Pascal", "CasedVariable")
>>> breakWord "spinal-cased-variable"
("spinal", "cased-variable")

splitWords :: Text -> [Text]

O(n) Split into a list of words delimited by boundaries.

>>> splitWords "SupercaliFrag_ilistic"
["Supercali","Frag","ilistic"]
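
To make the notion of a word boundary concrete, here is a rough sketch of one possible splitting rule. It is not the library's implementation: it works on plain `String` to stay within base, the name `splitWordsSketch` is hypothetical, and it assumes that non-alphanumeric characters separate words (and are dropped) while an uppercase letter starts a new word only when the preceding character is not uppercase.

```haskell
import Data.Char (isAlphaNum, isUpper)

-- Hypothetical sketch, not the library's algorithm.
-- Assumed rules:
--   * a non-alphanumeric character ends the current word and is dropped;
--   * an uppercase letter starts a new word unless the previous
--     character of the current word is also uppercase.
splitWordsSketch :: String -> [String]
splitWordsSketch = filter (not . null) . go Nothing ""
  where
    -- prev: previous character of the current word, if any
    -- acc:  current word, accumulated in reverse
    go :: Maybe Char -> String -> String -> [String]
    go _ acc [] = [reverse acc]
    go prev acc (c:cs)
      | not (isAlphaNum c) =
          reverse acc : go Nothing "" cs
      | isUpper c && maybe False (not . isUpper) prev =
          reverse acc : go (Just c) [c] cs
      | otherwise =
          go (Just c) (c : acc) cs
```

Under these assumed rules, "SupercaliFrag_ilistic" splits into the same three words as the example above, and "HTML5Spaghetti" splits into "HTML5" and "Spaghetti". An input such as "HTMLParser" stays a single word in this simplification, which may differ from what the library does.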

Character manipulation

lowerHead :: Text -> Text

Lowercase the first character of a piece of text.

>>> lowerHead "Title Cased"
"title Cased"

upperHead :: Text -> Text

Uppercase the first character of a piece of text.

>>> upperHead "snake_cased"
"Snake_cased"

mapHead :: (Char -> Char) -> Text -> Text

Apply a function to the first character of a piece of text.
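
A function of this shape can be expressed with `uncons` and `cons` from the text package. The sketch below shows that idea only; `mapHeadSketch` and the two helpers are hypothetical names, and the library's own definition may be written differently.

```haskell
import Data.Char (toLower, toUpper)
import qualified Data.Text as T

-- Hypothetical sketch: transform the first character, leave the rest
-- untouched, and return empty input unchanged.
mapHeadSketch :: (Char -> Char) -> T.Text -> T.Text
mapHeadSketch f t =
  case T.uncons t of
    Just (c, rest) -> T.cons (f c) rest
    Nothing        -> t

-- The two head-casing functions then fall out as special cases.
upperHeadSketch, lowerHeadSketch :: T.Text -> T.Text
upperHeadSketch = mapHeadSketch toUpper
lowerHeadSketch = mapHeadSketch toLower
```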

Line manipulation

indentLines :: Int -> Text -> Text

Indent newlines by the given number of spaces.

See: prependLines

prependLines :: Text -> Text -> Text

Prepend newlines with the given separator.
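
The relationship between the two line functions can be sketched as follows. This is an illustration under assumptions rather than the library's code: it assumes that every line, including the first, receives the prefix, and the names `prependLinesSketch` and `indentLinesSketch` are hypothetical.

```haskell
import qualified Data.Text as T

-- Hypothetical sketch: put the given prefix in front of every line.
-- Note that T.lines discards a trailing newline, so this version
-- does not preserve one.
prependLinesSketch :: T.Text -> T.Text -> T.Text
prependLinesSketch prefix =
  T.intercalate (T.pack "\n") . map (prefix <>) . T.lines

-- Indentation is then prefixing with n spaces.
indentLinesSketch :: Int -> T.Text -> T.Text
indentLinesSketch n = prependLinesSketch (T.replicate n (T.pack " "))
```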

Ellipsis

toEllipsis :: Int -> Text -> Text

O(n) Truncate text to a specific length. If the text was truncated, the ellipsis sign "..." will be appended.

See: toEllipsisWith

toEllipsisWith

Arguments

:: Int     Length.
-> Text    Ellipsis.
-> Text
-> Text

O(n) Truncate text to a specific length. If the text was truncated, the given ellipsis sign will be appended.
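
The truncation described here can be sketched with `T.length` and `T.take`. The version below is an assumption about the behaviour, not the library's source: it assumes the length limit counts only the characters kept from the input, so the ellipsis is added on top of that limit. The names `ellipsisWithSketch` and `ellipsisSketch` are hypothetical.

```haskell
import qualified Data.Text as T

-- Hypothetical sketch: keep at most n characters and append the
-- suffix only when something was actually cut off.
ellipsisWithSketch :: Int -> T.Text -> T.Text -> T.Text
ellipsisWithSketch n suffix t
  | T.length t > n = T.take n t <> suffix
  | otherwise      = t

-- The fixed-suffix variant supplies "..." as the ellipsis.
ellipsisSketch :: Int -> T.Text -> T.Text
ellipsisSketch n = ellipsisWithSketch n (T.pack "...")
```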

Acronyms

toAcronym :: Text -> Maybe Text

O(n) Create an ad hoc acronym from a piece of cased text.

>>> toAcronym "AmazonWebServices"
Just "AWS"
>>> toAcronym "Learn-You Some_Haskell"
Just "LYSH"
>>> toAcronym "this_is_all_lowercase"
Nothing
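
The examples above are consistent with simply collecting the uppercase characters. A minimal sketch of that reading, using plain `String` and base only, is shown below; `acronymSketch` is a hypothetical name, and the exact condition under which the library returns Nothing is an assumption here (this version returns Nothing only when no uppercase character is present).

```haskell
import Data.Char (isUpper)

-- Hypothetical sketch: the acronym is the sequence of uppercase
-- characters, or Nothing when the input contains none.
acronymSketch :: String -> Maybe String
acronymSketch s =
  case filter isUpper s of
    []   -> Nothing
    caps -> Just caps
```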

Ordinals

toOrdinal :: Integral a => a -> Text

Render an ordinal used to denote the position in an ordered sequence.

>>> toOrdinal (101 :: Int)
"101st"
>>> toOrdinal (12 :: Int)
"12th"

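The usual English suffix rule behind such ordinals can be sketched as follows. This is an illustrative version that returns a `String` and uses base only; `ordinalSketch` is a hypothetical name, and the handling of negative numbers (the suffix is chosen from the absolute value) is an assumption.

```haskell
-- Hypothetical sketch of the English ordinal suffix rule:
-- numbers ending in 11, 12 or 13 take "th"; otherwise the last
-- digit selects "st", "nd", "rd" or "th".
ordinalSketch :: Integral a => a -> String
ordinalSketch n = show i ++ suffix
  where
    i = toInteger n
    a = abs i
    suffix
      | a `mod` 100 `elem` [11, 12, 13] = "th"
      | otherwise =
          case a `mod` 10 of
            1 -> "st"
            2 -> "nd"
            3 -> "rd"
            _ -> "th"
```

The check on the last two digits comes first so that 11, 12 and 13 (and 111, 112, 113) do not pick up "st", "nd" or "rd".
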
Casing

toTitle :: Text -> Text

O(n) Convert casing to Title Cased Phrase. Subject to fusion.

toCamel :: Text -> Text

O(n) Convert casing to camelCasedPhrase. Subject to fusion.

toPascal :: Text -> Text

O(n) Convert casing to PascalCasedPhrase. Subject to fusion.

toSnake :: Text -> Text

O(n) Convert casing to snake_cased_phrase. Subject to fusion.

toSpinal :: Text -> Text

O(n) Convert casing to spinal-cased-phrase. Subject to fusion.

toTrain :: Text -> Text

O(n) Convert casing to Train-Cased-Phrase. Subject to fusion.
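
The six conversions differ only in how the individual words are re-cased and joined. The sketch below illustrates that idea on an already-split list of words, using plain `String` and base only. All names are hypothetical, and details such as lowercasing the tail of every word are assumptions that may not match the library.

```haskell
import Data.Char (toLower, toUpper)
import Data.List (intercalate)

-- Uppercase the first character of a word and lowercase the rest.
capitalise :: String -> String
capitalise []       = []
capitalise (c : cs) = toUpper c : map toLower cs

lowerWord :: String -> String
lowerWord = map toLower

-- Hypothetical sketches: each style is a choice of per-word casing
-- plus a joining strategy.
titleFromWords, camelFromWords, pascalFromWords :: [String] -> String
snakeFromWords, spinalFromWords, trainFromWords :: [String] -> String

titleFromWords  = unwords . map capitalise
pascalFromWords = concatMap capitalise
snakeFromWords  = intercalate "_" . map lowerWord
spinalFromWords = intercalate "-" . map lowerWord
trainFromWords  = intercalate "-" . map capitalise

-- Camel case lowercases the first word and capitalises the others.
camelFromWords []       = []
camelFromWords (w : ws) = lowerWord w ++ concatMap capitalise ws
```

Because `capitalise` lowercases the tail of each word, an all-caps word such as "HTML" becomes "Html" in this sketch; the library may treat such words differently.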

Boundary predicates

isBoundary :: Char -> Bool

Returns True for any boundary character.

isWordBoundary :: Char -> Bool

Returns True for any boundary or uppercase character.
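
The two predicates relate to each other as sketched below. The definition of a boundary character used here (anything that is not alphanumeric) is an assumption made for illustration; the documentation above does not spell it out. The names `isBoundarySketch` and `isWordBoundarySketch` are hypothetical.

```haskell
import Data.Char (isAlphaNum, isUpper)

-- Assumption: a boundary character is any non-alphanumeric
-- character (whitespace, punctuation, control characters).
isBoundarySketch :: Char -> Bool
isBoundarySketch = not . isAlphaNum

-- A word boundary is a boundary character or an uppercase letter,
-- as the description above states.
isWordBoundarySketch :: Char -> Bool
isWordBoundarySketch c = isUpper c || isBoundarySketch c
```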