text-utf8-1.2.3.0: An efficient packed UTF-8 backed Unicode text type.

Copyright: (c) 2008 2009 Tom Harper, (c) 2009 2010 Bryan O'Sullivan, (c) 2009 Duncan Coutts
License: BSD-style
Maintainer: bos@serpentine.com
Stability: experimental
Portability: GHC
Safe Haskell: None
Language: Haskell2010

Data.Text.Internal.Encoding.Utf8

Description

Warning: this is an internal module, and does not have a stable API or name. Functions in this module may not check or enforce preconditions expected by public modules. Use at your own risk!

Basic UTF-8 validation and character manipulation.

Documentation

Validation

continuationByte :: Word8 -> Bool

Utility function: checks whether a byte is a UTF-8 continuation byte.
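For orientation, a continuation byte in UTF-8 is any byte whose top two bits are 10, that is, a byte in the range 0x80 to 0xBF. The following is a minimal sketch of such a test using only base; the name isContinuation is hypothetical and the code is not taken from this module.

```haskell
import Data.Bits ((.&.))
import Data.Word (Word8)

-- Hypothetical stand-in for continuationByte (an assumption, not the
-- module's own code): mask the top two bits and compare with 10xxxxxx.
isContinuation :: Word8 -> Bool
isContinuation w = w .&. 0xC0 == 0x80
```

Under this definition, ASCII bytes (below 0x80) and leading bytes of multi-byte sequences (0xC0 and above) are both rejected.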

decodeChar :: (Char -> Int -> a) -> Word8 -> Word8 -> Word8 -> Word8 -> a

Hybrid combination of unsafeChr8, chr2, chr3 and chr4. This function will not touch the bytes it doesn't need.
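To make the shape of this function concrete, here is a rough sketch of a decoder with the same type, written against base only. It assumes that the Int passed to the continuation is the number of bytes consumed, and that the input is well formed; the name decodeSketch and its helpers are hypothetical, not the module's implementation.

```haskell
import Data.Bits (shiftL, (.&.), (.|.))
import Data.Char (chr)
import Data.Word (Word8)

-- Hypothetical re-implementation (an assumption, not the library's code):
-- dispatch on the leading byte, then hand the decoded character and the
-- number of bytes consumed to the continuation k.
-- Assumes well-formed UTF-8; no validation is performed.
decodeSketch :: (Char -> Int -> a) -> Word8 -> Word8 -> Word8 -> Word8 -> a
decodeSketch k b1 b2 b3 b4
  | b1 < 0x80 = k (chr (fromIntegral b1)) 1
  | b1 < 0xE0 = k (chr (shiftL (lead 0x1F) 6 .|. tail6 b2)) 2
  | b1 < 0xF0 = k (chr (shiftL (lead 0x0F) 12 .|. shiftL (tail6 b2) 6 .|. tail6 b3)) 3
  | otherwise = k (chr (shiftL (lead 0x07) 18 .|. shiftL (tail6 b2) 12 .|. shiftL (tail6 b3) 6 .|. tail6 b4)) 4
  where
    -- Payload bits of the leading byte, selected by the given mask.
    lead mask = fromIntegral b1 .&. mask :: Int
    -- Low six payload bits of a continuation byte.
    tail6 b = fromIntegral b .&. 0x3F :: Int
```

Because the arguments are lazy and only the guards on the first byte are always evaluated, a call such as decodeSketch (,) 0xC3 0xA9 undefined undefined never forces the last two bytes.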

decodeCharIndex :: (Char -> Int -> a) -> (Int -> Word8) -> Int -> a

Version of decodeChar which works with an indexing function.
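One plausible way to build such a variant is to read up to four consecutive bytes through the indexing function and pass them to a four-byte decoder. The sketch below illustrates that idea only; withIndex and its decode4 parameter are hypothetical names, and the module may implement this differently.

```haskell
import Data.Word (Word8)

-- Hypothetical adapter (an assumption, not the module's code): given any
-- decoder that takes four bytes, feed it the bytes at offsets i .. i+3
-- of the indexing function idx.
withIndex
  :: (Word8 -> Word8 -> Word8 -> Word8 -> a)
  -> (Int -> Word8)
  -> Int
  -> a
withIndex decode4 idx i =
  decode4 (idx i) (idx (i + 1)) (idx (i + 2)) (idx (i + 3))
```

Since the four lookups are passed lazily, an index past the end of the buffer is only evaluated if the decoder actually demands that byte, which matters when a short character sits at the very end of the input.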

reverseDecodeCharIndex :: (Char -> Int -> a) -> (Int -> Word8) -> Int -> a

Version of decodeCharIndex that takes the index of the rightmost byte and tracks back to the left. Note that this function requires the input to be valid Unicode, that is, well-formed UTF-8.
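The backward scan can be pictured as stepping left over continuation bytes until a leading byte is found. The sketch below shows only that scanning step, under the assumption of well-formed input; charStart is a hypothetical helper, not part of this module.

```haskell
import Data.Bits ((.&.))
import Data.Word (Word8)

-- Hypothetical helper (an assumption, not the module's code): starting at
-- the index of a character's last byte, step left over continuation bytes
-- (10xxxxxx) until the leading byte is reached.
-- Returns (index of the leading byte, encoded length in bytes).
-- Assumes well-formed UTF-8; on malformed input it may index out of range.
charStart :: (Int -> Word8) -> Int -> (Int, Int)
charStart idx end = go end 1
  where
    go i len
      | idx i .&. 0xC0 == 0x80 = go (i - 1) (len + 1)
      | otherwise = (i, len)
```

This also suggests why validity is a precondition: with a stray run of continuation bytes at the start of the buffer, a scan like this has no leading byte to stop at.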

encodeChar :: (Word8 -> a) -> (Word8 -> Word8 -> a) -> (Word8 -> Word8 -> Word8 -> a) -> (Word8 -> Word8 -> Word8 -> Word8 -> a) -> Char -> a

This function provides fast UTF-8 encoding of characters: the caller supplies a custom function for each code path (one-, two-, three- and four-byte encodings), and these should be inlined properly.
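As an illustration of the continuation-per-length design, here is a small sketch with the same type, using base only. The names encodeSketch and toBytes are hypothetical, and the code is a plain rendering of the standard UTF-8 bit layout rather than the module's implementation.

```haskell
import Data.Bits (shiftR, (.&.), (.|.))
import Data.Char (ord)
import Data.Word (Word8)

-- Hypothetical re-implementation (an assumption, not the library's code):
-- one continuation per encoded length, selected by code point range.
encodeSketch
  :: (Word8 -> a)
  -> (Word8 -> Word8 -> a)
  -> (Word8 -> Word8 -> Word8 -> a)
  -> (Word8 -> Word8 -> Word8 -> Word8 -> a)
  -> Char
  -> a
encodeSketch f1 f2 f3 f4 c
  | n < 0x80    = f1 (fromIntegral n)
  | n < 0x800   = f2 (lead 0xC0 6) (cont 0)
  | n < 0x10000 = f3 (lead 0xE0 12) (cont 6) (cont 0)
  | otherwise   = f4 (lead 0xF0 18) (cont 12) (cont 6) (cont 0)
  where
    n = ord c
    -- Leading byte: length tag combined with the highest payload bits.
    lead tag s = fromIntegral (n `shiftR` s) .|. tag
    -- Continuation byte: 10xxxxxx carrying six payload bits.
    cont s = fromIntegral ((n `shiftR` s) .&. 0x3F) .|. 0x80

-- Usage: collect the encoded bytes into a list.
toBytes :: Char -> [Word8]
toBytes =
  encodeSketch
    (\a -> [a])
    (\a b -> [a, b])
    (\a b c -> [a, b, c])
    (\a b c d -> [a, b, c, d])
```

A caller writing into a buffer would instead pass continuations that store each byte and advance an offset, so no intermediate list or tuple needs to be allocated.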

charTailBytes :: Char -> Int

Count the number of UTF-8 tail bytes needed to encode a character, that is, the bytes that follow the leading byte.
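Assuming "tail bytes" means the continuation bytes after the leading byte, the count follows directly from the code point ranges of UTF-8. A minimal sketch, with the hypothetical name tailBytes:

```haskell
import Data.Char (ord)

-- Hypothetical stand-in for charTailBytes (an assumption, not the
-- module's code): number of continuation bytes after the leading byte,
-- chosen by code point range.
tailBytes :: Char -> Int
tailBytes c
  | n < 0x80    = 0  -- U+0000..U+007F: single byte
  | n < 0x800   = 1  -- U+0080..U+07FF: two bytes in total
  | n < 0x10000 = 2  -- U+0800..U+FFFF: three bytes in total
  | otherwise   = 3  -- U+10000 and above: four bytes in total
  where
    n = ord c
```

The total encoded length of a character is then one more than this count.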