Copyright | (c) 2009, 2010 Bryan O'Sullivan |
---|---|
License | BSD-style |
Maintainer | bos@serpentine.com |
Stability | experimental |
Portability | portable |
Safe Haskell | Trustworthy |
Language | Haskell98 |
Functions for converting lazy Text values to and from lazy ByteString, using several standard encodings.
To gain access to a much larger variety of encodings, use the text-icu package: http://hackage.haskell.org/package/text-icu
- decodeASCII :: ByteString -> Text
- decodeLatin1 :: ByteString -> Text
- decodeUtf8 :: ByteString -> Text
- decodeUtf16LE :: ByteString -> Text
- decodeUtf16BE :: ByteString -> Text
- decodeUtf32LE :: ByteString -> Text
- decodeUtf32BE :: ByteString -> Text
- decodeUtf8' :: ByteString -> Either UnicodeException Text
- decodeUtf8With :: OnDecodeError -> ByteString -> Text
- decodeUtf16LEWith :: OnDecodeError -> ByteString -> Text
- decodeUtf16BEWith :: OnDecodeError -> ByteString -> Text
- decodeUtf32LEWith :: OnDecodeError -> ByteString -> Text
- decodeUtf32BEWith :: OnDecodeError -> ByteString -> Text
- encodeUtf8 :: Text -> ByteString
- encodeUtf16LE :: Text -> ByteString
- encodeUtf16BE :: Text -> ByteString
- encodeUtf32LE :: Text -> ByteString
- encodeUtf32BE :: Text -> ByteString
- encodeUtf8Builder :: Text -> Builder
- encodeUtf8BuilderEscaped :: BoundedPrim Word8 -> Text -> Builder
Decoding ByteStrings to Text
All of the single-parameter functions for decoding bytestrings encoded in one of the Unicode Transformation Formats (UTF) operate in a strict mode: each will throw an exception if given invalid input.
Each function has a variant, whose name is suffixed with -With, that gives greater control over the handling of decoding errors. For instance, decodeUtf8 will throw an exception, but decodeUtf8With allows the programmer to determine what to do on a decoding error.
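As a rough illustration of the difference between the strict functions and their -With variants, the sketch below decodes a deliberately malformed input with lenientDecode from Data.Text.Encoding.Error, which substitutes U+FFFD for each invalid byte. The names invalidInput and lenient are made up for this example.

```haskell
import qualified Data.ByteString.Lazy as BL
import qualified Data.Text.Lazy as TL
import Data.Text.Encoding.Error (lenientDecode)
import Data.Text.Lazy.Encoding (decodeUtf8With)

-- Hypothetical input: 'a', the invalid byte 0xFF, then 'b'.
invalidInput :: BL.ByteString
invalidInput = BL.pack [0x61, 0xFF, 0x62]

-- The strict decodeUtf8 would throw once this input is forced.
-- With lenientDecode, the invalid byte becomes U+FFFD instead.
lenient :: TL.Text
lenient = decodeUtf8With lenientDecode invalidInput

main :: IO ()
main = print (lenient == TL.pack "a\xFFFD" `TL.append` TL.pack "b")
```

Passing strictDecode instead of lenientDecode should give the same behaviour as the single-parameter function.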
decodeASCII :: ByteString -> Text
Deprecated: Use decodeUtf8 or decodeLatin1 instead.
Decode a ByteString containing 7-bit ASCII encoded text.
decodeLatin1 :: ByteString -> Text
Decode a ByteString containing Latin-1 (aka ISO-8859-1) encoded text.
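A small sketch of Latin-1 decoding, where every byte maps directly to the code point with the same value. The name cafe and the sample bytes are illustrative only.

```haskell
import qualified Data.ByteString.Lazy as BL
import qualified Data.Text.Lazy as TL
import Data.Text.Lazy.Encoding (decodeLatin1)

-- Latin-1 bytes for "café": 0xE9 is U+00E9 (e with acute accent).
cafe :: TL.Text
cafe = decodeLatin1 (BL.pack [0x63, 0x61, 0x66, 0xE9])

main :: IO ()
main = do
  print (TL.length cafe)            -- 4
  print (cafe == TL.pack "caf\233") -- True
```

Because every byte value is a valid Latin-1 character, this decoder has no failure case and needs no -With variant.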
decodeUtf8 :: ByteString -> Text
Decode a ByteString containing UTF-8 encoded text that is known to be valid.
If the input contains any invalid UTF-8 data, an exception will be thrown that cannot be caught in pure code. For more control over the handling of invalid data, use decodeUtf8' or decodeUtf8With.
decodeUtf16LE :: ByteString -> Text
Decode text from little endian UTF-16 encoding.
If the input contains any invalid little endian UTF-16 data, an exception will be thrown. For more control over the handling of invalid data, use decodeUtf16LEWith.
decodeUtf16BE :: ByteString -> Text
Decode text from big endian UTF-16 encoding.
If the input contains any invalid big endian UTF-16 data, an exception will be thrown. For more control over the handling of invalid data, use decodeUtf16BEWith.
decodeUtf32LE :: ByteString -> Text
Decode text from little endian UTF-32 encoding.
If the input contains any invalid little endian UTF-32 data, an exception will be thrown. For more control over the handling of invalid data, use decodeUtf32LEWith.
decodeUtf32BE :: ByteString -> Text
Decode text from big endian UTF-32 encoding.
If the input contains any invalid big endian UTF-32 data, an exception will be thrown. For more control over the handling of invalid data, use decodeUtf32BEWith.
Catchable failure
decodeUtf8' :: ByteString -> Either UnicodeException Text
Decode a ByteString containing UTF-8 encoded text.
If the input contains any invalid UTF-8 data, the relevant exception is returned as a Left value; otherwise the decoded text is returned as a Right value.
Note: this function is not lazy, as it must decode its entire input before it can return a result. If you need lazy (streaming) decoding, use decodeUtf8With in lenient mode.
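A minimal sketch of handling the Either result in pure code. The helper describe is hypothetical and only summarises the outcome as a String.

```haskell
import qualified Data.ByteString.Lazy as BL
import qualified Data.Text.Lazy as TL
import Data.Text.Lazy.Encoding (decodeUtf8')

-- Hypothetical helper: report either the decoding error or the
-- number of characters that were decoded.
describe :: BL.ByteString -> String
describe bytes =
  case decodeUtf8' bytes of
    Left err   -> "invalid UTF-8: " ++ show err
    Right text -> "decoded " ++ show (TL.length text) ++ " characters"

main :: IO ()
main = do
  putStrLn (describe (BL.pack [0x68, 0x69]))  -- valid: "hi"
  putStrLn (describe (BL.pack [0x68, 0xFF]))  -- invalid byte 0xFF
```

The exact text produced by show on a UnicodeException may vary between versions of the library, so callers probably should not parse it.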
Controllable error handling
decodeUtf8With :: OnDecodeError -> ByteString -> Text
Decode a ByteString containing UTF-8 encoded text.
decodeUtf16LEWith :: OnDecodeError -> ByteString -> Text
Decode text from little endian UTF-16 encoding.
decodeUtf16BEWith :: OnDecodeError -> ByteString -> Text
Decode text from big endian UTF-16 encoding.
decodeUtf32LEWith :: OnDecodeError -> ByteString -> Text
Decode text from little endian UTF-32 encoding.
decodeUtf32BEWith :: OnDecodeError -> ByteString -> Text
Decode text from big endian UTF-32 encoding.
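The sketch below shows one way an OnDecodeError handler might be supplied. It assumes the definition from Data.Text.Encoding.Error, where a handler receives a description and the offending byte and returns Just a replacement character or Nothing to drop the input. The handler questionMark and the other names are hypothetical; ignore is the predefined handler from that module.

```haskell
import qualified Data.ByteString.Lazy as BL
import qualified Data.Text.Lazy as TL
import Data.Text.Encoding.Error (OnDecodeError, ignore)
import Data.Text.Lazy.Encoding (decodeUtf8With)

-- Hypothetical handler: substitute '?' for each undecodable byte.
questionMark :: OnDecodeError
questionMark _description _badByte = Just '?'

-- 'a', the invalid byte 0xFF, then 'b'.
input :: BL.ByteString
input = BL.pack [0x61, 0xFF, 0x62]

-- Invalid bytes are replaced by the handler's character.
substituted :: TL.Text
substituted = decodeUtf8With questionMark input

-- Invalid bytes are silently dropped.
dropped :: TL.Text
dropped = decodeUtf8With ignore input

main :: IO ()
main = do
  print substituted  -- "a?b"
  print dropped      -- "ab"
```

Silently dropping invalid input can hide data corruption, so a visible replacement character is often the safer choice.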
Encoding Text to ByteStrings
encodeUtf8 :: Text -> ByteString
Encode text using UTF-8 encoding.
encodeUtf16LE :: Text -> ByteString
Encode text using little endian UTF-16 encoding.
encodeUtf16BE :: Text -> ByteString
Encode text using big endian UTF-16 encoding.
encodeUtf32LE :: Text -> ByteString
Encode text using little endian UTF-32 encoding.
encodeUtf32BE :: Text -> ByteString
Encode text using big endian UTF-32 encoding.
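To make the byte-order differences concrete, the following sketch encodes a two-character sample with each UTF-16 and UTF-32 encoder and prints the resulting bytes. The names sample, bytesOf and roundTrips are illustrative only.

```haskell
import qualified Data.ByteString.Lazy as BL
import qualified Data.Text.Lazy as TL
import Data.Text.Lazy.Encoding
  (decodeUtf16LE, encodeUtf16BE, encodeUtf16LE, encodeUtf32BE, encodeUtf32LE)
import Data.Word (Word8)

sample :: TL.Text
sample = TL.pack "hi"

-- Apply an encoder to the sample and expose the raw bytes.
bytesOf :: (TL.Text -> BL.ByteString) -> [Word8]
bytesOf encode = BL.unpack (encode sample)

-- Decoding with the matching decoder recovers the original text.
roundTrips :: Bool
roundTrips = decodeUtf16LE (encodeUtf16LE sample) == sample

main :: IO ()
main = do
  print (bytesOf encodeUtf16LE)  -- [104,0,105,0]
  print (bytesOf encodeUtf16BE)  -- [0,104,0,105]
  print (bytesOf encodeUtf32LE)  -- [104,0,0,0,105,0,0,0]
  print (bytesOf encodeUtf32BE)  -- [0,0,0,104,0,0,0,105]
  print roundTrips               -- True
```

Note that none of these encoders emits a byte order mark, so the receiver has to know which variant was used.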
Encoding Text using ByteString Builders
encodeUtf8Builder :: Text -> Builder
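A possible use of encodeUtf8Builder, sketched under the assumption that Builder here is the type from Data.ByteString.Builder: several Text values are combined into one lazy ByteString without materialising an intermediate ByteString for each piece. The helper line and the value rendered are hypothetical.

```haskell
import qualified Data.ByteString.Builder as B
import qualified Data.ByteString.Lazy as BL
import qualified Data.Text.Lazy as TL
import Data.Text.Lazy.Encoding (encodeUtf8Builder)

-- Hypothetical helper: UTF-8 encode one line and append a newline byte.
line :: TL.Text -> B.Builder
line t = mconcat [encodeUtf8Builder t, B.word8 0x0A]

-- Run the combined builder once to produce the final bytes.
rendered :: BL.ByteString
rendered = B.toLazyByteString (mconcat (map line [TL.pack "one", TL.pack "two"]))

main :: IO ()
main = print (BL.unpack rendered)  -- [111,110,101,10,116,119,111,10]
```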