Copyright | Gautier DI FOLCO |
---|---|
License | BSD2 |
Maintainer | Gautier DI FOLCO <gautier.difolco@gmail.com> |
Stability | Unstable |
Portability | GHC |
Safe Haskell | None |
Language | Haskell2010 |
Synopsis
- decodeLatin1 :: HasCallStack => Sized s ByteString -> SizedStrictText s
- decodeUtf8 :: Sized s ByteString -> SizedStrictText s
- decodeUtf16LE :: Sized s ByteString -> SizedStrictText s
- decodeUtf16BE :: Sized s ByteString -> SizedStrictText s
- decodeUtf32LE :: Sized s ByteString -> SizedStrictText s
- decodeUtf32BE :: Sized s ByteString -> SizedStrictText s
- decodeUtf8' :: HasCallStack => Sized s ByteString -> Either UnicodeException (SizedStrictText s)
- decodeUtf8With :: HasCallStack => OnDecodeError -> Sized s ByteString -> SizedStrictText s
- decodeUtf16LEWith :: OnDecodeError -> Sized s ByteString -> SizedStrictText s
- decodeUtf16BEWith :: OnDecodeError -> Sized s ByteString -> SizedStrictText s
- decodeUtf32LEWith :: OnDecodeError -> Sized s ByteString -> SizedStrictText s
- decodeUtf32BEWith :: OnDecodeError -> Sized s ByteString -> SizedStrictText s
- streamDecodeUtf8 :: HasCallStack => Sized s ByteString -> Sized s Decoding
- streamDecodeUtf8With :: HasCallStack => OnDecodeError -> Sized s ByteString -> Sized s Decoding
- data Decoding = Some Text ByteString (ByteString -> Decoding)
- encodeUtf8 :: SizedStrictText s -> Sized s ByteString
- encodeUtf16LE :: SizedStrictText s -> Sized s ByteString
- encodeUtf16BE :: SizedStrictText s -> Sized s ByteString
- encodeUtf32LE :: SizedStrictText s -> Sized s ByteString
- encodeUtf32BE :: SizedStrictText s -> Sized s ByteString
- encodeUtf8Builder :: SizedStrictText s -> Sized s Builder
- encodeUtf8BuilderEscaped :: BoundedPrim Word8 -> SizedStrictText s -> Sized s Builder
Decoding ByteStrings to Text
All of the single-parameter functions for decoding bytestrings encoded in one of the Unicode Transformation Formats (UTF) operate in a strict mode: each will throw an exception if given invalid input.
Each function has a variant, whose name is suffixed with -With, that gives greater control over the handling of decoding errors. For instance, decodeUtf8 will throw an exception, but decodeUtf8With allows the programmer to determine what to do on a decoding error.
decodeLatin1 :: HasCallStack => Sized s ByteString -> SizedStrictText s
Decode a ByteString containing Latin-1 (aka ISO-8859-1) encoded text.
decodeLatin1 is semantically equivalent to
Data.Text.pack . Data.ByteString.Char8.unpack
This is a total function. However, bear in mind that decoding non-ASCII Latin-1 characters to UTF-8 requires actual work and is not just buffer copying.
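The equivalence above can be checked with a short sketch. It uses the plain functions from the text and bytestring packages rather than the Sized wrappers, on the assumption that the sized variant behaves the same way on the underlying value; the names latin1Bytes, viaDecoder, viaString and utf8Length are illustrative only.

```haskell
import qualified Data.ByteString as B
import qualified Data.ByteString.Char8 as BC
import qualified Data.Text as T
import qualified Data.Text.Encoding as TE

-- "café" in Latin-1: 0xE9 is 'é' and occupies a single byte.
latin1Bytes :: B.ByteString
latin1Bytes = B.pack [0x63, 0x61, 0x66, 0xE9]

-- Decoding with the dedicated function.
viaDecoder :: T.Text
viaDecoder = TE.decodeLatin1 latin1Bytes

-- Decoding by going through String, as in the stated equivalence.
viaString :: T.Text
viaString = T.pack (BC.unpack latin1Bytes)

-- Re-encoding as UTF-8 shows why this is not a buffer copy:
-- 'é' becomes the two bytes 0xC3 0xA9.
utf8Length :: Int
utf8Length = B.length (TE.encodeUtf8 viaDecoder)
```

Here viaDecoder and viaString are equal, and utf8Length is one byte larger than the four-byte Latin-1 input.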
decodeUtf8 :: Sized s ByteString -> SizedStrictText s
Decode a ByteString containing UTF-8 encoded text that is known to be valid.
If the input contains any invalid UTF-8 data, an exception will be thrown that cannot be caught in pure code. For more control over the handling of invalid data, use decodeUtf8' or decodeUtf8With.
This is a partial function: it checks that the input is a well-formed UTF-8 sequence and copies the buffer, or throws an error otherwise.
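Because the exception cannot be caught in pure code, it has to be observed in IO. The following is a minimal sketch of that, written against the unwrapped Data.Text.Encoding.decodeUtf8 (the Sized variant is assumed to throw in the same situations); safeDecode and invalidBytes are hypothetical names.

```haskell
import Control.Exception (evaluate, try)
import qualified Data.ByteString as B
import qualified Data.Text as T
import qualified Data.Text.Encoding as TE
import Data.Text.Encoding.Error (UnicodeException)

-- The byte 0xFF never appears in well-formed UTF-8.
invalidBytes :: B.ByteString
invalidBytes = B.pack [0x68, 0x69, 0xFF]

-- Force the decoded value inside IO so that the exception is raised
-- where 'try' can intercept it.
safeDecode :: B.ByteString -> IO (Either UnicodeException T.Text)
safeDecode = try . evaluate . TE.decodeUtf8
```

Running safeDecode invalidBytes yields a Left, while valid input yields a Right with the decoded text. When a pure result is wanted, decodeUtf8' avoids the detour through IO.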
decodeUtf16LE :: Sized s ByteString -> SizedStrictText s
Decode text from little endian UTF-16 encoding.
If the input contains any invalid little endian UTF-16 data, an exception will be thrown. For more control over the handling of invalid data, use decodeUtf16LEWith.
decodeUtf16BE :: Sized s ByteString -> SizedStrictText s
Decode text from big endian UTF-16 encoding.
If the input contains any invalid big endian UTF-16 data, an exception will be thrown. For more control over the handling of invalid data, use decodeUtf16BEWith.
decodeUtf32LE :: Sized s ByteString -> SizedStrictText s
Decode text from little endian UTF-32 encoding.
If the input contains any invalid little endian UTF-32 data, an exception will be thrown. For more control over the handling of invalid data, use decodeUtf32LEWith.
decodeUtf32BE :: Sized s ByteString -> SizedStrictText s
Decode text from big endian UTF-32 encoding.
If the input contains any invalid big endian UTF-32 data, an exception will be thrown. For more control over the handling of invalid data, use decodeUtf32BEWith.
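To make the byte-order variants concrete, here is a small sketch that decodes the same two characters from four byte layouts. It uses the unwrapped functions from Data.Text.Encoding as a stand-in for the Sized versions (an assumption), and all top-level names are illustrative.

```haskell
import qualified Data.ByteString as B
import qualified Data.Text as T
import qualified Data.Text.Encoding as TE

-- The text "hi" laid out in each encoding and byte order.
utf16le, utf16be, utf32le, utf32be :: B.ByteString
utf16le = B.pack [0x68, 0x00, 0x69, 0x00]
utf16be = B.pack [0x00, 0x68, 0x00, 0x69]
utf32le = B.pack [0x68, 0, 0, 0, 0x69, 0, 0, 0]
utf32be = B.pack [0, 0, 0, 0x68, 0, 0, 0, 0x69]

-- Each decoder paired with input in its own byte order.
decoded :: [T.Text]
decoded =
  [ TE.decodeUtf16LE utf16le
  , TE.decodeUtf16BE utf16be
  , TE.decodeUtf32LE utf32le
  , TE.decodeUtf32BE utf32be
  ]

-- Little endian bytes read as big endian: no exception here,
-- just different code points (U+6800 and U+6900).
swapped :: T.Text
swapped = TE.decodeUtf16BE utf16le
```

Every element of decoded is the text "hi". The swapped value shows that choosing the wrong byte order does not necessarily raise an exception; it can silently produce different text.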
Catchable failure
decodeUtf8' :: HasCallStack => Sized s ByteString -> Either UnicodeException (SizedStrictText s)
Decode a ByteString containing UTF-8 encoded text.
If the input contains any invalid UTF-8 data, the relevant exception is returned as a Left; otherwise the decoded text is returned as a Right.
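A typical use is to pattern match on the result and turn the failure into an ordinary value. The sketch below does this with the unwrapped Data.Text.Encoding.decodeUtf8' (assumed to mirror the Sized version); decodeOrMessage is a hypothetical helper.

```haskell
import qualified Data.ByteString as B
import qualified Data.Text as T
import qualified Data.Text.Encoding as TE

-- Convert a decoding failure into a plain error message
-- instead of letting an exception escape.
decodeOrMessage :: B.ByteString -> Either String T.Text
decodeOrMessage bytes =
  case TE.decodeUtf8' bytes of
    Left err   -> Left ("invalid UTF-8: " ++ show err)
    Right text -> Right text
```

For the valid bytes 0xC3 0xA9 this returns Right "é"; for input containing 0xFF it returns a Left carrying the rendered exception.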
Controllable error handling
decodeUtf8With :: HasCallStack => OnDecodeError -> Sized s ByteString -> SizedStrictText s
Decode a ByteString containing UTF-8 encoded text.
If the replacement character returned by the OnDecodeError handler is a surrogate code point, it is automatically remapped to the replacement character U+FFFD.
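The handler argument decides what happens at each invalid byte. The sketch below compares the stock handlers from Data.Text.Encoding.Error with a hand-written one, using the unwrapped decodeUtf8With as a stand-in for the Sized version (an assumption); input, questionMark and results are illustrative names.

```haskell
import qualified Data.ByteString as B
import qualified Data.Text as T
import qualified Data.Text.Encoding as TE
import Data.Text.Encoding.Error (OnDecodeError, ignore, lenientDecode, replace)

-- The letter 'a', one invalid byte, then the letter 'b'.
input :: B.ByteString
input = B.pack [0x61, 0xFF, 0x62]

-- A hand-written handler: it receives a description and the offending
-- byte, and returns the character to substitute (Nothing would drop it).
questionMark :: OnDecodeError
questionMark _description _badByte = Just '?'

results :: [T.Text]
results =
  [ TE.decodeUtf8With lenientDecode input  -- substitutes U+FFFD
  , TE.decodeUtf8With ignore input         -- drops the invalid byte
  , TE.decodeUtf8With (replace '!') input  -- substitutes a fixed character
  , TE.decodeUtf8With questionMark input   -- substitutes '?'
  ]
```

The four results are "a\xFFFD" followed by "b", then "ab", "a!b" and "a?b" respectively.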
decodeUtf16LEWith :: OnDecodeError -> Sized s ByteString -> SizedStrictText s
Decode text from little endian UTF-16 encoding.
decodeUtf16BEWith :: OnDecodeError -> Sized s ByteString -> SizedStrictText s
Decode text from big endian UTF-16 encoding.
decodeUtf32LEWith :: OnDecodeError -> Sized s ByteString -> SizedStrictText s
Decode text from little endian UTF-32 encoding.
decodeUtf32BEWith :: OnDecodeError -> Sized s ByteString -> SizedStrictText s
Decode text from big endian UTF-32 encoding.
Stream oriented decoding
streamDecodeUtf8 :: HasCallStack => Sized s ByteString -> Sized s Decoding
Decode, in a stream oriented way, a ByteString containing UTF-8 encoded text that is known to be valid.
If the input contains any invalid UTF-8 data, an exception will be thrown (either by this function or a continuation) that cannot be caught in pure code. For more control over the handling of invalid data, use streamDecodeUtf8With.
streamDecodeUtf8With :: HasCallStack => OnDecodeError -> Sized s ByteString -> Sized s Decoding
Decode, in a stream oriented way, a ByteString containing UTF-8 encoded text.
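data Decoding
A stream oriented decoding result. The constructor carries the text decoded so far, any undecoded trailing bytes, and a continuation that accepts the next chunk of input.
Since: text-1.0.0.0
Some Text ByteString (ByteString -> Decoding)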
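The continuation in Some is what allows a multi-byte character to be split across chunks. The following sketch drives the unwrapped Data.Text.Encoding.streamDecodeUtf8 over a list of chunks; how the Sized wrapper exposes the Decoding value is not shown on this page, so the plain function is used as an assumption, and decodeChunks is a hypothetical helper.

```haskell
import qualified Data.ByteString as B
import qualified Data.Text as T
import qualified Data.Text.Encoding as TE

-- Feed chunks one at a time. The result is all decoded text plus
-- whatever bytes were still undecoded after the last chunk.
decodeChunks :: [B.ByteString] -> (T.Text, B.ByteString)
decodeChunks = go TE.streamDecodeUtf8 [] B.empty
  where
    go _    acc leftover []       = (T.concat (reverse acc), leftover)
    go feed acc _        (c : cs) =
      case feed c of
        -- 'next' is the continuation to call with the following chunk.
        TE.Some text leftover next -> go next (text : acc) leftover cs
```

With the chunks [0x61, 0xC3] and [0xA9, 0x62], the two bytes of 'é' arrive separately, yet the combined text is "aéb" with nothing left over. If only the first chunk is supplied, the text is "a" and the dangling 0xC3 is reported as undecoded input.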
Encoding Text to ByteStrings
encodeUtf8 :: SizedStrictText s -> Sized s ByteString
Encode text using UTF-8 encoding.
encodeUtf16LE :: SizedStrictText s -> Sized s ByteString
Encode text using little endian UTF-16 encoding.
encodeUtf16BE :: SizedStrictText s -> Sized s ByteString
Encode text using big endian UTF-16 encoding.
encodeUtf32LE :: SizedStrictText s -> Sized s ByteString
Encode text using little endian UTF-32 encoding.
encodeUtf32BE :: SizedStrictText s -> Sized s ByteString
Encode text using big endian UTF-32 encoding.
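As a rough illustration of how the encodings differ, the sketch below encodes one short text with several of the functions above and compares the output. It relies on the unwrapped Data.Text.Encoding functions (assumed to produce the same bytes as the Sized versions); sample, encodedLengths, utf16Bytes and roundTrips are illustrative names.

```haskell
import qualified Data.ByteString as B
import qualified Data.Text as T
import qualified Data.Text.Encoding as TE

-- "hé": one ASCII character and one non-ASCII character.
sample :: T.Text
sample = T.pack "h\233"

-- Byte counts per encoding for the same two characters.
encodedLengths :: [Int]
encodedLengths =
  [ B.length (TE.encodeUtf8 sample)     -- 1 + 2 bytes
  , B.length (TE.encodeUtf16LE sample)  -- 2 + 2 bytes
  , B.length (TE.encodeUtf32BE sample)  -- 4 + 4 bytes
  ]

-- The two UTF-16 byte orders differ only in the order within each unit.
utf16Bytes :: ([Int], [Int])
utf16Bytes =
  ( map fromIntegral (B.unpack (TE.encodeUtf16LE sample))
  , map fromIntegral (B.unpack (TE.encodeUtf16BE sample))
  )

-- Encoding followed by the matching decoder returns the original text.
roundTrips :: Bool
roundTrips =
  TE.decodeUtf8 (TE.encodeUtf8 sample) == sample
    && TE.decodeUtf16BE (TE.encodeUtf16BE sample) == sample
    && TE.decodeUtf32LE (TE.encodeUtf32LE sample) == sample
```

Here encodedLengths is [3, 4, 8], and roundTrips is True.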
Encoding Text using ByteString Builders
encodeUtf8Builder :: SizedStrictText s -> Sized s Builder
Encode text to a ByteString Builder using UTF-8 encoding.
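Builders are useful when several pieces of text are concatenated into one output. A minimal sketch, using the unwrapped Data.Text.Encoding.encodeUtf8Builder in place of the Sized version (an assumption), with renderPair and rendered as hypothetical names:

```haskell
import qualified Data.ByteString.Builder as B
import qualified Data.ByteString.Lazy as BL
import qualified Data.Text as T
import qualified Data.Text.Encoding as TE

-- Render "key=value" without materialising an intermediate
-- strict ByteString for each part.
renderPair :: T.Text -> T.Text -> B.Builder
renderPair key value =
  TE.encodeUtf8Builder key <> B.char7 '=' <> TE.encodeUtf8Builder value

-- Run the builder once, at the end.
rendered :: BL.ByteString
rendered = B.toLazyByteString (renderPair (T.pack "name") (T.pack "caf\233"))
```

The rendered bytes are the UTF-8 encoding of "name=café", with 'é' written as 0xC3 0xA9.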
encodeUtf8BuilderEscaped :: BoundedPrim Word8 -> SizedStrictText s -> Sized s Builder
Encode text using UTF-8 encoding and escape the ASCII characters using a BoundedPrim.
Use this function to implement efficient encoders for text-based formats like JSON or HTML.
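The sketch below shows what such an escaping primitive might look like, built from the combinators in Data.ByteString.Builder.Prim and passed to the unwrapped Data.Text.Encoding.encodeUtf8BuilderEscaped (assumed to behave like the Sized version). The names escapeAscii, twoChars and quoted are hypothetical, and the escaping covers only the double quote and the backslash, so it is not a complete JSON encoder.

```haskell
import qualified Data.ByteString.Builder as B
import qualified Data.ByteString.Builder.Prim as P
import Data.ByteString.Builder.Prim ((>$<), (>*<))
import qualified Data.ByteString.Lazy as BL
import qualified Data.Text as T
import qualified Data.Text.Encoding as TE
import Data.Word (Word8)

-- Prefix '"' (0x22) and '\' (0x5C) with a backslash; write every
-- other ASCII byte unchanged. Control characters are not handled.
escapeAscii :: P.BoundedPrim Word8
escapeAscii =
  P.condB (== 0x22) (twoChars ('\\', '"')) $
  P.condB (== 0x5C) (twoChars ('\\', '\\')) $
  P.liftFixedToBounded P.word8
  where
    -- Ignore the input byte and emit a fixed pair of ASCII characters.
    twoChars :: (Char, Char) -> P.BoundedPrim Word8
    twoChars pair = P.liftFixedToBounded (const pair >$< P.char7 >*< P.char7)

-- Wrap the escaped text in double quotes.
quoted :: T.Text -> B.Builder
quoted t = B.char7 '"' <> TE.encodeUtf8BuilderEscaped escapeAscii t <> B.char7 '"'
```

Only ASCII bytes pass through escapeAscii; non-ASCII characters are written as ordinary UTF-8. The result of quoted can be turned into bytes with Data.ByteString.Builder.toLazyByteString.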