Safe Haskell | None |
---|---|
Language | Haskell2010 |
This module defines the ByteStringUTF8 newtype wrapper around ByteString, together with its TextualMonoid instance. The FactorialMonoid instance of a wrapped ByteStringUTF8 value differs from that of the original ByteString: the prime factors of the original value are bytes, while the prime factors of the wrapped value are UTF-8 character byte sequences. The following example session demonstrates the relationship:

    > let utf8@(ByteStringUTF8 bs) = fromString "E=mc\xb2"
    > bs
    "E=mc\194\178"
    > factors bs
    ["E","=","m","c","\194","\178"]
    > utf8
    "E=mc²"
    > factors utf8
    ["E","=","m","c","²"]

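The difference between the two factorizations can be modelled without the library at all. The following is a minimal sketch that uses only `base` and `bytestring`; the names `isContinuation`, `byteFactors`, and `utf8Factors` are hypothetical helpers introduced for illustration, and the grouping rule is a simplification, not the module's actual implementation (it does not validate sequence lengths or code point ranges).

```haskell
import qualified Data.ByteString as B
import Data.Word (Word8)

-- A byte of the form 10xxxxxx continues a multi-byte UTF-8 character.
isContinuation :: Word8 -> Bool
isContinuation w = w >= 0x80 && w < 0xC0

-- Byte-level view: every single byte is its own factor.
byteFactors :: B.ByteString -> [B.ByteString]
byteFactors = map B.singleton . B.unpack

-- Character-level view (simplified assumption): each factor is one byte
-- plus all continuation bytes that immediately follow it.
utf8Factors :: B.ByteString -> [B.ByteString]
utf8Factors bs = case B.uncons bs of
  Nothing -> []
  Just (b, rest) ->
    let (conts, rest') = B.span isContinuation rest
    in B.cons b conts : utf8Factors rest'

main :: IO ()
main = do
  let bs = B.pack [69, 61, 109, 99, 194, 178] -- the UTF-8 bytes of "E=mc²"
  print (map B.unpack (byteFactors bs)) -- prints [[69],[61],[109],[99],[194],[178]]
  print (map B.unpack (utf8Factors bs)) -- prints [[69],[61],[109],[99],[194,178]]
```

The last factor in the second list holds both bytes of the character `²`, which mirrors the single `"²"` factor in the session above.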
The TextualMonoid instance follows the same logic, but it also decodes all valid UTF-8 sequences into characters. Any invalid UTF-8 byte sequence from the original ByteString is preserved as a single prime factor:

    > let utf8'@(ByteStringUTF8 bs') = ByteStringUTF8 (Data.ByteString.map pred bs)
    > bs'
    "D<lb\193\177"
    > factors bs'
    ["D","<","l","b","\193","\177"]
    > utf8'
    "D<lb\[193,177]"
    > factors utf8'
    ["D","<","l","b","\[193,177]"]

- newtype ByteStringUTF8 = ByteStringUTF8 ByteString
- decode :: ByteString -> (ByteStringUTF8, ByteString)
Documentation
newtype ByteStringUTF8
decode :: ByteString -> (ByteStringUTF8, ByteString)
Takes a raw ByteString chunk and returns a pair: a ByteStringUTF8 decoding the prefix of the chunk, and the remaining suffix, which is either empty or contains the incomplete last character of the chunk.
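One way to use `decode` is to process input that arrives in arbitrary chunks, carrying the returned suffix over to the next chunk so that a character split across a chunk boundary is never decoded in halves. The sketch below assumes the `monoid-subclasses` package is installed; `decodeChunks` is a hypothetical helper written for this example, not part of the module.

```haskell
import qualified Data.ByteString as B
import Data.Monoid.Instances.ByteString.UTF8 (ByteStringUTF8 (..), decode)

-- Hypothetical helper: decode each chunk in turn, prepending whatever
-- incomplete trailing bytes the previous call to 'decode' returned.
-- The second component of the result is the suffix left after the last chunk.
decodeChunks :: [B.ByteString] -> ([ByteStringUTF8], B.ByteString)
decodeChunks = go B.empty
  where
    go leftover [] = ([], leftover)
    go leftover (c : cs) =
      let (decoded, rest) = decode (leftover <> c)
          (more, final) = go rest cs
      in (decoded : more, final)

main :: IO ()
main = do
  -- The two bytes of "²" (194, 178) are split across the chunk boundary.
  let chunks = [B.pack [69, 61, 109, 99, 194], B.pack [178]]
      (decoded, leftover) = decodeChunks chunks
  print (map (\(ByteStringUTF8 b) -> B.unpack b) decoded)
  print (B.unpack leftover)
```

Under the contract described above, the lone byte 194 at the end of the first chunk should come back as the suffix and be completed by the second chunk, leaving an empty suffix at the end.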