Safe Haskell | None |
---|---|
Language | Haskell2010 |
This module defines the ByteStringUTF8 newtype wrapper around ByteString, together with its TextualMonoid instance. The FactorialMonoid instance of a wrapped ByteStringUTF8 value differs from that of the original ByteString: the prime factors of the original value are its bytes, whereas the prime factors of the wrapped value are its valid UTF8 byte sequences. The following example session demonstrates the relationship:
    > let utf8@(ByteStringUTF8 bs) = fromString "E=mc\xb2"
    > bs
    "E=mc\194\178"
    > factors bs
    ["E","=","m","c","\194","\178"]
    > utf8
    "E=mc²"
    > factors utf8
    ["E","=","m","c","²"]
The TextualMonoid instance follows the same logic, but it also decodes all valid UTF8 sequences into characters. Any invalid UTF8 byte sequence from the original ByteString is preserved as a single prime factor:
    > let utf8'@(ByteStringUTF8 bs') = ByteStringUTF8 (Data.ByteString.map pred bs)
    > bs'
    "D<lb\193\177"
    > factors bs'
    ["D","<","l","b","\193","\177"]
    > utf8'
    "D<lb\[193,177]"
    > factors utf8'
    ["D","<","l","b","\[193,177]"]
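The split between decodable characters and preserved invalid factors can be used to write a lossy conversion to String. The following is a minimal sketch, not part of this module: the helper name toStringLossy is hypothetical, and it assumes that splitCharacterPrefix (from Data.Monoid.Textual) returns Nothing when the value starts with an invalid sequence, while splitPrimePrefix (from Data.Monoid.Factorial) still peels that sequence off as one prime factor.

```haskell
import qualified Data.ByteString as B
import Data.Monoid.Factorial (splitPrimePrefix)
import Data.Monoid.Instances.ByteString.UTF8 (ByteStringUTF8 (..))
import qualified Data.Monoid.Textual as Textual

-- Hypothetical helper: convert to String, substituting U+FFFD for every
-- prime factor that is not a valid UTF8 character.
toStringLossy :: ByteStringUTF8 -> String
toStringLossy t =
  case Textual.splitCharacterPrefix t of
    -- The value starts with a valid character: keep it.
    Just (c, rest) -> c : toStringLossy rest
    Nothing ->
      case splitPrimePrefix t of
        -- The value starts with an invalid factor: replace it.
        Just (_, rest) -> '\xFFFD' : toStringLossy rest
        -- The value is empty.
        Nothing -> []

main :: IO ()
main = do
  -- The bytes of "E=mc" followed by the two UTF8 bytes of '²'.
  let valid = ByteStringUTF8 (B.pack [0x45, 0x3D, 0x6D, 0x63, 0xC2, 0xB2])
      -- The same bytes decremented, as in the session above.
      invalid = ByteStringUTF8 (B.map pred (B.pack [0x45, 0x3D, 0x6D, 0x63, 0xC2, 0xB2]))
  putStrLn (toStringLossy valid)
  putStrLn (toStringLossy invalid)
```

Each recursive call consumes at least one prime factor, so the function terminates on any finite input.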
- newtype ByteStringUTF8 = ByteStringUTF8 ByteString
- decode :: ByteString -> (ByteStringUTF8, ByteString)
Documentation
newtype ByteStringUTF8
decode :: ByteString -> (ByteStringUTF8, ByteString)
Takes a raw ByteString chunk and returns a pair: a ByteStringUTF8 that decodes the prefix of the chunk, and the remaining suffix, which is either null or contains the incomplete last character of the chunk.
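Because decode returns the undecoded suffix, it lends itself to chunked input in which a multi-byte character may straddle a chunk boundary. The following is a rough sketch under that reading of the documentation: the helper name decodeChunks is hypothetical, and the approach of prepending the leftover bytes to the next chunk is one possible design rather than something this module prescribes.

```haskell
import qualified Data.ByteString as B
import Data.Monoid.Instances.ByteString.UTF8 (ByteStringUTF8 (..), decode)

-- Hypothetical helper: decode a list of raw chunks, carrying any
-- incomplete trailing bytes over into the next chunk. Returns the decoded
-- pieces and whatever bytes remain undecoded after the last chunk.
decodeChunks :: [B.ByteString] -> ([ByteStringUTF8], B.ByteString)
decodeChunks = go B.empty
  where
    go leftover [] = ([], leftover)
    go leftover (chunk : chunks) =
      let (decoded, rest) = decode (leftover <> chunk)
          (more, final) = go rest chunks
       in (decoded : more, final)

main :: IO ()
main = do
  -- '²' is encoded as the bytes 0xC2 0xB2; here the two bytes
  -- arrive in different chunks.
  let chunks = [B.pack [0x45, 0x3D, 0x6D, 0x63, 0xC2], B.pack [0xB2]]
      (pieces, leftover) = decodeChunks chunks
  mapM_ print pieces
  print (B.length leftover)
```

A non-empty final leftover signals that the input ended in the middle of a character, which the caller can treat as truncated input.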