Safe Haskell | Safe |
---|---|
Language | Haskell2010 |
Since: 0.2.2.0
Synopsis
- decodeUtf8 :: forall xs c t y ys encs. (UnSnoc xs ~ (,) ys y, Superset "r-UTF8" y, encs ~ RemoveRs ys, AllEncodeInto "r-UTF8" encs) => Enc xs c ByteString -> Enc xs c Text
- encodeUtf8 :: forall xs c t y ys encs. (UnSnoc xs ~ (,) ys y, Superset "r-UTF8" y, encs ~ RemoveRs ys, AllEncodeInto "r-UTF8" encs) => Enc xs c Text -> Enc xs c ByteString
Documentation
>>>
:set -XScopedTypeVariables -XOverloadedStrings -XDataKinds -XFlexibleContexts -XTypeApplications
>>>
import Test.QuickCheck
>>>
import Test.QuickCheck.Instances.Text()
>>>
import Test.QuickCheck.Instances.ByteString()
>>>
import qualified Data.ByteString.Char8 as B8
>>>
import Data.Char
>>>
import Data.Either
>>>
import Data.TypedEncoding
>>>
import Data.TypedEncoding.Conv.Text
>>>
let emptyUTF8B = unsafeSetPayload () "" :: Enc '["r-UTF8"] () B.ByteString
>>>
:{
instance Arbitrary (Enc '["r-UTF8"] () B.ByteString) where arbitrary = fmap (fromRight emptyUTF8B) . flip suchThat isRight . fmap (encodeFAll @'["r-UTF8"] @(Either EncodeEx) @(). toEncoding ()) $ arbitrary instance Arbitrary (Enc '["r-UTF8"] () T.Text) where arbitrary = fmap (unsafeSetPayload ()) arbitrary instance Arbitrary (Enc '["r-ASCII"] () B.ByteString) where arbitrary = fmap (unsafeSetPayload ()) . flip suchThat (B8.all isAscii) $ arbitrary instance Arbitrary (Enc '["r-ASCII"] () T.Text) where arbitrary = fmap (unsafeSetPayload ()) . flip suchThat (T.all isAscii) $ arbitrary :}
decodeUtf8 :: forall xs c t y ys encs. (UnSnoc xs ~ (,) ys y, Superset "r-UTF8" y, encs ~ RemoveRs ys, AllEncodeInto "r-UTF8" encs) => Enc xs c ByteString -> Enc xs c Text Source #
With given constraints decodeUtf8
and encodeUtf8
can be used on subsets of "r-UTF8"
Note: For example, the ByteString
encoding of "xd800"
(11101101 10100000 10000000
ed a0 80
) is considered invalid UTF8 by the Text
library
To be consistent we make the same assumption of also restricting representable Unicode chars as in Unicode.D76.
>>>
TE.decodeUtf8 "\237\160\128"
"*** Exception: Cannot decode byte '\xed': Data.Text.Internal.Encoding.decodeUtf8: Invalid UTF-8 stream
The "xdfff" case (11101101 10111111 10111111
ed bf bf
):
>>> TE.decodeUtf8 "237191191"
"*** Exception: Cannot decode byte '\xed': Data.Text.Internal.Encoding.decodeUtf8: Invalid UTF-8 stream
>>>
displ . decodeUtf8 $ (unsafeSetPayload () "Hello" :: Enc '["r-ASCII"] () B.ByteString)
"Enc '[r-ASCII] () (Text Hello)"
"r-UTF8" is redundant:
>>>
displ . utf8Demote . decodeUtf8 $ (unsafeSetPayload () "Hello" :: Enc '["r-UTF8"] () B.ByteString)
"Enc '[] () (Text Hello)"
decodeUtf8
and encodeUtf8
form isomorphism
\x -> getPayload x == (getPayload . encodeUtf8 . decodeUtf8 @ '["r-UTF8"] @() $ x)
\x -> getPayload x == (getPayload . decodeUtf8 . encodeUtf8 @ '["r-UTF8"] @() $ x)
These nicely work as iso's for "r-ASCII" subset
\x -> getPayload x == (getPayload . encodeUtf8 . decodeUtf8 @ '["r-ASCII"] @() $ x)
\x -> getPayload x == (getPayload . decodeUtf8 . encodeUtf8 @ '["r-ASCII"] @() $ x)
Similarly to pack
this function makes unverified assumption
that the encoding stack xs
does invalidate UTF8 byte layout. This is safe for any "r-" encoding as well
as any of the "enc-" and "do-" encodings that can be currently found in this library.
Future versions of this method are likely to introduce constraints that guarantee better type safety.
See Data.TypedEncoding.Conv for more detailed discussion.
Since: 0.4.0.0
encodeUtf8 :: forall xs c t y ys encs. (UnSnoc xs ~ (,) ys y, Superset "r-UTF8" y, encs ~ RemoveRs ys, AllEncodeInto "r-UTF8" encs) => Enc xs c Text -> Enc xs c ByteString Source #
>>>
displ $ encodeUtf8 $ utf8Promote $ toEncoding () ("text" :: T.Text)
"Enc '[r-UTF8] () (ByteString text)"
See decodeUtf8
. Similar type safety concerns apply.
See Data.TypedEncoding.Conv for more detailed discussion.
Since: 0.4.0.0