typed-encoding: Type safe string transformations

[ bsd3, data, library, text ]

See README.md in the project's GitHub repository.



Versions 0.1.0.0, 0.2.0.0, 0.2.1.0, 0.2.2.0, 0.3.0.0, 0.3.0.1, 0.3.0.2, 0.4.0.0, 0.4.1.0, 0.4.2.0, 0.5.0.0, 0.5.1.0, 0.5.2.0, 0.5.2.1, 0.5.2.2, 0.5.2.3
Change log ChangeLog.md
Dependencies base (>=4.10 && <5), base64-bytestring (>=1.0 && <1.3), bytestring (>=0.10 && <0.13), symbols (>=0.3 && <0.3.1), text (>=1.2 && <3)
License BSD-3-Clause
Copyright 2020 Robert Peszek
Author Robert Peszek
Maintainer robpeszek@gmail.com
Category Data, Text
Home page https://github.com/rpeszek/typed-encoding#readme
Bug tracker https://github.com/rpeszek/typed-encoding/issues
Source repo head: git clone https://github.com/rpeszek/typed-encoding
Uploaded by rpeszek at 2023-10-09T20:40:39Z
Reverse Dependencies 1 direct, 0 indirect
Downloads 2549 total (41 in the last 30 days)
Rating (no votes yet) [estimated by Bayesian average]
Status Docs uploaded by user
Build status unknown [no reports yet]

Readme for typed-encoding-0.5.2.3

typed-encoding

Type level annotations, string transformations, and other goodies that make programming strings safer.

Motivation

I have recently spent a lot of time troubleshooting various Base64, quoted-printable, and UTF-8 encoding issues.
I decided to write a library that helps avoid issues like these.

This library lets you specify and work with types like

-- some data encoded in base 64
mydata :: Enc '["enc-B64"] c ByteString

-- some text (utf8) data encoded in base 64 
myData :: Enc '["enc-B64", "r-UTF8"] c ByteString

It lets you define precise string content annotations like:

ipaddr :: Enc '["r-IpV4"] c Text

and provides ways for

  • encoding
  • decoding
  • recreation (encoding validation)
  • type conversions
  • converting types to encoded strings
  • typesafe conversion of encoded strings to types
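The idea of tracking encodings in the type can be illustrated without the library. The following is a minimal sketch, not the library's implementation: `Tagged`, `plain`, `payload`, `encodeHex`, `decodeHex`, and the "enc-hex" annotation are all hypothetical names invented for this example, and a toy hex encoding stands in for Base64 so that only `base` is needed. It assumes input characters with code points below 256.

```haskell
{-# LANGUAGE DataKinds #-}
{-# LANGUAGE KindSignatures #-}
{-# LANGUAGE TypeOperators #-}

module Main where

import Data.Char (chr, digitToInt, intToDigit, ord)
import GHC.TypeLits (Symbol)

-- Hypothetical stand-in for an annotated string type:
-- the phantom list 'xs' records which encodings have been applied.
newtype Tagged (xs :: [Symbol]) a = Tagged a
  deriving (Eq, Show)

-- Wrap a value that carries no encodings yet.
plain :: a -> Tagged '[] a
plain = Tagged

-- Extract the raw payload, forgetting the annotations.
payload :: Tagged xs a -> a
payload (Tagged a) = a

-- Encoding pushes an annotation onto the type-level list.
-- Assumes every character has a code point below 256.
encodeHex :: Tagged xs String -> Tagged ("enc-hex" ': xs) String
encodeHex (Tagged s) = Tagged (concatMap toHex s)
  where
    toHex c = [intToDigit (ord c `div` 16), intToDigit (ord c `mod` 16)]

-- Decoding is only accepted when "enc-hex" is the outermost annotation,
-- and it pops exactly that annotation.
decodeHex :: Tagged ("enc-hex" ': xs) String -> Tagged xs String
decodeHex (Tagged s) = Tagged (go s)
  where
    go (a : b : rest) = chr (digitToInt a * 16 + digitToInt b) : go rest
    go _ = []

main :: IO ()
main = do
  let encoded = encodeHex (plain "Hi")
  putStrLn (payload encoded)             -- prints 4869
  putStrLn (payload (decodeHex encoded)) -- prints Hi
```

With this shape, `decodeHex (plain "Hi")` is rejected by the type checker, because the value is not annotated as hex-encoded.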

Partial and dangerous functions like decodeUtf8 are no longer dangerous, and conversions between ByteString and Text become fully reversible. Life is good!

... but this approach seems to be a bit more...

-- upper cased text encoded as base64
example :: Enc '["enc-B64", "do-UPPER"] () T.Text
example = encodeAll . toEncoding () $ "some text goes here"

It becomes a type directed, declarative approach to string transformations.

Transformations can be

  • used with parameters
  • applied or undone partially (if encoding is reversible)
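To make "type directed" concrete, here is a small sketch of how a type-level list of symbols could select the transformations to run. It is not the library's mechanism, only an illustration under assumptions: `Tagged`, `Step`, `ApplyAll`, and `encodeAllSketch` are hypothetical names, and "do-reverse" is an annotation made up for this example. As in the example above, the head of the list is treated as the outermost (last applied) transformation.

```haskell
{-# LANGUAGE DataKinds #-}
{-# LANGUAGE FlexibleInstances #-}
{-# LANGUAGE KindSignatures #-}
{-# LANGUAGE ScopedTypeVariables #-}
{-# LANGUAGE TypeOperators #-}

module Main where

import Data.Char (toUpper)
import Data.Proxy (Proxy (..))
import GHC.TypeLits (Symbol)

-- Hypothetical annotated wrapper; 'xs' lists the applied transformations.
newtype Tagged (xs :: [Symbol]) a = Tagged a
  deriving (Eq, Show)

-- One transformation per annotation symbol.
class Step (x :: Symbol) where
  step :: Proxy x -> String -> String

instance Step "do-UPPER" where
  step _ = map toUpper

-- "do-reverse" is invented for this sketch.
instance Step "do-reverse" where
  step _ = reverse

-- Walk the type-level list and compose the matching transformations.
class ApplyAll (xs :: [Symbol]) where
  applyAll :: Proxy xs -> String -> String

instance ApplyAll '[] where
  applyAll _ = id

-- The tail is applied first, so the head ends up outermost.
instance (Step x, ApplyAll xs) => ApplyAll (x ': xs) where
  applyAll _ = step (Proxy :: Proxy x) . applyAll (Proxy :: Proxy xs)

-- The result type alone decides what happens to the input.
encodeAllSketch :: forall xs. ApplyAll xs => String -> Tagged xs String
encodeAllSketch = Tagged . applyAll (Proxy :: Proxy xs)

example :: Tagged '["do-reverse", "do-UPPER"] String
example = encodeAllSketch "some text"

main :: IO ()
main = print example -- prints Tagged "TXET EMOS"
```

Changing only the type signature of `example` changes the transformations that run, which is the declarative flavor described above.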

One of the more interesting uses of this library is the set of encoding restrictions it includes.
An example is the (arbitrary) bounded alpha-numeric ("r-ban") restriction.

-- allow only properly formatted phone numbers

type PhoneSymbol = "r-ban:999-999-9999"
phone :: Enc '[PhoneSymbol] () T.Text
phone = ... 
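A bounded pattern check of this kind can be sketched as follows. This is only an illustration of the idea, not the library's validation code: `Restricted`, `matchesBound`, and `mkRestricted` are hypothetical names, the pattern is written without the "r-ban:" prefix, and the assumed semantics are that each alpha-numeric pattern character is an upper bound for the input character at the same position, while any other pattern character must match exactly.

```haskell
{-# LANGUAGE DataKinds #-}
{-# LANGUAGE KindSignatures #-}
{-# LANGUAGE ScopedTypeVariables #-}

module Main where

import Data.Char (isAlphaNum)
import Data.Proxy (Proxy (..))
import GHC.TypeLits (KnownSymbol, Symbol, symbolVal)

-- Hypothetical wrapper: a string known to fit the type-level pattern.
newtype Restricted (pat :: Symbol) = Restricted String
  deriving (Eq, Show)

-- Assumed semantics: same length; alpha-numeric pattern characters bound
-- the input character from above; delimiters must match exactly.
matchesBound :: String -> String -> Bool
matchesBound bound s = length bound == length s && and (zipWith ok bound s)
  where
    ok p c
      | isAlphaNum p = isAlphaNum c && c <= p
      | otherwise = c == p

-- Smart constructor: reads the pattern from the type and validates the input.
mkRestricted :: forall pat. KnownSymbol pat => String -> Either String (Restricted pat)
mkRestricted s
  | matchesBound bound s = Right (Restricted s)
  | otherwise = Left ("input does not fit pattern " ++ bound)
  where
    bound = symbolVal (Proxy :: Proxy pat)

type Phone = Restricted "999-999-9999"

main :: IO ()
main = do
  print (mkRestricted "212-555-0199" :: Either String Phone) -- a Right value
  print (mkRestricted "212-555-01AB" :: Either String Phone) -- a Left value
```

Because the pattern lives in the type, two differently restricted strings cannot be mixed up by accident, even though both are plain strings at runtime.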

The author often uses typed-encoding with Servant (HttpApiData instances are not included), e.g.:

type LookupByPhone = 
  "customer"
  :> "byphone"
  :> Capture "phone" (Enc '[PhoneSymbol] () T.Text)
  :> Get '[JSON] ([Customer])

or to get type safety over text documents that use Unix vs. Windows line breaks!

Goals and limitations

The main goal is to provide improved type safety for programs that use string encodings and transformations, not to provide type safety for the encoding implementations themselves. Encoding and string manipulation libraries are typically well established and tested; type safety is really needed at the usage site, not at the implementation site.

This library's approach is to fight issues with (value-level) strings using (type-level) strings. Using Symbols effectively forces us to play the orphan instances game.
One of the long-term goals is for this library to provide combinator alternatives to typeclass polymorphism, so that orphan instances are more of a convenience than a necessity.

Examples

Here are some code examples:

Hackage

https://hackage.haskell.org/package/typed-encoding

Other encoding packages

My approach will be to write specific encodings (e.g. HTTP) or wrap encodings from other packages using separate "bridge" projects.

Currently typed-encoding depends on

  • base64-bytestring, because it was my driving example. This is likely to move out to a separate bridge project at some point.

Bridge work:

Tested with

  • stack (1.9.3) lts-14.27 (ghc-8.6.5)
  • stack (2.5.1) lts-16.27 (ghc-8.8.4)
  • needs ghc >= 8.2.2, base >=4.10 for GHC.TypeLits support

Known issues