winery: A compact, well-typed seralisation format for Haskell values

[ bsd3, codec, data, library, parsing, program, serialization ] [ Propose Tags ] [ Report a vulnerability ]
Versions [RSS] 0, 0.1, 0.1.1, 0.1.2, 0.2, 0.2.1, 0.3, 0.3.1, 1, 1.0.1, 1.1, 1.1.1, 1.1.2, 1.1.3, 1.2, 1.3, 1.3.1, 1.3.2, 1.4, 1.5
Change log ChangeLog.md
Dependencies aeson, barbies (>=2.0 && <2.1), barbies-th (>=0.1 && <0.2), base (>=4.7 && <5), bytestring, containers, cpu, fast-builder, hashable, HUnit, megaparsec (>=7.0.0), mtl (>=2.3), prettyprinter (>=1.7 && <1.8), prettyprinter-ansi-terminal, QuickCheck, scientific, semigroups, text, time, transformers, unordered-containers, vector, winery [details]
Tested with ghc ==8.4.4, ghc ==8.6.4, ghc ==8.8.3, ghc ==8.10.2
License BSD-3-Clause
Copyright Copyright (c) 2020 Fumiaki Kinoshita
Author Fumiaki Kinoshita
Maintainer fumiexcel@gmail.com
Category Data, Codec, Parsing, Serialization
Home page https://github.com/fumieval/winery#readme
Bug tracker https://github.com/fumieval/winery/issues
Source repo head: git clone https://github.com/fumieval/winery
Uploaded by FumiakiKinoshita at 2024-11-23T02:32:06Z
Distributions
Reverse Dependencies 3 direct, 0 indirect [details]
Executables winery
Downloads 8700 total (71 in the last 30 days)
Rating (no votes yet) [estimated by Bayesian average]
Your Rating
  • λ
  • λ
  • λ
Status Docs available [build log]
Last success reported on 2024-11-23 [all 1 reports]

Readme for winery-1.5

[back to package description]

winery

logo Haskell CI Hackage Discord

winery is a serialisation library focusing on performance, compactness and compatibility. The primary feature is that metadata (types, field names, etc) are packed into one schema.

A number of formats, like JSON and CBOR, attach metadata for each value:

[{"id": 0, "name": "Alice"}, {"id": 1, "name": "Bob"}]

In contrast, winery stores them separately, eliminating redundancy while guaranteeing well-typedness:

0402 0402 0269 6410 046e 616d 6514  [{ id :: Integer, name :: Text }]
0200 0541 6c69 6365 0103 426f 62    [(0, "Alice"), (1, "Bob")]

Unlike other libraries that don't preserve metadata (e.g. binary, cereal, store) at all, winery also allows readers to decode values regardless of the current implementation.

Interface

The interface is simple; serialise encodes a value with its schema, and deserialise decodes a ByteString using the schema in it.

class Serialise a where
  schema :: Serialise a => proxy a -> Schema

serialise :: Serialise a => a -> B.ByteString
deserialise :: Serialise a => B.ByteString -> Either WineryException a

It's also possible to serialise schemata and data separately. serialiseSchema encodes a schema and its version number into a ByteString, and serialiseOnly serialises a value without a schema.

serialiseSchema :: Schema -> B.ByteString
serialiseOnly :: Serialise a => a -> B.ByteString

In order to decode data generated this way, pass the result of deserialiseSchema to getDecoder. Finally run evalDecoder to deserialise them.

deserialiseSchema :: B.ByteString -> Either WineryException Schema
getDecoder :: Serialise a => Schema -> Either WineryException (Decoder a)
evalDecoder :: Decoder a -> B.ByteString -> a

Deriving an instance

The recommended way to create an instance of Serialise is to use DerivingVia.

  deriving Generic
  deriving Serialise via WineryRecord Foo

for single-constructor records, or just

  deriving Generic
  deriving Serialise via WineryVariant Foo

for any ADT. The former explicitly describes field names in the schema, and the latter does constructor names.

If you want to customise one of the methods, you can use bundleVia to supply the rest of definitions.

instance Serialise Foo where
  bundleSerialise = bundleVia WineryRecord
  extractor = buildExtractor $ Foo
    <$> extractField "foo"
    <*> extractField "bar"

Backward compatibility

If the representation is not the same as the current version (i.e. the schema is different), the data cannot be decoded directly. This is where extractors come in.

Extractor parses a schema and returns a function which gives a value back from a Term.

You can build an extractor using combinators such as extractField.

If you want to customise an extractor for a variant, the pair of gvariantExtractors and buildVariantExtractors is handy.

gvariantExtractors :: (GSerialiseVariant (Rep a), Generic a) => HM.HashMap T.Text (Extractor a)
buildVariantExtractor :: (Generic a, Typeable a) => HM.HashMap T.Text (Extractor a) -> Extractor a

Extractor is Alternative, meaning that multiple extractors (such as a default generic implementation and fallback plans) can be combined into one.

Altering an instance for a record type is a little bit tricky. HKD can represent a record where each field is Subextractor instead of the orignal type. The barbies-th allows us to derive it from a plain declaration.

import Barbies.Bare
import Barbies.TH

declareBareB [d|
  data HRecB = HRec
    { baz :: !Int
    , qux :: !Text
    }
    |]
type HRec = HRecB Bare Identity

Obtain a record of extractors using bextractors :: forall b. (AllB Serialise b, ...) => b Subextractor, update it as necessary, then build an extractor for an entire record by buildRecordExtractor.

instance Serialise HRec where
  bundleSerialise = bundleVia WineryRecord
  extractor = fmap bstrip $ buildRecordExtractor bextractors
    { qux = extractField "qux" <|> extractField "oldQux" }

More generic instance (for covered types) can be defined as below:

instance (Typeable h, AllBF Serialise h (HRecB Covered)) => Serialise (HRecB Covered h) where
  bundleSerialise = bundleVia Barbie
  extractor = buildRecordExtractorF bextractorsF
    { qux = Compose $ extractField "qux" <|> extractField "oldQux" }

Pretty-printing

Term can be deserialised from any winery data. It can be pretty-printed using the Pretty instance:

{ bar: "hello"
, baz: 3.141592653589793
, foo: Just 42
}

You can use the winery command-line tool to inspect values.

$ winery '.[:10] | .first_name .last_name' benchmarks/data.winery
Shane Plett
Mata Snead
Levon Sammes
Irina Gourlay
Brooks Titlow
Antons Culleton
Regine Emerton
Starlin Laying
Orv Kempshall
Elizabeth Joseff
Cathee Eberz

At the moment, the following queries are supported:

  • . return itself
  • .[] enumerate all the elements in a list
  • .[i] get the i-th element
  • .[i:j] enumerate i-th to j-th items
  • .N n-th element of a product
  • .foo Get a field named foo
  • F | G compose queries (left to right)

Performance

A useful library should also be fast. Benchmarking encoding/decoding of the following datatype.

data Gender = Male | Female

data TestRec = TestRec
  { id_ :: !Int
  , first_name :: !Text
  , last_name :: !Text
  , email :: !Text
  , gender :: !Gender
  , num :: !Int
  , latitude :: !Double
  , longitude :: !Double
  }

Here's the result:

encode 1 encode 1000 decode length
winery 0.28 μs 0.26 ms 0.81 ms 58662
cereal 0.82 μs 0.78 ms 0.90 ms 91709
binary 1.7 μs 1.7 ms 2.0 ms 125709
serialise 0.61 μs 0.50 ms 1.4 ms 65437
store 54 ns 56 μs 0.13 ms 126410
aeson 9.9 μs 9.7 ms 17 ms 160558