biohazard-0.6.5: bioinformatics support library

Safe Haskell: None





class Avro a where

Support for Avro. Currently, we can generate schemas for certain Haskell values, serialize them to the binary and JSON representations, and write container files using the null codec. The C implementation accepts some, but not all, of these containers; it is unclear whether the fault lies with the C implementation, though.

Meanwhile, serialization works for nested sums-of-products, as long as the product uses record syntax and the top level is a plain record. The obvious primitives are supported.

This is the class of types we can embed into the Avro infrastructure. Right now, we can derive a schema, encode to the Avro binary format, and encode to the Avro JSON encoding.


toSchema :: a -> MkSchema Value

Produces the schema for this type. Schemas are represented as JSON values. The monad is used to keep a table of already defined types, so the schema can refer to them by name. (The concrete argument only serves to specify the type; its value is never used.)
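The role of the memo table can be illustrated with a minimal, self-contained sketch. It is not the library's code: `Schema`, `Mk`, `define`, `pointSchema` and `segmentSchema` are hypothetical names, the schema is a small data type instead of a JSON `Value`, and the table is a plain list of names. The first use of a named type emits the full definition; later uses emit only a reference by name.

```haskell
-- Hypothetical miniature schema type; the library uses JSON values instead.
data Schema
  = Prim String                      -- a primitive such as "double"
  | Ref String                       -- reference to an already defined type
  | Record String [(String, Schema)] -- full definition of a named record
  deriving (Eq, Show)

-- A tiny state monad carrying the names defined so far (the "memo table").
newtype Mk a = Mk { runMk :: [String] -> (a, [String]) }

instance Functor Mk where
  fmap f (Mk g) = Mk $ \s -> let (a, s') = g s in (f a, s')

instance Applicative Mk where
  pure a = Mk $ \s -> (a, s)
  Mk f <*> Mk g = Mk $ \s ->
    let (h, s')  = f s
        (a, s'') = g s'
    in  (h a, s'')

instance Monad Mk where
  Mk g >>= k = Mk $ \s -> let (a, s') = g s in runMk (k a) s'

-- Define a named record once; afterwards only refer to it by name.
define :: String -> Mk [(String, Schema)] -> Mk Schema
define name body = Mk $ \seen ->
  if name `elem` seen
    then (Ref name, seen)
    else let (fields, seen') = runMk body (name : seen)
         in  (Record name fields, seen')

pointSchema :: Mk Schema
pointSchema = define "Point" (pure [("x", Prim "double"), ("y", Prim "double")])

-- "Point" occurs twice; only the first occurrence is spelled out.
segmentSchema :: Mk Schema
segmentSchema = define "Segment" $ do
  a <- pointSchema
  b <- pointSchema
  pure [("from", a), ("to", b)]
```

Running `fst (runMk segmentSchema [])` gives a `Segment` record whose `from` field holds the full `Point` definition and whose `to` field is just `Ref "Point"`.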

toBin :: a -> Builder

Serializes a value to the binary representation. The schema is implied; serialization to related schemas is not supported.

fromBin :: Get a

Deserializes a value from the binary representation. Right now, no attempt at schema matching is made; the schema must match the expected one exactly.

toAvron :: a -> Value

Serializes a value to the JSON representation. Note that even the JSON format needs a schema for successful deserialization, and here we support only the one implied schema.


Avro Bool
Avro Double
Avro Float
Avro Int
Avro Int64
Avro Word8
Avro ()

The Avro "null" type is represented as the empty tuple.

Avro ByteString
Avro Text
Avro GenoCallSite
Avro GenoCallBlock
Avro a => Avro [a]

A list becomes an Avro array. The chunked encoding for lists may come in handy. How to select the chunk size is not obvious, though.
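For orientation, the Avro binary format encodes an array as a series of blocks: each block is an item count (a long) followed by that many items, and a block with count zero ends the array. The following is a minimal sketch of that layout over plain lists of bytes, not the library's implementation. The names `encodeLong` and `encodeArray` and the explicit chunk size parameter are assumptions made for illustration.

```haskell
import Data.Bits (shiftL, shiftR, xor, (.&.), (.|.))
import Data.Int (Int64)
import Data.Word (Word8, Word64)

-- Avro "long": zig-zag coding followed by the base 128 encoding.
encodeLong :: Int64 -> [Word8]
encodeLong n = go (fromIntegral ((n `shiftL` 1) `xor` (n `shiftR` 63)))
  where
    go :: Word64 -> [Word8]
    go w
      | w < 0x80  = [fromIntegral w]
      | otherwise = (fromIntegral (w .&. 0x7F) .|. 0x80) : go (w `shiftR` 7)

-- Encode a list as blocks of at most 'chunkSize' items each,
-- terminated by a block with count zero.
encodeArray :: Int -> (a -> [Word8]) -> [a] -> [Word8]
encodeArray chunkSize encodeItem = go
  where
    n = max 1 chunkSize            -- guard against non-positive chunk sizes
    go [] = encodeLong 0           -- end-of-array marker
    go xs =
      let (block, rest) = splitAt n xs
      in  encodeLong (fromIntegral (length block))
            ++ concatMap encodeItem block
            ++ go rest
```

With a chunk size of 2, `encodeArray 2 encodeLong [1, 2, 3]` produces the bytes `[4,2,4,2,6,0]`: a block of two items, a block of one item, and the terminating zero count. A larger chunk size means fewer count fields, but the encoder has to look further ahead before it can emit a block.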

Avro a => Avro (Vector a)

A generic vector becomes an Avro array.

(Avro a, Unbox a) => Avro (Vector a)

An unboxed vector becomes an Avro array.

Avro a => Avro (HashMap Text a)

A map from Text becomes an Avro map.

newtype MkSchema a

Making schemas requires a memo table of type definitions.




mkSchema :: (a -> HashMap Text Value -> Value) -> HashMap Text Value -> Value

cast :: (Storable a, Storable b) => a -> b

zig :: (Storable a, Bits a) => a -> a

Implements zig-zag coding as in Protocol Buffers and Avro.

zag :: (Storable a, Bits a, Num a) => a -> a

Reverses zig-zag coding as in Protocol Buffers and Avro.
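As an illustration of the coding itself, here is a minimal sketch fixed to 64-bit values. It is not the library's `zig` and `zag`, which keep the argument type and work for any suitable width; `zigZag64` and `unZigZag64` are hypothetical names. Zig-zag coding maps signed numbers to unsigned ones so that values of small magnitude stay small: 0, -1, 1, -2, 2 become 0, 1, 2, 3, 4.

```haskell
import Data.Bits (shiftL, shiftR, xor, (.&.))
import Data.Int (Int64)
import Data.Word (Word64)

-- Interleave negative and non-negative numbers. The arithmetic right
-- shift yields all ones for negative input and all zeros otherwise.
zigZag64 :: Int64 -> Word64
zigZag64 n = fromIntegral ((n `shiftL` 1) `xor` (n `shiftR` 63))

-- Inverse mapping: the lowest bit carries the sign, the rest the magnitude.
unZigZag64 :: Word64 -> Int64
unZigZag64 w = fromIntegral (w `shiftR` 1) `xor` negate (fromIntegral (w .&. 1))
```

The point of the mapping is that a subsequent variable length encoding needs only one byte for small negative numbers, which would otherwise have all their high bits set.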

encodeWordBase128 :: (Integral a, Bits a) => a -> Builder

Encodes a word of any size using a variable length "base 128" encoding.
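The "base 128" scheme can be sketched as follows, assuming the usual layout shared by Protocol Buffers and Avro: seven payload bits per byte, least significant group first, with the high bit set on every byte except the last. This sketch produces a plain list of bytes rather than a `Builder` and is fixed to `Word64`; `encodeBase128` is a hypothetical name, not the library function.

```haskell
import Data.Bits (shiftR, (.&.), (.|.))
import Data.Word (Word8, Word64)

-- Emit seven bits at a time, low bits first; the high bit of each byte
-- signals that more bytes follow.
encodeBase128 :: Word64 -> [Word8]
encodeBase128 w
  | w < 0x80  = [fromIntegral w]
  | otherwise = (fromIntegral (w .&. 0x7F) .|. 0x80) : encodeBase128 (w `shiftR` 7)
```

For example, `encodeBase128 300` yields `[0xAC, 0x02]`, and any value below 128 fits in a single byte.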

encodeIntBase128 :: (Integral a, Bits a, Storable a) => a -> Builder

Encodes an int of any size by combining the zig-zag coding with the base 128 encoding.

decodeIntBase128 :: (Integral a, Bits a, Storable a) => Get a

Decodes an int of any size by combining the zig-zag decoding with the base 128 decoding.
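How the two steps combine can be sketched as a round trip over plain byte lists. This is an illustration under stated assumptions, not the library's code: it is fixed to `Int64`, uses lists instead of `Builder` and `Get`, and the names `encodeInt` and `decodeInt` are hypothetical. The decoder returns the value together with the unconsumed input, or `Nothing` if the input is truncated or too long for 64 bits.

```haskell
import Data.Bits (shiftL, shiftR, testBit, xor, (.&.), (.|.))
import Data.Int (Int64)
import Data.Word (Word8, Word64)

-- Zig-zag first, then base 128.
encodeInt :: Int64 -> [Word8]
encodeInt n = go (fromIntegral ((n `shiftL` 1) `xor` (n `shiftR` 63)))
  where
    go :: Word64 -> [Word8]
    go w
      | w < 0x80  = [fromIntegral w]
      | otherwise = (fromIntegral (w .&. 0x7F) .|. 0x80) : go (w `shiftR` 7)

-- Base 128 first, then undo the zig-zag coding.
decodeInt :: [Word8] -> Maybe (Int64, [Word8])
decodeInt = go 0 0
  where
    go :: Int -> Word64 -> [Word8] -> Maybe (Int64, [Word8])
    go _ _ [] = Nothing                       -- input ended mid-value
    go sh acc (b:bs)
      | sh > 63     = Nothing                 -- more than ten bytes
      | testBit b 7 = go (sh + 7) acc' bs     -- continuation bit set
      | otherwise   = Just (unZigZag acc', bs)
      where
        acc' = acc .|. (fromIntegral (b .&. 0x7F) `shiftL` sh)

    unZigZag :: Word64 -> Int64
    unZigZag w = fromIntegral (w `shiftR` 1) `xor` negate (fromIntegral (w .&. 1))
```

For instance, `encodeInt (-1)` is the single byte `[1]`, and `decodeInt (encodeInt n ++ rest)` gives back `Just (n, rest)`.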

Some(!) complex types.

type AvroMeta = HashMap Text ByteString

Avro metadata is currently left unprocessed. It contains the codec, the schema, and a version number.

readAvroContainer :: (Monad m, Avro a) => Enumeratee' AvroMeta ByteString [a] m r

Decodes an Avro container file into a list. Metadata is passed on. Note that if this blows up, it is usually because it was applied at the wrong type. Be sure to correctly count the brackets...

XXX Possible codecs: null, zlib, snappy, lzma; all missing.

XXX Should check schema on reading.

findSchema :: Text -> AvroMeta -> Value

Finds a named schema from the metadata of an Avro container.