Data.BitSyntax

BitSyntax-0.2: A module to aid in the (de)serialisation of binary data

Data.BitSyntax

Contents

Building bit structures
Breaking up bit structures
Utility functions

Description

This module contains fuctions and templates for building up and breaking down packed bit structures. It's something like Erlang's bit-syntax (or, actually, more like Python's struct module).

This code uses Data.ByteString which is included in GHC 6.5 and you can get it for 6.4 at http://www.cse.unsw.edu.au/~dons/fps.html

Synopsis

data BitBlock

= U8 Int

| U16 Int

| U32 Int

| NullTerminated String

| RawString String

| RawByteString ByteString

| PackBits [(Int, Int)]

makeBits :: [BitBlock] -> ByteString

data ReadType

| Skip Integer

| PackedBits [Integer]

| Rest

bitSyn :: [ReadType] -> Q Exp

decodeU8 :: ByteString -> Word8

decodeU16 :: ByteString -> Word16

decodeU32 :: ByteString -> Word32

decodeBits :: [Integer] -> ByteString -> [Integer]

Building bit structures

The core function here is makeBits, which is a perfectly normal function. Here's an example which makes a SOCKS4a request header: makeBits [U8 4, U8 1, U16 80, U32 10, NullTerminated "username", NullTerminated "www.haskell.org"]

data BitBlock

Constructors

U8 Int	Unsigned 8-bit int
U16 Int	Unsigned 16-bit int
U32 Int	Unsigned 32-bit int
NullTerminated String	Appends the string with a trailing NUL byte
RawString String	Appends the string without any terminator
RawByteString ByteString	Appends a ByteString
PackBits [(Int, Int)]	Packs a series of bit fields together. The argument is a list of pairs where the first element is the size (in bits) and the second is the value. The sum of the sizes for a given PackBits must be a multiple of 8

Instances

Show BitBlock

makeBits :: [BitBlock] -> ByteString

Make a binary string from the list of elements given

Breaking up bit structures

The main function for this is bitSyn, which is a template function and so you'll need to run with -fth to enable template haskell http://www.haskell.org/th/. This function expands at the place where its used and includes references to functions by name, so those references need to resolve at the point of use. To make sure that happens you'll need:

 import BitSyntax
 import qualified Data.ByteString as BS

To expand the function you use the splice command: $(bitSyn [...])

The expanded function has type ByteString -> (...) where the elements of the tuple depend of the argument to bitSyn (that's why it has to be a template function).

Heres an example, translated from the Erlang manual, which parses an IP header:

 decodeOptions bs ([_, hlen], _, _, _, _, _, _, _, _, _) =
   if hlen > 5
     then BS.splitAt (fromIntegral ((hlen - 5) * 4)) bs
     else (BS.empty, bs)

 ipDecode = $(bitSyn [PackedBits [4, 4], Unsigned 1, Unsigned 2, Unsigned 2,
                      PackedBits [3, 13], Unsigned 1, Unsigned 1, Unsigned 2,
                      Fixed 4, Fixed 4, Context "decodeOptions", Rest])

 ipPacket = BS.pack [0x45, 0, 0, 0x34, 0xd8, 0xd2, 0x40, 0, 0x40, 0x06,
                     0xa0, 0xca, 0xac, 0x12, 0x68, 0x4d, 0xac, 0x18,
                     0x00, 0xaf]

This function has several weaknesses compared to the Erlang version: The elements of the bit structure are not named in place, instead you have to do a pattern match on the resulting tuple and match up the indexes. The type system helps in this, but it's still not quite as nice.

The need to have the correct functions in scope (as pointed out above) is a problem.

data ReadType

Constructors

Unsigned Integer	An unsigned number of some number of bytes. Valid arguments are 1, 2 and 4
Variable String	A variable length element to be decoded by a custom function. The function's name is given as the single argument and should have type `ByteString -> (v, ByteString)`
Skip Integer	Skip some number of bytes
Fixed Integer	A fixed size field, the result of which is a ByteString of that length.
Ignore ReadType	Decode a value and ignore it (the result will not be part of the returned tuple)
Context String	Like variable, but the decoding function is passed the entire result tuple so far. Thus the function whose name passed has type `ByteString -> (...) -> (v, ByteString)`
LengthPrefixed	Takes the most recent element of the result tuple and interprets it as the length of this field. Results in a ByteString
PackedBits [Integer]	Decode a series of bit fields, results in a list of Integers. Each element of the argument is the length of the bit field. The sums of the lengths must be a multiple of 8
Rest	Results in a ByteString containing the undecoded bytes so far. Generally used at the end to return the trailing body of a structure, it can actually be used at any point in the decoding to return the trailing part at that point.

bitSyn :: [ReadType] -> Q Exp

Utility functions

These are exposed because bitSyn is a template function and so functions referred to by it have to be in scope at the location where bitSyn is used.

decodeU8 :: ByteString -> Word8

decodeU16 :: ByteString -> Word16

decodeU32 :: ByteString -> Word32

decodeBits :: [Integer] -> ByteString -> [Integer]

Produced by Haddock version 0.8