Safe Haskell | None |
---|---|
Language | Haskell2010 |
ArithEncode is a library that provides tools for defining arithmetic encodings for arbitrary datatypes. The library is designed so that multiple encoding schemes can be defined for a given datatype, and a given encoding need not encode all possible instances of the datatype.
An Encoding
is an object which is passed as the first argument
to seven different functions. The primary function of an
Encoding
is manifest in the encode
and decode
functions,
which define an isomorphism between the datatype and the natural
numbers (or a bounded set thereof), represented using Integer
s.
The encode
and decode
functions have the following properties:
decode enc (encode enc v) == v
for all valuesv
in the domainencode enc v == encode enc w
only ifw == v
decode enc n == decode enc m
only ifn == m
The inDomain
function indicates whether or not a given value is
in the domain of the encoding. Passing a value v
where inDomain
enc v == False
into any other function may result in an
IllegalArgument
exception. (For performance reasons, encodings
are not strictly required to throw IllegalArgument
, but the
result should not be considered valid if they do not throw the
exception).
This library provides a large collection of combinators for
constructing more complex Encoding
s out of simpler ones. The
provided combinators should be appropriate for constructing
Encoding
s for most datatypes.
As an example, the following definition creates an Encoding
for
the Tree Integer
type:
tree :: Encoding (Tree Integer) tree = let ... nodeEncoding nodeenc = wrap unmakeNode makeNode (pair interval (seq nodeenc)) in recursive nodeEncoding
In this example, the makeNode
and unmakeNode
functios are
simply "glue"; their definitions are
makeNode (label, children) = Node { rootLabel = label, subForest = children } unmakeNode Node { rootLabel = label, subForest = children } = Just (label, children)
The resulting Encoding
maps any Tree Integer
to a unique
Integer
value.
Encoding
s have a number of practical uses. First, all
Encoding
s in this library satisfy a completeness property,
which guarantees that they map each value to a finite natural
number (or in the case of constructions on Encoding
s, they
preserve completeness). Hence, they can be used as an enumeration
procedure for their underlying datatype.
Second, as Encoding
s define an isomorphism to the natural
numbers, they provide an efficient binary encode/decode procedure
in theory. In practice, the techniques used to guarantee the
completeness property may result in long encodings of some types
(particularly sequences). Also, knowledge of the distribution of
the domain is necessary in order to achieve the most succinct
possible encoding.