Use this library to get an efficient, optimal, type-safe diff
and
patch
function for your datatypes of choice by defining a simple GADT and
some class instances. The process is explained in the documentation of the
Family
type class.
The result of diff
is an optimal EditScript
that contains the operations
that need to be performed to get from the source value to the destination
value. The edit script can be used by patch
, inspected with show
and used
for all kinds of interesting stuff you come up with.
The library has been extracted from Eelco Lempsink's Master's Thesis
Generic type-safe diff and patch for families of datatypes (available online:
http://eelco.lempsink.nl/thesis.pdf). See Appendix C for a large example.
Note that some types and functions have been renamed for the benefit of the API
(e.g., diff
to diffL
, diff1
to diff
, Diff
to EditScriptL
).
- diff :: (Type f x, Type f y) => x -> y -> EditScript f x y
- patch :: EditScript f x y -> x -> y
- diffL :: (Family f, List f txs, List f tys) => txs -> tys -> EditScriptL f txs tys
- patchL :: forall f txs tys. EditScriptL f txs tys -> txs -> tys
- compress :: Family f => EditScriptL f txs tys -> EditScriptL f txs tys
- data EditScriptL where
- Ins :: (Type f t, List f ts, List f tys) => f t ts -> EditScriptL f txs (Append ts tys) -> EditScriptL f txs (Cons t tys)
- Del :: (Type f t, List f ts, List f txs) => f t ts -> EditScriptL f (Append ts txs) tys -> EditScriptL f (Cons t txs) tys
- Cpy :: (Type f t, List f ts, List f txs, List f tys) => f t ts -> EditScriptL f (Append ts txs) (Append ts tys) -> EditScriptL f (Cons t txs) (Cons t tys)
- CpyTree :: (Type f t, List f txs, List f tys) => EditScriptL f txs tys -> EditScriptL f (Cons t txs) (Cons t tys)
- End :: EditScriptL f Nil Nil
- type EditScript f x y = EditScriptL f (Cons x Nil) (Cons y Nil)
- class Family f where
- class Family f => Type f t where
- constructors :: [Con f t]
- data a :=: b where
- data Con where
- data Nil = CNil
- data Cons x xs = CCons x xs
Diffing and patchLing
diff :: (Type f x, Type f y) => x -> y -> EditScript f x ySource
Calculate the difference between two values in the form of an edit script.
See the documentation for Family
for how to make your own data types work
with this function.
patch :: EditScript f x y -> x -> ySource
Apply the edit script to value.
For multiple datatypes
diffL :: (Family f, List f txs, List f tys) => txs -> tys -> EditScriptL f txs tysSource
Underlying implementation of diff
, works with (heterogeneous) lists of
values.
patchL :: forall f txs tys. EditScriptL f txs tys -> txs -> tysSource
Underlying implementation of patch
, works with (heterogeneous) lists of
values.
Patch compression
compress :: Family f => EditScriptL f txs tys -> EditScriptL f txs tysSource
Patch datatype
data EditScriptL whereSource
The EditScriptL
datatype captures the result of diffL
and can be used as the input
to patchL
(and compress
).
The diffL
function use a depth-first preorder traversal to traverse the
expression trees. The edit script it outputs contains the operation that must
be applied to the constructor at that point: either keeping it (Cpy
),
removing it (Del
, which does not remove the complete subtree, but contracts
the edge of the removed node) or inserting a new constructor (Ins
, which can
only be done if the available subtrees at that point correspond to the types
the constructor expects). (The CpyTree
is only used when running compress
over an existing edit script.)
For more information about this datatype, you're advised to look at Eelco Lempsink's thesis at http://eelco.lempsink.nl/thesis.pdf.
Ins :: (Type f t, List f ts, List f tys) => f t ts -> EditScriptL f txs (Append ts tys) -> EditScriptL f txs (Cons t tys) | |
Del :: (Type f t, List f ts, List f txs) => f t ts -> EditScriptL f (Append ts txs) tys -> EditScriptL f (Cons t txs) tys | |
Cpy :: (Type f t, List f ts, List f txs, List f tys) => f t ts -> EditScriptL f (Append ts txs) (Append ts tys) -> EditScriptL f (Cons t txs) (Cons t tys) | |
CpyTree :: (Type f t, List f txs, List f tys) => EditScriptL f txs tys -> EditScriptL f (Cons t txs) (Cons t tys) | |
End :: EditScriptL f Nil Nil |
Show (EditScriptL f txs tys) |
type EditScript f x y = EditScriptL f (Cons x Nil) (Cons y Nil)Source
Edit script type for two single values.
Type classes
To use diff
and patch
on your datatypes, you must create an instance of
Family
.
There are four steps to set up everything for diff
and patch
.
- Define your datatypes. (Presumable, you already have done this.)
- Create a family datatype, grouping your datatypes together.
- Make the family datatype an instance of
Family
- Create
Type
instances for each member of the family.
Steps 1-3 are explained below, step 4 is explained in the documentation for
Type
.
As a running example, we create a simple family of datatypes (step 1):
data Expr = Min Expr Term data Term = Parens Expr | Number Int
The second step is creating the family datatype. Each constructor in the datatypes above gets a constructor in a family GADT:
data ExprTermFamily :: * -> * -> * where Min' :: ExprTermFamily Expr (Cons Expr (Cons Term Nil)) Parens' :: ExprTermFamily Term (Cons Expr Nil) Number' :: ExprTermFamily Term (Cons Int Nil) Int' :: Int -> ExprTermFamily Int Nil
The first type argument of the datatype must be the resulting type of the
constructor. The second argument captures the types of the arguments the
constructor expects. Note how to use Cons
and Nil
to create type level
lists.
The Int'
constructor is special, in the sense that it captures the Int
type
abstractly (listing all Int'
s constructors would be quite impossible).
Caveat emptor: polymorphic datatypes (like lists) are ill-supported and require family constructors for each instance. It might require another master thesis project to solve this.
Step 3 is to create the instance for the Family
class. For each function you
will have to implement four functions. It's straightforward albeit a bit
boring.
instance Family ExprTermFamily where decEq Min' Min' = Just (Refl, Refl) decEq Parens' Parens' = Just (Refl, Refl) decEq Number' Number' = Just (Refl, Refl) decEq (Int' x) (Int' y) | x == y = Just (Refl, Refl) | otherwise = Nothing decEq _ _ = Nothing fields Min' (Min e t) = Just (CCons e (CCons t CNil)) fields Parens' (Parens e) = Just (CCons e CNil) fields Number' (Number i) = Just (CCons i CNil) fields (Int' _) _ = Just CNil fields _ _ = Nothing apply Min' (CCons e (CCons t CNil)) = Min e t apply Parens' (CCons e CNil) = Parens e apply Number' (CCons i CNil) = Number i apply (Int' i) CNil = i string Min' = "Min" string Parens' = "Parens" string Number' = "Number" string (Int' i) = show i
decEq :: f tx txs -> f ty tys -> Maybe (tx :=: ty, txs :=: tys)Source
Return an instance of the equality GADT (Refl
) of the type and
constructor arguments are equal, else return Nothing
.
fields :: f t ts -> t -> Maybe tsSource
When the constructor from the family matches the real
constructor,
return the arguments in a heterogeneous list, else return Nothing
.
apply :: f t ts -> ts -> tSource
When the constructor from the family and the heterogeneous list of
arguments match, apply the real
constructor. For abstract
constructors, the list of arguments should be CNil
, but you project
out the value saved with the family constructor.
string :: f t ts -> StringSource
For show
ing the constructors from the family.
class Family f => Type f t whereSource
For each type in the family, you need to create an instance of Type
, listing
all the members of the family GADT which belong to one type.
This is step 4 of the four steps needed to be able to use diff
and patch
.
Steps 1-3 are explained in the documentation for Family
.
Continuing the example from the Family
documentation, the instances for
Type
are:
instance Type ExprTermFamily Term where constructors = [Concr Number', Concr Parens'] instance Type ExprTermFamily Expr where constructors = [Concr Min'] instance Type ExprTermFamily Int where constructors = [Abstr Int']
constructors :: [Con f t]Source
List of constructors from the family GADT for one type in your family
Supporting datatypes
Equality GADT, see the documentation for Family
for an example of its use.
Con
wraps both concrete and abstract constructors to a simple type so
constructors for a single type can be put together in a list, see Type
for
more information and an example.
Use Concr
for concrete constructors (e.g., for simple abstract datatypes).
Use Abstr
for abstract constructors (e.g., for build-in types or types with many
(or infinite) constructors)