safecopy-0.9.2: Binary serialization with version control.

CopyrightPublicDomain
Maintainerlemmih@gmail.com
Portabilitynon-portable (uses GHC extensions)
Safe HaskellTrustworthy
LanguageHaskell98

Data.SafeCopy

Contents

Description

SafeCopy extends the parsing and serialization capabilities of Data.Serialize to include nested version control. Nested version control means that you can change the definition and binary format of a type nested deep within other types without problems.

Consider this scenario. You want to store your contact list on disk and so write the following code:

type Name     = String
type Address  = String
data Contacts = Contacts [(Name, Address)]
instance SafeCopy Contacts where
     putCopy (Contacts list) = contain $ safePut list
     getCopy = contain $ Contacts <$> safeGet
 

At this point, everything is fine. You get the awesome speed of Data.Serialize together with Haskell's ease of use. However, things quickly take a U-turn for the worse when you realize that you want to keep phone numbers as well as names and addresses. Being the experienced coder that you are, you see that using a 3-tuple isn't very pretty and you'd rather use a record. At first you fear that this change in structure will invalidate all your old data. Those fears are quickly quelled, though, when you remember how nifty SafeCopy is. With renewed enthusiasm, you set out and write the following code:

type Name = String
type Address = String
type Phone = String

{- We rename our old Contacts structure -}
data Contacts_v0 = Contacts_v0 [(Name, Address)]
instance SafeCopy Contacts_v0 where
     putCopy (Contacts_v0 list) = contain $ safePut list
     getCopy = contain $ Contacts_v0 <$> safeGet

data Contact = Contact { name    :: Name
                        , address :: Address
                        , phone   :: Phone }
instance SafeCopy Contact where
    putCopy Contact{..} = contain $ do safePut name; safePut address; safePut phone
    getCopy = contain $ Contact <$> safeGet <*> safeGet <*> safeGet

data Contacts = Contacts [Contact]
instance SafeCopy Contacts where
     version = 2
     kind = extension
     putCopy (Contacts contacts) = contain $ safePut contacts
     getCopy = contain $ Contacts <$> safeGet

{- Here the magic happens: -}
instance Migrate Contacts where
     type MigrateFrom Contacts = Contacts_v0
     migrate (Contacts_v0 contacts) = Contacts [ Contact{ name    = name
                                                        , address = address
                                                        , phone   = "" }
                                               | (name, address) <- contacts ]
 

With this, you reflect on your code and you are happy. You feel confident in the safety of your data and you know you can remove Contacts_v0 once you no longer wish to support that legacy format.

Synopsis

Documentation

safeGet :: SafeCopy a => Get a Source #

Parse a version tagged data type and then migrate it to the desired type. Any serialized value has been extended by the return type can be parsed.

safePut :: SafeCopy a => a -> Put Source #

Serialize a data type by first writing out its version tag. This is much simpler than the corresponding safeGet since previous versions don't come into play.

class SafeCopy a where Source #

The centerpiece of this library. Defines a version for a data type together with how it should be serialized/parsed.

Users should define instances of SafeCopy for their types even though getCopy and putCopy can't be used directly. To serialize/parse a data type using SafeCopy, see safeGet and safePut.

Methods

version :: Version a Source #

The version of the type.

Only used as a key so it must be unique (this is checked at run-time) but doesn't have to be sequential or continuous.

The default version is '0'.

kind :: Kind a Source #

The kind specifies how versions are dealt with. By default, values are tagged with their version id and don't have any previous versions. See extension and the much less used primitive.

getCopy :: Contained (Get a) Source #

This method defines how a value should be parsed without also worrying about writing out the version tag. This function cannot be used directly. One should use safeGet, instead.

putCopy :: a -> Contained Put Source #

This method defines how a value should be parsed without worrying about previous versions or migrations. This function cannot be used directly. One should use safeGet, instead.

objectProfile :: Profile a Source #

Version profile.

errorTypeName :: Proxy a -> String Source #

The name of the type. This is only used in error message strings. Feel free to leave undefined in your instances.

getCopy :: Serialize a => Contained (Get a) Source #

This method defines how a value should be parsed without also worrying about writing out the version tag. This function cannot be used directly. One should use safeGet, instead.

putCopy :: Serialize a => a -> Contained Put Source #

This method defines how a value should be parsed without worrying about previous versions or migrations. This function cannot be used directly. One should use safeGet, instead.

newtype Prim a Source #

Wrapper for data that was saved without a version tag.

Constructors

Prim 

Fields

class SafeCopy (MigrateFrom a) => Migrate a where Source #

The central mechanism for dealing with version control.

This type class specifies what data migrations can happen and how they happen.

Minimal complete definition

migrate

Associated Types

type MigrateFrom a Source #

This is the type we're extending. Each type capable of migration can only extend one other type.

Methods

migrate :: MigrateFrom a -> a Source #

This method specifies how to migrate from the older type to the newer one. It will never be necessary to use this function manually as it all taken care of internally in the library.

newtype Reverse a Source #

This is a wrapper type used migrating backwards in the chain of compatible types.

Constructors

Reverse 

Fields

data Kind a Source #

The kind of a data type determines how it is tagged (if at all).

Primitives kinds (see primitive) are not tagged with a version id and hence cannot be extended later.

Extensions (see extension) tells the system that there exists a previous version of the data type which should be migrated if needed.

There is also a default kind which is neither primitive nor is an extension of a previous type.

extension :: (SafeCopy a, Migrate a) => Kind a Source #

The extension kind lets the system know that there is at least one previous version of this type. A given data type can only extend a single other data type. However, it is perfectly fine to build chains of extensions. The migrations between each step is handled automatically.

extended_extension :: (SafeCopy a, Migrate a, Migrate (Reverse a)) => Kind a Source #

The extended_base kind lets the system know that there is at least one future version of this type.

extended_base :: Migrate (Reverse a) => Kind a Source #

The extended_base kind lets the system know that there is at least one future version of this type.

base :: Kind a Source #

The default kind. Does not extend any type.

data Contained a Source #

To ensure that no-one reads or writes values without handling versions correct, it is necessary to restrict access to getCopy and putCopy. This is where Contained enters the picture. It allows you to put values in to a container but not to take them out again.

contain :: a -> Contained a Source #

Place a value in an unbreakable container.

data Version a Source #

A simple numeric version id.

Instances

Eq (Version a) Source # 

Methods

(==) :: Version a -> Version a -> Bool #

(/=) :: Version a -> Version a -> Bool #

Num (Version a) Source # 

Methods

(+) :: Version a -> Version a -> Version a #

(-) :: Version a -> Version a -> Version a #

(*) :: Version a -> Version a -> Version a #

negate :: Version a -> Version a #

abs :: Version a -> Version a #

signum :: Version a -> Version a #

fromInteger :: Integer -> Version a #

Read (Version a) Source # 
Show (Version a) Source # 

Methods

showsPrec :: Int -> Version a -> ShowS #

show :: Version a -> String #

showList :: [Version a] -> ShowS #

Serialize (Version a) Source # 

Methods

put :: Putter (Version a) #

get :: Get (Version a) #

Template haskell functions

deriveSafeCopy :: Version a -> Name -> Name -> Q [Dec] Source #

Derive an instance of SafeCopy.

When serializing, we put a Word8 describing the constructor (if the data type has more than one constructor). For each type used in the constructor, we call getSafePut (which immediately serializes the version of the type). Then, for each field in the constructor, we use one of the put functions obtained in the last step.

For example, given the data type and the declaration below

data T0 b = T0 b Int
deriveSafeCopy 1 'base ''T0
   

we generate

instance (SafeCopy a, SafeCopy b) =>
         SafeCopy (T0 b) where
    putCopy (T0 arg1 arg2) = contain $ do put_b   <- getSafePut
                                          put_Int <- getSafePut
                                          put_b   arg1
                                          put_Int arg2
                                          return ()
    getCopy = contain $ do get_b   <- getSafeGet
                           get_Int <- getSafeGet
                           return T0 <*> get_b <*> get_Int
    version = 1
    kind = base
   

And, should we create another data type as a newer version of T0, such as

data T a b = C a a | D b Int
deriveSafeCopy 2 'extension ''T

instance SafeCopy b => Migrate (T a b) where
  type MigrateFrom (T a b) = T0 b
  migrate (T0 b i) = D b i
   

we generate

instance (SafeCopy a, SafeCopy b) =>
         SafeCopy (T a b) where
    putCopy (C arg1 arg2) = contain $ do putWord8 0
                                         put_a <- getSafePut
                                         put_a arg1
                                         put_a arg2
                                         return ()
    putCopy (D arg1 arg2) = contain $ do putWord8 1
                                         put_b   <- getSafePut
                                         put_Int <- getSafePut
                                         put_b   arg1
                                         put_Int arg2
                                         return ()
    getCopy = contain $ do tag <- getWord8
                           case tag of
                             0 -> do get_a <- getSafeGet
                                     return C <*> get_a <*> get_a
                             1 -> do get_b   <- getSafeGet
                                     get_Int <- getSafeGet
                                     return D <*> get_b <*> get_Int
                             _ -> fail $ "Could not identify tag \"" ++
                                         show tag ++ "\" for type Main.T " ++
                                         "that has only 2 constructors.  " ++
                                         "Maybe your data is corrupted?"
    version = 2
    kind = extension
   

Note that by using getSafePut, we saved 4 bytes in the case of the C constructor. For D and T0, we didn't save anything. The instance derived by this function always use at most the same space as those generated by deriveSafeCopySimple, but never more (as we don't call 'getSafePut'/'getSafeGet' for types that aren't needed).

Note that you may use deriveSafeCopySimple with one version of your data type and deriveSafeCopy in another version without any problems.

deriveSafeCopySimple :: Version a -> Name -> Name -> Q [Dec] Source #

Derive an instance of SafeCopy. The instance derived by this function is simpler than the one derived by deriveSafeCopy in that we always use safePut and safeGet (instead of getSafePut and getSafeGet).

When serializing, we put a Word8 describing the constructor (if the data type has more than one constructor) and, for each field of the constructor, we use safePut.

For example, given the data type and the declaration below

data T a b = C a a | D b Int
deriveSafeCopySimple 1 'base ''T
   

we generate

instance (SafeCopy a, SafeCopy b) =>
         SafeCopy (T a b) where
    putCopy (C arg1 arg2) = contain $ do putWord8 0
                                         safePut arg1
                                         safePut arg2
                                         return ()
    putCopy (D arg1 arg2) = contain $ do putWord8 1
                                         safePut arg1
                                         safePut arg2
                                         return ()
    getCopy = contain $ do tag <- getWord8
                           case tag of
                             0 -> do return C <*> safeGet <*> safeGet
                             1 -> do return D <*> safeGet <*> safeGet
                             _ -> fail $ "Could not identify tag \"" ++
                                         show tag ++ "\" for type Main.T " ++
                                         "that has only 2 constructors.  " ++
                                         "Maybe your data is corrupted?"
    version = 1
    kind = base
   

Using this simpler instance means that you may spend more bytes when serializing data. On the other hand, it is more straightforward and may match any other format you used in the past.

Note that you may use deriveSafeCopy with one version of your data type and deriveSafeCopySimple in another version without any problems.

deriveSafeCopyHappstackData :: Version a -> Name -> Name -> Q [Dec] Source #

Derive an instance of SafeCopy. The instance derived by this function should be compatible with the instance derived by the module Happstack.Data.SerializeTH of the happstack-data package. The instances use only safePut and safeGet (as do the instances created by deriveSafeCopySimple), but we also always write a Word8 tag, even if the data type isn't a sum type.

For example, given the data type and the declaration below

data T0 b = T0 b Int
deriveSafeCopy 1 'base ''T0
   

we generate

instance (SafeCopy a, SafeCopy b) =>
         SafeCopy (T0 b) where
    putCopy (T0 arg1 arg2) = contain $ do putWord8 0
                                          safePut arg1
                                          safePut arg2
                                          return ()
    getCopy = contain $ do tag <- getWord8
                           case tag of
                             0 -> do return T0 <*> safeGet <*> safeGet
                             _ -> fail $ "Could not identify tag \"" ++
                                         show tag ++ "\" for type Main.T0 " ++
                                         "that has only 1 constructors.  " ++
                                         "Maybe your data is corrupted?"
    version = 1
    kind = base
   

This instance always consumes at least the same space as deriveSafeCopy or deriveSafeCopySimple, but may use more because of the useless tag. So we recomend using it only if you really need to read a previous version in this format, and not for newer versions.

Note that you may use deriveSafeCopy with one version of your data type and deriveSafeCopyHappstackData in another version without any problems.

Rarely used functions

getSafeGet :: forall a. SafeCopy a => Get (Get a) Source #

Parse a version tag and return the corresponding migrated parser. This is useful when you can prove that multiple values have the same version. See getSafePut.

getSafePut :: forall a. SafeCopy a => PutM (a -> Put) Source #

Serialize the version tag and return the associated putter. This is useful when serializing multiple values with the same version. See getSafeGet.

primitive :: Kind a Source #

Primitive kinds aren't version tagged. This kind is used for small or built-in types that won't change such as Int or Bool.