compact-0.2.0.0: Non-GC'd, contiguous storage for immutable data structures

Safe Haskell: None
Language: Haskell2010

Data.Compact

The Compact type

data Compact a

A Compact contains fully evaluated, pure, immutable data.

Compact serves two purposes:

  • Data stored in a Compact has no garbage collection overhead. The garbage collector considers the whole Compact to be alive if there is a reference to any object within it.
  • A Compact can be serialized, stored, and deserialized again. The serialized data can only be deserialized by the exact binary that created it, but it can be stored indefinitely before deserialization.

Compacts are self-contained, so compacting data involves copying it; if you have data that lives in two Compacts, each will have a separate copy of the data.

The cost of compaction is similar to the cost of GC for the same data, but it is performed only once. However, because compact does not stop the world, retaining internal sharing during compaction is very costly. You can choose between compact, which does not preserve sharing, and compactWithSharing, which does.

When you have a Compact a, you can get a pointer to the actual object in the region using getCompact. The Compact type serves as a handle on the region itself; you can use this handle to add data to a specific Compact with compactAdd or compactAddWithSharing (giving you a new handle which corresponds to the same compact region, but points to the newly added object in the region). At the moment, for technical reasons, it is not possible to get the Compact a if you only have an a, so hold on to the handle for as long as you need it.
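As an illustration of the handle-and-pointer relationship described above, here is a minimal sketch. It assumes the compact package is installed and that Data.Compact exports compact and getCompact with the signatures listed on this page; the list being compacted is an arbitrary example value.

```haskell
import Data.Compact (compact, getCompact)

main :: IO ()
main = do
  -- Copy the list, fully evaluated, into a fresh compact region.
  region <- compact [1 .. 1000 :: Int]
  -- getCompact is pure: it returns the copy that lives inside the region.
  let xs = getCompact region
  -- The copy behaves like any other immutable value.
  print (sum xs) -- prints 500500
```

Keeping region in scope is what lets you add more data to the same region later.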

Data in a Compact never moves, so compacting data is also a way to pin arbitrary data structures in memory.

There are some limitations on what can be compacted:

  • Functions cannot be compacted. Compaction only applies to data.
  • Pinned ByteArray# objects cannot be compacted. This is for a good reason: the memory is pinned so that it can be referenced by address (the address might be stored in a C data structure, for example), so we can't make a copy of it to store in the Compact.
  • Objects with mutable pointer fields (e.g. IORef, MutableArray) also cannot be compacted, because subsequent mutation would destroy the property that a compact is self-contained.

If compaction encounters any of the above, a CompactionFailed exception will be thrown by the compaction operation.
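The failure mode can be observed with a sketch like the following. It assumes that CompactionFailed is the exception type exported by Control.Exception in base, and it uses an IORef as the example of an object with a mutable pointer field.

```haskell
import Control.Exception (CompactionFailed, try)
import Data.Compact (Compact, compact)
import Data.IORef (IORef, newIORef)

main :: IO ()
main = do
  ref <- newIORef (0 :: Int)
  -- An IORef holds a mutable pointer, so compaction is expected to fail.
  result <- try (compact ref) :: IO (Either CompactionFailed (Compact (IORef Int)))
  case result of
    Left err -> putStrLn ("caught: " ++ show err)
    Right _  -> putStrLn "compacted (not expected for a mutable reference)"
```

Because the check happens at run time, wrapping compaction of untrusted or loosely typed data in try is one way to fail gracefully.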

Compacting data

compact :: a -> IO (Compact a)

Compact a value. O(size of unshared data)

If the structure contains any internal sharing, the shared data will be duplicated during the compaction process. This will not terminate if the structure contains cycles (use compactWithSharing instead).

The object in question must not contain any functions or data with mutable pointers; if it does, compact will raise an exception. In the future, we may add a type class which will help statically check if this is the case or not.
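To see the effect of duplicated sharing, the sketch below compacts a pair whose two components are the same list and compares the resulting region sizes. The value and names are illustrative only. Region sizes grow in blocks, so no exact figures are claimed here, but one would expect the region produced by compact to be roughly twice as large as the one produced by compactWithSharing.

```haskell
import Data.Compact (compact, compactSize, compactWithSharing)

main :: IO ()
main = do
  let xs   = [1 .. 100000 :: Int]
      pair = (xs, xs) -- both components point at the same list
  plain  <- compact pair            -- copies xs once per reference
  shared <- compactWithSharing pair -- copies xs once in total
  sizePlain  <- compactSize plain
  sizeShared <- compactSize shared
  putStrLn ("compact:            " ++ show sizePlain ++ " bytes")
  putStrLn ("compactWithSharing: " ++ show sizeShared ++ " bytes")
```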

compactWithSharing :: a -> IO (Compact a)

Compact a value, retaining any internal sharing and cycles. O(size of data)

This is typically about 10x slower than compact, because it works by maintaining a hash table mapping uncompacted objects to compacted objects.

The object in question must not contain any functions or data with mutable pointers; if it does, compactWithSharing will raise an exception. In the future, we may add a type class which will help statically check if this is the case or not.
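A small sketch of compacting a cyclic structure follows. The list is tied into a knot by hand so that its last cell points back to its first; passing it to compact instead would not be expected to terminate, as noted in the description of compact.

```haskell
import Data.Compact (compactWithSharing, getCompact)

main :: IO ()
main = do
  -- A three-element list whose tail loops back to its own head.
  let loop = 1 : 2 : 3 : loop :: [Int]
  -- compactWithSharing recognises cells it has already visited, so it terminates.
  region <- compactWithSharing loop
  print (take 7 (getCompact region)) -- prints [1,2,3,1,2,3,1]
```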

compactAdd :: Compact b -> a -> IO (Compact a)

Add a value to an existing Compact. This will help you avoid copying when the value contains pointers into the compact region, but remember that after compaction this value will only be deallocated with the entire compact region.

Behaves exactly like compact with respect to sharing and what data it accepts.
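The following sketch shows the case compactAdd is designed for: a new value whose tail already lives in the region. The names base, extended and added are illustrative. Only the new cons cell should need copying, and the membership check is expected to report that the added value is in the original region.

```haskell
import Data.Compact (compact, compactAdd, getCompact, inCompact)

main :: IO ()
main = do
  base <- compact [2 .. 10 :: Int]
  -- Build a new list whose tail is the list already stored in the region.
  let extended = 1 : getCompact base
  added <- compactAdd base extended
  -- Both handles refer to the same region, so either can be used for the check.
  inRegion <- inCompact base (getCompact added)
  putStrLn ("added value is in the region: " ++ show inRegion)
  print (sum (getCompact added)) -- prints 55
```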

compactAddWithSharing :: Compact b -> a -> IO (Compact a)

Add a value to an existing Compact, like compactAdd, but behaving exactly like compactWithSharing with respect to sharing and what data it accepts.

compactSized :: Int -> Bool -> a -> IO (Compact a)

Transfer a into a new compact region with a preallocated size, optionally preserving sharing. The Int argument gives the preallocated size, and the Bool argument selects whether sharing is preserved. If you know how big the data structure in question is, you can save time by picking an appropriate block size for the compact region.
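A hedged usage sketch: it assumes the Int argument is a size in bytes (the text above does not state the unit) and that the 1 MiB figure is a reasonable guess for this particular example value. Neither is a recommendation.

```haskell
import Data.Compact (compactSize, compactSized, getCompact)

main :: IO ()
main = do
  let table = zip [1 .. 10000 :: Int] (cycle "abc")
  -- Reserve 1 MiB up front (assumed unit: bytes); False means sharing is not preserved.
  region <- compactSized (1024 * 1024) False table
  size <- compactSize region
  putStrLn ("region size: " ++ show size ++ " bytes")
  print (length (getCompact region)) -- prints 10000
```

If the estimate is too small, the data is still compacted; the preallocated size is only a hint that saves time when it is accurate.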

Inspecting a Compact

getCompact :: Compact a -> a

Retrieve a direct pointer to the value pointed at by a Compact reference. If you have used compactAdd, there may be multiple Compact references into the same compact region. Upholds the property:

inCompact c (getCompact c) == return True

inCompact :: Compact b -> a -> IO Bool

Check if the second argument is inside the passed Compact.

isCompact :: a -> IO Bool

Check if the argument is in any Compact. If true, the value in question is also fully evaluated, since any value in a compact region must be fully evaluated.

compactSize :: Compact a -> IO Word

Returns the size in bytes of the compact region.
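The inspection functions in this section can be exercised together, as in the sketch below. The example value is arbitrary and no particular output is claimed. One would expect the copy returned by getCompact to be reported as inside the region, and the original, uncompacted list not to be in any Compact.

```haskell
import Data.Compact (compact, compactSize, getCompact, inCompact, isCompact)

main :: IO ()
main = do
  let original = [1 .. 100 :: Int]
  region <- compact original
  let copy = getCompact region
  copyInRegion   <- inCompact region copy -- the property stated for getCompact
  copyCompacted  <- isCompact copy        -- is the copy in any Compact?
  origCompacted  <- isCompact original    -- the original was never moved
  size           <- compactSize region
  putStrLn ("copy in region:     " ++ show copyInRegion)
  putStrLn ("copy is compact:    " ++ show copyCompacted)
  putStrLn ("original is compact: " ++ show origCompacted)
  putStrLn ("region size:        " ++ show size ++ " bytes")
```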