judy-0.4.1: Fast, scalable, mutable dynamic arrays, maps and hashes

Safe HaskellNone
LanguageHaskell2010

Data.Judy

Contents

Description

Very fast, mutable associative data types based on Judy arrays.

A good imperative, mutable replacement for IntMap.

Judy arrays are both speed- and memory-efficient, with no tuning or configuration required, across a wide range of index set types (sequential, periodic, clustered, random). Judy's speed and memory usage are typically better than other data storage models such as skiplists, linked lists, binary, ternary, b-trees, or even hashing, and improves with very large data sets.

The memory used by a Judy array is nearly proportional to the population (number of elements).

For further references to the implementation, see:

Examples:

Generate 1 million random integers. Report the largest one we see.

import System.Random.Mersenne
import qualified Data.Judy as J
import Control.Monad

main = do
   g  <- getStdGen
   rs <- randoms g
   j  <- J.new :: IO (J.JudyL Int)
   forM_ (take 1000000 rs) $ \n ->
       J.insert n 1 j
   v  <- J.findMax j
   case v of
        Nothing    -> print "Done."
        Just (k,_) -> print k

Compile it:

$ ghc -O2 --make Test.hs

Running it:

$ time ./Test
18446712059962695226
./Test  0.65s user 0.03s system 99% cpu 0.680 total

Notes:

  • By default this library is threadsafe.
  • Multiple Haskell threads may operate on the arrays simultaneously. You can compile without locks if you know you're running in a single threaded fashion with: cabal install -funsafe

Sun Sep 27 17:12:24 PDT 2009: The library has only lightly been tested.

Synopsis

Basic types

data JudyL a Source #

A JudyL array is a mutable, finite map from Word to Word values. It is threadsafe by default.

A value is addressed by a key. The array may be sparse, and the key may be any word-sized value. There are no duplicate keys.

Values may be any instance of the JE class.

Instances

Show (JudyL a) Source # 

Methods

showsPrec :: Int -> JudyL a -> ShowS #

show :: JudyL a -> String #

showList :: [JudyL a] -> ShowS #

type Key = Word Source #

The type of keys in the JudyL arrays. A word-sized type (64 or 32 bits)

Construction

new :: JE a => IO (JudyL a) Source #

Allocate a new empty JudyL array.

A finalizer is associated with the JudyL array, that will cause the garbage collector to free it automatically once the last reference has been dropped on the Haskell side.

Note: The Haskell GC will track references to the foreign resource, but the foreign resource won't exert any heap pressure on the GC, meaning that finalizers will be run much later than you expect. An explicit performGC can help with this.

Note: that if you store pointers in the Judy array we have no way of deallocating those -- you'll need to track those yourself (e.g. via StableName or ForeignPtr)

Queries

null :: JudyL a -> IO Bool Source #

O(1), null. Is the map empty?

size :: JudyL a -> IO Int Source #

O(1), size. The number of elements in the map.

member :: Key -> JudyL a -> IO Bool Source #

Is the key a member of the map?

lookup :: JE a => Key -> JudyL a -> IO (Maybe a) Source #

Lookup a value associated with a key in the JudyL array. Return Nothing if no value is found.

Insertion and removal

insert :: JE a => Key -> a -> JudyL a -> IO () Source #

Insert a key and value pair into the JudyL array. Any existing key will be overwritten.

insertWith :: JE a => (a -> a -> a) -> Key -> a -> JudyL a -> IO () Source #

Insert with a function, combining new value and old value.

  • If the key does not exist in the map, the value will be inserted.
  • If the key does exist, the combining function will be applied: f new old

delete :: Key -> JudyL a -> IO () Source #

Delete the Index/Value pair from the JudyL array.

adjust :: JE a => (a -> a) -> Key -> JudyL a -> IO () Source #

Update a value at a specific key with the result of the provided function. When the key is not a member of the map, no change is made.

Min/Max

findMin :: JE a => JudyL a -> IO (Maybe (Key, a)) Source #

findMin. Find the minimal key, and its associated value, in the map. Nothing if the map is empty.

findMax :: JE a => JudyL a -> IO (Maybe (Key, a)) Source #

findMax. Find the maximal key, and its associated value, in the map. Nothing if the map is empty.

Conversion

keys :: JudyImmutable a -> IO [Key] Source #

Return all keys of the map in ascending order. It is important that this not be interleaved with updates, so we take a JudyImmutable, which can only be obtained with freeze or unsafeFreeze (if you are sure you know what you are doing).

elems :: JE a => JudyImmutable a -> IO [a] Source #

Return all elems of the map in ascending order.

toList :: JE a => JudyImmutable a -> IO [(Key, a)] Source #

Return keys and values of the map in ascending order.

Freezing

freeze :: (Show a, JE a) => JudyL a -> IO (JudyImmutable a) Source #

Makes a copy of a Judy array and packs it into an immutable wrapper.

unsafeFreeze :: JE a => JudyL a -> IO (JudyImmutable a) Source #

Unsafely accesses a judy array. If you never try to update it again, this may be safe, and save some memory. Caveat emptor.

Judy-storable types

class JE a where Source #

Class of things that can be stored in the JudyL array. You need to be able to convert the structure to a Word value, or a word-sized pointer.

Note: that it is possible to convert any Haskell value into a JE-type, via a StablePtr. This allocates an entry in the runtime's stable pointer table, giving you a pointer that may be passed to C, and that when dereferenced in Haskell will yield the original Haskell value. See the source for an example of this with strict bytestrings.

Minimal complete definition

toWord, fromWord

Methods

toWord :: a -> IO Word Source #

Convert the Haskell value to a word-sized type that may be stored in a JudyL

fromWord :: Word -> IO a Source #

Reconstruct the Haskell value from the word-sized type.