intern-0.9.1.4: Efficient hash-consing for arbitrary data types

Copyright: (c) Daan Leijen 2002, (c) Edward Kmett 2011
License: BSD-style
Maintainer: libraries@haskell.org
Stability: provisional
Portability: non-portable (TypeFamilies, MagicHash)
Safe Haskell: None
Language: Haskell98

Data.Interned.IntSet

Description

An efficient implementation of integer sets.

Since many function names (but not the type name) clash with Prelude names, this module is usually imported qualified, e.g.

 import Data.Interned.IntSet (IntSet)
 import qualified Data.Interned.IntSet as IntSet

The implementation is based on big-endian patricia trees. This data structure performs especially well on binary operations like union and intersection. However, my benchmarks show that it is also (much) faster on insertions and deletions when compared to a generic size-balanced set implementation (see Data.Set).

  • Chris Okasaki and Andy Gill, "Fast Mergeable Integer Maps", Workshop on ML, September 1998, pages 77-86, http://citeseer.ist.psu.edu/okasaki98fast.html
  • D.R. Morrison, "PATRICIA -- Practical Algorithm To Retrieve Information Coded In Alphanumeric", Journal of the ACM, 15(4), October 1968, pages 514-534.

Many operations have a worst-case complexity of O(min(n,W)). This means that the running time can grow linearly with the number of elements n, but never beyond W, the number of bits in an Int (32 or 64).

Unlike the reference implementation in Data.IntSet, Data.Interned.IntSet uses hash consing to ensure that there is only ever one copy of any given IntSet in memory. This is enabled by the normal form of the PATRICIA trie.

This can mean a drastic reduction in the memory footprint of a program in exchange for much more costly set manipulation.
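
To make the idea of hash consing concrete, here is a minimal interning sketch. It is not the representation used by the intern package: the types Node, Table and the function intern are hypothetical names chosen for illustration, and an ordered Data.Map stands in for a hash table to keep the example short. It only shows the core step, which is looking a node up in a table before allocating it, so that structurally equal nodes share one identifier.

```haskell
import qualified Data.Map.Strict as Map

-- Hypothetical toy types; not the representation used by the intern package.
type Id = Int

data Node
  = Tip Int      -- a leaf holding one element
  | Bin Id Id    -- an inner node referring to two already-interned children
  deriving (Eq, Ord, Show)

-- The intern table: the next fresh identifier and every node seen so far.
data Table = Table
  { nextId :: Id
  , known  :: Map.Map Node Id
  }

emptyTable :: Table
emptyTable = Table 0 Map.empty

-- Return the existing Id for a structurally equal node, or allocate a new one.
intern :: Node -> Table -> (Id, Table)
intern node table@(Table next seen) =
  case Map.lookup node seen of
    Just existing -> (existing, table)
    Nothing       -> (next, Table (next + 1) (Map.insert node next seen))

-- Interning the same leaf twice yields the same Id and stores it only once.
demo :: (Id, Id, Int)
demo =
  let (a, t1) = intern (Tip 7) emptyTable
      (b, t2) = intern (Tip 7) t1
  in (a, b, Map.size (known t2))
```

The extra table lookup on every node construction is one way to picture why set manipulation becomes more expensive in exchange for sharing.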

Set type

Operators

(\\) :: IntSet -> IntSet -> IntSet infixl 9

O(n+m). See difference.

Query

null :: IntSet -> Bool

O(1). Is the set empty?

size :: IntSet -> Int

O(1). Cardinality of the set.

member :: Int -> IntSet -> Bool

O(min(n,W)). Is the value a member of the set?

notMember :: Int -> IntSet -> Bool

O(min(n,W)). Is the element not in the set?

isSubsetOf :: IntSet -> IntSet -> Bool

O(n+m). Is this a subset? (s1 `isSubsetOf` s2) tells whether s1 is a subset of s2.

isProperSubsetOf :: IntSet -> IntSet -> Bool

O(n+m). Is this a proper subset? (i.e. a subset but not equal).
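
The query functions can be tried out as follows. This is only a sketch: it imports the reference Data.IntSet from containers as a stand-in, on the assumption that Data.Interned.IntSet exports the same names with the same results, as the signatures above suggest. The names small, large and queryResults are made up for the example.

```haskell
import qualified Data.IntSet as IntSet  -- assumption: stand-in for Data.Interned.IntSet

small, large :: IntSet.IntSet
small = IntSet.fromList [2, 4]
large = IntSet.fromList [1 .. 5]

queryResults :: [Bool]
queryResults =
  [ IntSet.null IntSet.empty                -- True: the empty set has no elements
  , IntSet.size large == 5                  -- True
  , IntSet.member 4 small                   -- True
  , IntSet.notMember 3 small                -- True
  , small `IntSet.isSubsetOf` large         -- True
  , large `IntSet.isProperSubsetOf` large   -- False: a set is not a proper subset of itself
  ]
```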

Construction

empty :: IntSet

O(1). The empty set.

singleton :: Int -> IntSet

O(1). A set of one element.

insert :: Int -> IntSet -> IntSet

O(min(n,W)). Add a value to the set. When the value is already an element of the set, it is replaced by the new one, i.e. insert is left-biased.

delete :: Int -> IntSet -> IntSet

O(min(n,W)). Delete a value from the set. Returns the original set when the value was not present.
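
A short sketch of the construction functions, again using Data.IntSet from containers as a stand-in under the assumption that Data.Interned.IntSet behaves the same way. The binding names are illustrative only.

```haskell
import qualified Data.IntSet as IntSet  -- assumption: stand-in for Data.Interned.IntSet

base :: IntSet.IntSet
base = IntSet.insert 3 (IntSet.insert 1 (IntSet.singleton 2))

-- Inserting an element that is already present gives an equal set.
reinserted :: IntSet.IntSet
reinserted = IntSet.insert 2 base

-- Deleting an absent element also gives an equal set.
deletedAbsent :: IntSet.IntSet
deletedAbsent = IntSet.delete 10 base

-- Deleting a present element removes exactly that element.
deletedPresent :: IntSet.IntSet
deletedPresent = IntSet.delete 2 base
```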

Combine

union :: IntSet -> IntSet -> IntSet

O(n+m). The union of two sets.

unions :: [IntSet] -> IntSet

The union of a list of sets.

difference :: IntSet -> IntSet -> IntSet

O(n+m). Difference between two sets.

intersection :: IntSet -> IntSet -> IntSet

O(n+m). The intersection of two sets.
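
The combining functions, including the (\\) operator, can be compared side by side. The sketch below assumes that Data.IntSet from containers is an acceptable stand-in for Data.Interned.IntSet; the names a, b and combined are hypothetical.

```haskell
import qualified Data.IntSet as IntSet  -- assumption: stand-in for Data.Interned.IntSet
import Data.IntSet ((\\))

a, b :: IntSet.IntSet
a = IntSet.fromList [1, 2, 3, 4]
b = IntSet.fromList [3, 4, 5]

-- Each result is converted to an ascending list so it is easy to inspect.
combined :: [[Int]]
combined = map IntSet.toList
  [ IntSet.union a b                          -- [1,2,3,4,5]
  , IntSet.intersection a b                   -- [3,4]
  , IntSet.difference a b                     -- [1,2]
  , a \\ b                                    -- [1,2], same as difference
  , IntSet.unions [a, b, IntSet.singleton 9]  -- [1,2,3,4,5,9]
  ]
```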

Filter

filter :: (Int -> Bool) -> IntSet -> IntSet

O(n). Keep only the elements that satisfy the predicate.

partition :: (Int -> Bool) -> IntSet -> (IntSet, IntSet)

O(n). Partition the set according to a predicate. The first set contains the elements that satisfy the predicate, the second those that do not.

split :: Int -> IntSet -> (IntSet, IntSet)

O(min(n,W)). The expression (split x set) is a pair (set1,set2) where set1 comprises the elements of set less than x and set2 comprises the elements of set greater than x.

split 3 (fromList [1..5]) == (fromList [1,2], fromList [4,5])

splitMember :: Int -> IntSet -> (IntSet, Bool, IntSet)

O(min(n,W)). Performs a split but also returns whether the pivot element was found in the original set.
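
The following sketch exercises filter, partition and splitMember. It uses Data.IntSet from containers as a stand-in, assuming Data.Interned.IntSet returns the same results; all binding names are illustrative.

```haskell
import qualified Data.IntSet as IntSet  -- assumption: stand-in for Data.Interned.IntSet

s :: IntSet.IntSet
s = IntSet.fromList [1 .. 6]

evens :: [Int]
evens = IntSet.toList (IntSet.filter even s)  -- [2,4,6]

-- First component: elements satisfying the predicate; second: the rest.
parts :: ([Int], [Int])
parts =
  let (yes, no) = IntSet.partition even s
  in (IntSet.toList yes, IntSet.toList no)    -- ([2,4,6],[1,3,5])

-- The pivot itself is dropped; the Bool reports whether it was present.
splitPresent, splitAbsent :: ([Int], Bool, [Int])
splitPresent =
  let (lo, found, hi) = IntSet.splitMember 3 s
  in (IntSet.toList lo, found, IntSet.toList hi)  -- ([1,2],True,[4,5,6])
splitAbsent =
  let (lo, found, hi) = IntSet.splitMember 10 s
  in (IntSet.toList lo, found, IntSet.toList hi)  -- ([1,2,3,4,5,6],False,[])
```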

Min/Max

findMin :: IntSet -> Int

O(min(n,W)). The minimal element of the set.

findMax :: IntSet -> Int

O(min(n,W)). The maximal element of the set.

deleteMin :: IntSet -> IntSet

O(min(n,W)). Delete the minimal element.

deleteMax :: IntSet -> IntSet

O(min(n,W)). Delete the maximal element.

deleteFindMin :: IntSet -> (Int, IntSet)

O(min(n,W)). Delete and find the minimal element.

deleteFindMin set = (findMin set, deleteMin set)

deleteFindMax :: IntSet -> (Int, IntSet)

O(min(n,W)). Delete and find the maximal element.

deleteFindMax set = (findMax set, deleteMax set)

maxView :: IntSet -> Maybe (Int, IntSet)

O(min(n,W)). Retrieves the maximal key of the set, and the set stripped of that element, or Nothing if passed an empty set.

minView :: IntSet -> Maybe (Int, IntSet)

O(min(n,W)). Retrieves the minimal key of the set, and the set stripped of that element, or Nothing if passed an empty set.
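
Because minView and maxView return a Maybe, they are convenient for consuming a set element by element without a separate emptiness check. The sketch below assumes Data.IntSet from containers as a stand-in for Data.Interned.IntSet; drainAscending is a hypothetical helper, not part of either module.

```haskell
import qualified Data.IntSet as IntSet  -- assumption: stand-in for Data.Interned.IntSet

sample :: IntSet.IntSet
sample = IntSet.fromList [5, 1, 9]

-- Repeatedly take the minimal element until the set is empty.
drainAscending :: IntSet.IntSet -> [Int]
drainAscending set =
  case IntSet.minView set of
    Nothing        -> []
    Just (x, rest) -> x : drainAscending rest

extremes :: (Int, Int)
extremes = (IntSet.findMin sample, IntSet.findMax sample)  -- (1,9)

-- maxView on the empty set yields Nothing instead of failing.
emptyView :: Maybe (Int, IntSet.IntSet)
emptyView = IntSet.maxView IntSet.empty
```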

Map

map :: (Int -> Int) -> IntSet -> IntSet

O(n*min(n,W)). map f s is the set obtained by applying f to each element of s.

Note that the result may be smaller than s if, for some (x,y), x /= y && f x == f y.
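
A small example of a non-injective function shrinking the set, written against Data.IntSet from containers on the assumption that Data.Interned.IntSet behaves identically. The names original and halved are illustrative.

```haskell
import qualified Data.IntSet as IntSet  -- assumption: stand-in for Data.Interned.IntSet

original :: IntSet.IntSet
original = IntSet.fromList [1 .. 6]

-- Integer division by 2 sends 2 and 3 to 1, and 4 and 5 to 2,
-- so six elements collapse into four.
halved :: IntSet.IntSet
halved = IntSet.map (`div` 2) original
```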

Fold

fold :: (Int -> b -> b) -> b -> IntSet -> b

O(n). Fold over the elements of a set in an unspecified order.

sum set   == fold (+) 0 set
elems set == fold (:) [] set
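
The two identities above can be checked with a sketch like the following. It uses foldr from Data.IntSet in containers rather than fold, on the assumption that fold behaves like a right fold over the elements in ascending order, which the identity for elems suggests. The names total and asList are hypothetical.

```haskell
import qualified Data.IntSet as IntSet  -- assumption: stand-in for Data.Interned.IntSet

-- Sum of all elements, mirroring: sum set == fold (+) 0 set
total :: IntSet.IntSet -> Int
total = IntSet.foldr (+) 0

-- Element list, mirroring: elems set == fold (:) [] set
asList :: IntSet.IntSet -> [Int]
asList = IntSet.foldr (:) []
```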

Conversion

List

elems :: IntSet -> [Int]

O(n). The elements of a set. (For sets, this is equivalent to toList)

toList :: IntSet -> [Int]

O(n). Convert the set to a list of elements.

fromList :: [Int] -> IntSet

O(n*min(n,W)). Create a set from a list of integers.

Ordered list

toAscList :: IntSet -> [Int]

O(n). Convert the set to an ascending list of elements.

fromAscList :: [Int] -> IntSet

O(n). Build a set from an ascending list of elements. The precondition (input list is ascending) is not checked.

fromDistinctAscList :: [Int] -> IntSet

O(n). Build a set from an ascending list of distinct elements. The precondition (input list is strictly ascending) is not checked.
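
The three list constructors differ only in what they assume about their input. The sketch below uses Data.IntSet from containers as a stand-in, assuming Data.Interned.IntSet gives the same results; every input satisfies the stated precondition, since violating it is unchecked and the outcome is not documented here.

```haskell
import qualified Data.IntSet as IntSet  -- assumption: stand-in for Data.Interned.IntSet

-- Arbitrary order, duplicates allowed.
fromAny :: IntSet.IntSet
fromAny = IntSet.fromList [5, 1, 2, 5, 1]

-- Ascending input, duplicates allowed.
fromAscending :: IntSet.IntSet
fromAscending = IntSet.fromAscList [1, 1, 2, 5]

-- Strictly ascending input, no duplicates.
fromDistinct :: IntSet.IntSet
fromDistinct = IntSet.fromDistinctAscList [1, 2, 5]

-- All three describe the same set, so they share one ascending element list.
ascending :: [[Int]]
ascending = map IntSet.toAscList [fromAny, fromAscending, fromDistinct]
```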

Debugging

showTree :: IntSet -> String

O(n). Show the tree that implements the set. The tree is shown in a compressed, hanging format.

showTreeWith :: Bool -> Bool -> IntSet -> String

O(n). The expression (showTreeWith hang wide set) shows the tree that implements the set. If hang is True, a hanging tree is shown; otherwise a rotated tree is shown. If wide is True, an extra wide version is shown.