bloomfilter-2.0.1.2: Pure and impure Bloom Filter implementations.
CopyrightBryan O'Sullivan
LicenseBSD3
MaintainerBryan O'Sullivan <bos@serpentine.com>
Stabilityunstable
Portabilityportable
Safe HaskellSafe-Inferred
LanguageHaskell2010

Data.BloomFilter.Easy

Description

An easy-to-use Bloom filter interface.

Synopsis

Easy creation and querying

data Bloom a Source #

An immutable Bloom filter, suitable for querying from pure code.

Instances

Instances details
Show (Bloom a) Source # 
Instance details

Defined in Data.BloomFilter

Methods

showsPrec :: Int -> Bloom a -> ShowS #

show :: Bloom a -> String #

showList :: [Bloom a] -> ShowS #

NFData (Bloom a) Source # 
Instance details

Defined in Data.BloomFilter

Methods

rnf :: Bloom a -> () #

easyList Source #

Arguments

:: Hashable a 
=> Double

desired false positive rate (0 < e < 1)

-> [a]

values to populate with

-> Bloom a 

Create a Bloom filter with the given false positive rate and members. The hash functions used are computed by the cheapHashes function from the Hash module.

elem :: a -> Bloom a -> Bool Source #

Query an immutable Bloom filter for membership. If the value is present, return True. If the value is not present, there is still some possibility that True will be returned.

notElem :: a -> Bloom a -> Bool Source #

Query an immutable Bloom filter for non-membership. If the value is present, return False. If the value is not present, there is still some possibility that False will be returned.

length :: Bloom a -> Int Source #

Return the size of an immutable Bloom filter, in bits.

Example: a spell checker

This example reads a dictionary file containing one word per line, constructs a Bloom filter with a 1% false positive rate, and spellchecks its standard input. Like the Unix spell command, it prints each word that it does not recognize.

import Data.BloomFilter.Easy (easyList, elemB)

main = do
  filt <- (easyList 0.01 . words) `fmap` readFile "usrsharedictwords"
  let check word | elemB word filt = ""
                 | otherwise         = word ++ "\n"
  interact (concat . map check . lines)
 

Useful defaults for creation

safeSuggestSizing Source #

Arguments

:: Int

expected maximum capacity

-> Double

desired false positive rate (0 < e < 1)

-> Either String (Int, Int) 

Suggest a good combination of filter size and number of hash functions for a Bloom filter, based on its expected maximum capacity and a desired false positive rate.

The false positive rate is the rate at which queries against the filter should return True when an element is not actually present. It should be a fraction between 0 and 1, so a 1% false positive rate is represented by 0.01.

suggestSizing Source #

Arguments

:: Int

expected maximum capacity

-> Double

desired false positive rate (0 < e < 1)

-> (Int, Int) 

Behaves as safeSuggestSizing, but calls error if given invalid or out-of-range inputs.