|
Data.BloomFilter.Easy | Portability | portable | Stability | unstable | Maintainer | Bryan O'Sullivan <bos@serpentine.com> |
|
|
|
|
|
Description |
An easy-to-use Bloom filter interface.
|
|
Synopsis |
|
|
|
|
Easy creation and querying
|
|
|
An immutable Bloom filter, suitable for querying from pure code.
| Instances | |
|
|
|
:: Hashable a | | => Double | desired false positive rate (0 < e < 1)
| -> [a] | values to populate with
| -> Bloom a | | Create a Bloom filter with the given false positive rate and
members. The hash functions used are computed by the cheapHashes
function from the Data.BloomFilter.Hash module.
|
|
|
|
Query an immutable Bloom filter for membership. If the value is
present, return True. If the value is not present, there is
still some possibility that True will be returned.
|
|
|
Query an immutable Bloom filter for non-membership. If the value
is present, return False. If the value is not present, there
is still some possibility that False will be returned.
|
|
|
Return the size of an immutable Bloom filter, in bits.
|
|
Example: a spell checker
|
|
This example reads a dictionary file containing one word per line,
constructs a Bloom filter with a 1% false positive rate, and
spellchecks its standard input. Like the Unix spell command, it
prints each word that it does not recognize.
import Data.BloomFilter.Easy (easyList, elemB)
main = do
filt <- (easyList 0.01 . words) `fmap` readFile "usrsharedictwords"
let check word | elemB word filt = ""
| otherwise = word ++ "\n"
interact (concat . map check . lines)
|
|
Useful defaults for creation
|
|
|
:: Int | expected maximum capacity
| -> Double | desired false positive rate (0 < e < 1)
| -> (Int, Int) | | Suggest a good combination of filter size and number of hash
functions for a Bloom filter, based on its expected maximum
capacity and a desired false positive rate.
The false positive rate is the rate at which queries against the
filter should return True when an element is not actually
present. It should be a fraction between 0 and 1, so a 1% false
positive rate is represented by 0.01.
|
|
|
Produced by Haddock version 2.4.2 |