Safe Haskell | Safe |
---|---|
Language | Haskell2010 |
Unweighted reservoir algorithm suitable for sampling from data streams of large or unknown size.
Documentation
Wrapper for the state kept by Algorithm R.
Keeps the sample as an IntMap, the number of elements seen, and the random seed.
emptyRes :: RandomGen g => g -> Res g a
Creates a Res with nothing in the sample and with the counter at zero.
:: RandomGen g | |
=> Int | The maximum number of elements to keep in the sample. |
-> a | The next element to be considered. |
-> Res g a | The current wrapped sample. |
-> Res g a |
Jeffrey Vitter's "Algorithm R". Given the elements in the existing sample and a new element,
every element has an equal probability of being selected.
Intended to be partially applied to the sample size and used with fold-like functions to sample
large streams of data.
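The following is a minimal, self-contained sketch of how Algorithm R can be folded over a stream. It is not this module's implementation: `Sketch`, `emptySketch`, `step`, `nextBelow`, and `demo` are hypothetical names, and the tiny linear congruential generator stands in for a real `RandomGen` instance only so that the example needs nothing beyond `base` and `containers`. Its randomness is not suitable for real use.

```haskell
import qualified Data.IntMap.Strict as IntMap
import Data.List (foldl')

-- Hypothetical stand-in for Res: the sample, the number of elements
-- seen so far, and the generator state.
data Sketch a = Sketch
  { sample :: IntMap.IntMap a
  , seen   :: !Int
  , seed   :: !Int
  } deriving Show

-- Hypothetical toy generator (assumption, not part of the library):
-- returns a value in [0, bound) together with the next state.
-- The caller must pass bound > 0.
nextBelow :: Int -> Int -> (Int, Int)
nextBelow bound s =
  let s' = (s * 1103515245 + 12345) `mod` 2147483648
  in ((s' `div` 65536) `mod` bound, s')

-- An empty sample with the counter at zero.
emptySketch :: Int -> Sketch a
emptySketch = Sketch IntMap.empty 0

-- One step of Algorithm R for a sample of at most k elements.
-- The first k elements fill slots 0 .. k-1. After that, the n-th
-- element (counting from 1) replaces a uniformly chosen slot with
-- probability k/n, and is discarded otherwise.
step :: Int -> a -> Sketch a -> Sketch a
step k x (Sketch m n s)
  | n < k     = Sketch (IntMap.insert n x m) (n + 1) s
  | otherwise =
      let (j, s') = nextBelow (n + 1) s
          m'      = if j < k then IntMap.insert j x m else m
      in Sketch m' (n + 1) s'

-- Sample 5 elements from a stream of 1000, partially applying the
-- sample size and folding over the input.
demo :: Sketch Int
demo = foldl' (flip (step 5)) (emptySketch 42) [1 .. 1000]
```

Because `step k` has the shape `a -> b -> b`, it can be passed directly to `foldr`-style folds. A left fold such as `foldl'` needs `flip`, as in `demo`. Evaluating `IntMap.elems (sample demo)` in GHCi shows the five retained elements.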