Safe Haskell | Safe |
---|---|
Language | Haskell2010 |
Unweighted reservoir algorithm suitable for sampling from data streams of large or unknown size.
Documentation
Wrapper for the state kept by Algorithm R.
Keeps the sample as an IntMap, the number of elements seen, and the random seed.
emptyRes :: RandomGen g => g -> Res g a
Creates a Res with nothing in the sample and with the counter at zero.
:: RandomGen g | |
=> Int | The maximum number of elements to be in the sample. |
-> a | The next element to be considered. |
-> Res g a | The current wrapped sample. |
-> Res g a |
Jeffrey Vitter's "Algorithm R" for randomly choosing a fixed-size sample from a stream of possibly very large or unknown length.
At any point in the sampling, all streamed elements have equal probability of appearing in the sample.
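The equal-probability property can be checked by a short induction. This is a sketch for the standard formulation of Algorithm R; the symbols are introduced here for illustration and do not appear in the module: k is the sample size, n ≥ k the number of elements seen so far, and S_n the sample after n elements. Assume every element seen so far is in S_n with probability k/n (true at n = k, where all k elements are kept). Element n+1 enters with probability k/(n+1) and, when it does, evicts one of the k slots chosen uniformly, so for any earlier element x_i:

$$
\Pr[x_i \in S_{n+1}]
  = \Pr[x_i \in S_n]\left(1 - \frac{k}{n+1}\cdot\frac{1}{k}\right)
  = \frac{k}{n}\cdot\frac{n}{n+1}
  = \frac{k}{n+1}, \qquad i \le n .
$$

The new element is also present with probability k/(n+1), so the property holds after every step.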
Intended to be partially applied to the sample size and used with fold-like functions.
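The following is a minimal, self-contained sketch of how such a step function can be written and folded over a list. It is not this module's implementation: `Sketch`, `emptySketch`, `step`, `sampleList`, `Lcg`, and `nextBelow` are hypothetical names chosen for illustration, and a tiny linear congruential generator stands in for a `RandomGen` instance so that the example needs only GHC's bundled libraries.

```haskell
import Data.Bits (shiftR)
import Data.IntMap.Strict (IntMap)
import qualified Data.IntMap.Strict as IntMap
import Data.List (foldl')
import Data.Word (Word64)

-- Hypothetical stand-in for a RandomGen instance: a 64-bit linear
-- congruential generator. Good enough for a demo, not for real use.
newtype Lcg = Lcg Word64

-- Draw an Int in [0, n) for n > 0 and return the advanced generator.
-- The high bits are used because they are the better-mixed bits of an
-- LCG; the modulo step leaves a slight bias.
nextBelow :: Int -> Lcg -> (Int, Lcg)
nextBelow n (Lcg s) =
  let s' = s * 6364136223846793005 + 1442695040888963407
      r  = fromIntegral (s' `shiftR` 33) `mod` n
  in (r, Lcg s')

-- Hypothetical mirror of the wrapped state described above: the sample
-- keyed by slot index, the number of elements seen, and the generator.
data Sketch a = Sketch
  { sample :: IntMap a
  , seen   :: !Int
  , gen    :: Lcg
  }

-- Empty sample, counter at zero.
emptySketch :: Lcg -> Sketch a
emptySketch = Sketch IntMap.empty 0

-- One step of Algorithm R with sample size k. The first k elements
-- fill slots 0 .. k-1; after n elements, the next one replaces a
-- uniformly chosen slot with probability k / (n + 1).
step :: Int -> a -> Sketch a -> Sketch a
step k x (Sketch m n g)
  | n < k     = Sketch (IntMap.insert n x m) (n + 1) g
  | otherwise =
      let (j, g') = nextBelow (n + 1) g          -- j uniform in [0, n]
          m'      = if j < k then IntMap.insert j x m else m
      in Sketch m' (n + 1) g'

-- Partially apply the sample size, then fold over the input.
-- foldl' visits elements in stream order, hence the flip.
sampleList :: Int -> Word64 -> [a] -> [a]
sampleList k seed =
  IntMap.elems . sample . foldl' (flip (step k)) (emptySketch (Lcg seed))
```

Loaded in GHCi, `sampleList 3 2024 "reservoir"` yields three characters drawn from the string. The argument order of `step` (element first, state second) also fits `foldr (step k)` directly, although that visits a list from its last element to its first.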