hyperloglogplus: Approximate cardinality estimation using constant space

[ bsd3, haskell, library ]

HyperLogLog++ with MinHash for efficient cardinality and intersection estimation using constant space.

See the original AdRoll paper for details: http://tech.adroll.com/media/hllminhash.pdf


Versions: 0.1.0.0
Dependencies: base (>=4.7 && <5), bits (>=0.2 && <1), containers (>=0.5 && <0.6), murmur-hash (>=0.1 && <0.2), semigroups (>=0.18 && <1), vector (>=0.11 && <0.12)
License: BSD-3-Clause
Copyright: (c) 2016 Eugene Zhulenev
Author: Eugene Zhulenev
Maintainer: eugene.zhulenev@gmail.com
Category: Haskell
Home page: https://github.com/ezhulenev/hyperloglogplus#readme
Source repo: head: git clone https://github.com/ezhulenev/hyperloglogplus
Uploaded: by ezhulenev at 2016-07-05T18:48:26Z
Reverse Dependencies: 1 direct, 0 indirect
Downloads: 921 total (3 in the last 30 days)
Rating: 2.0 (votes: 1) [estimated by Bayesian average]
Status: Docs available
Last success reported on 2016-07-05 (1 report)

Readme for hyperloglogplus-0.1.0.0

HyperLogLogPlus

Haskell implementation of HyperLogLog++ with MinHash for efficient cardinality and intersection estimation using constant space.

See the original AdRoll paper for details: HyperLogLog and MinHash (http://tech.adroll.com/media/hllminhash.pdf)
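To make the "cardinality estimation using constant space" idea concrete, here is a minimal sketch of plain HyperLogLog with a linear-counting correction for small sets. It is not this package's implementation and omits the HyperLogLog++ refinements (empirical bias correction, sparse encoding). The names `mix64`, `precision`, `insertHll`, `unionHll`, `estimate` and `fromList` are hypothetical, and `mix64` is a stand-in hash (the SplitMix64 finalizer), not the MurmurHash the package depends on. It only needs `base` (4.8 or later for `countLeadingZeros`) and `containers`.

```haskell
import Data.Bits (countLeadingZeros, shiftL, shiftR, xor)
import qualified Data.IntMap.Strict as IntMap
import Data.List (foldl')
import Data.Word (Word64)

-- | Hypothetical stand-in hash (SplitMix64 finalizer); not the package's hash.
mix64 :: Int -> Word64
mix64 n = z2 `xor` (z2 `shiftR` 31)
  where
    z0 = fromIntegral n + 0x9e3779b97f4a7c15
    z1 = (z0 `xor` (z0 `shiftR` 30)) * 0xbf58476d1ce4e5b9
    z2 = (z1 `xor` (z1 `shiftR` 27)) * 0x94d049bb133111eb

-- | Precision (assumed value): the sketch uses 2^precision registers.
precision :: Int
precision = 12

-- | Registers keyed by index; an absent key is a register that is still zero.
type Registers = IntMap.IntMap Int

-- | Record one element: the top bits of the hash pick a register, and the
-- position of the first 1 bit in the remaining bits gives the rank to store.
insertHll :: Int -> Registers -> Registers
insertHll x = IntMap.insertWith max idx rank
  where
    h    = mix64 x
    idx  = fromIntegral (h `shiftR` (64 - precision))
    rest = h `shiftL` precision
    rank = min (64 - precision + 1) (countLeadingZeros rest + 1)

-- | Union of two sketches: register-wise maximum.
unionHll :: Registers -> Registers -> Registers
unionHll = IntMap.unionWith max

-- | Raw HyperLogLog estimate, falling back to linear counting for small sets.
estimate :: Registers -> Double
estimate regs
  | raw <= 2.5 * m && zeros > 0 = m * log (m / zeros)
  | otherwise                   = raw
  where
    m     = 2 ^ precision :: Double
    zeros = m - fromIntegral (IntMap.size regs)
    alpha = 0.7213 / (1 + 1.079 / m)
    total = zeros + sum [2 ^^ negate r | r <- IntMap.elems regs]
    raw   = alpha * m * m / total

-- | Build a sketch from a list of elements.
fromList :: [Int] -> Registers
fromList = foldl' (flip insertHll) IntMap.empty
```

Loaded in GHCi, `estimate (fromList [1 .. 75000])` should land close to 75000 while storing at most 4096 small integers. Because the union is a register-wise maximum, merging two sketches gives the same registers as sketching the combined input.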

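-- Example GHCi session:
:set -XDataKinds
:load Data.HyperLogLogPlus

-- Sketch type parameterized by two type-level naturals (12 and 8192 here).
type HLL = HyperLogLogPlus 12 8192

-- An empty sketch.
mempty :: HLL

-- Estimated number of distinct elements (true value: 75000).
size (foldr insert mempty [1 .. 75000] :: HLL)

-- Estimated cardinality of a union, merged with <> (true value: 10000).
size $ (foldr insert mempty [1 .. 5000] :: HLL) <> (foldr insert mempty [3000 .. 10000] :: HLL)

-- Estimated cardinality of an intersection (true value: 3001).
-- The :{ ... :} brackets let GHCi accept the multi-line expression.
:{
intersection $ [ (foldr insert mempty [1 .. 15000] :: HLL)
               , (foldr insert mempty [12000 .. 20000] :: HLL) ]
:}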
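For intuition about how an intersection can be estimated from sketches, here is a minimal bottom-k MinHash sketch. It illustrates the general technique (estimate the Jaccard index from the k smallest hash values, then multiply by the union's cardinality) and is not this package's implementation. All names (`mix64`, `Sketch`, `insertMin`, `sketchOf`, `jaccard`, `intersectionEstimate`) are hypothetical, `mix64` is a stand-in hash, and the union cardinality is passed in as a plain number where a real implementation would take it from the HyperLogLog part.

```haskell
import Data.Bits (shiftR, xor)
import Data.List (foldl')
import qualified Data.Set as Set
import Data.Word (Word64)

-- | Hypothetical stand-in hash (SplitMix64 finalizer); not the package's hash.
mix64 :: Int -> Word64
mix64 n = z2 `xor` (z2 `shiftR` 31)
  where
    z0 = fromIntegral n + 0x9e3779b97f4a7c15
    z1 = (z0 `xor` (z0 `shiftR` 30)) * 0xbf58476d1ce4e5b9
    z2 = (z1 `xor` (z1 `shiftR` 27)) * 0x94d049bb133111eb

-- | A bottom-k sketch: the k smallest hash values seen so far.
type Sketch = Set.Set Word64

-- | Insert one element, then drop the largest hash if the sketch exceeds k.
insertMin :: Int -> Int -> Sketch -> Sketch
insertMin k x s
  | Set.size s' > k = Set.deleteMax s'
  | otherwise       = s'
  where
    s' = Set.insert (mix64 x) s

-- | Build a bottom-k sketch from a list of elements.
sketchOf :: Int -> [Int] -> Sketch
sketchOf k = foldl' (flip (insertMin k)) Set.empty

-- | Jaccard estimate: among the k smallest hashes of the merged sketches,
-- the fraction that appears in both input sketches.
jaccard :: Int -> Sketch -> Sketch -> Double
jaccard k a b
  | Set.null u = 0
  | otherwise  = fromIntegral (Set.size both) / fromIntegral (Set.size u)
  where
    u    = Set.fromDistinctAscList (take k (Set.toAscList (Set.union a b)))
    both = Set.filter (\h -> Set.member h a && Set.member h b) u

-- | Intersection estimate: Jaccard estimate times the union's cardinality.
intersectionEstimate :: Int -> Double -> Sketch -> Sketch -> Double
intersectionEstimate k unionSize a b = jaccard k a b * unionSize
```

For the two ranges in the example above, the true Jaccard index is 3001 / 20000, so with a union cardinality of 20000 the estimate should be near 3001. When k is at least the number of distinct elements the sketch keeps every hash and the result is exact; smaller k trades accuracy for space.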

Testing

stack test