perfect-hash-generator: Perfect minimal hashing implementation in native Haskell

[ apache, data-structures, embedded, library, program ] [ Propose Tags ]

A perfect hash function for a set S is a hash function that maps distinct elements in S to a set of integers, with no collisions. A minimal perfect hash function is a perfect hash function that maps n keys to n consecutive integers, e.g. the numbers from 0 to n-1.

In contrast with the PerfectHash package, which is a binding to a C-based library, this package is a fully-native Haskell implementation.

It is intended primarily for generating C code for embedded applications (compare to gperf). The output of this tool is a pair of arrays that can be included in generated C code for allocation-free hash tables.

Though conceivably this data structure could be used directly in Haskell applications as a read-only hash table, it is not recommened, as lookups are about 10x slower than HashMap.

This implementation was adapted from Steve Hanov's Blog.

Usage

The library is written generically to hash both strings and raw integers according to the FNV-1a algorithm. Integers are split by octets before hashing.

import Data.PerfectHash.Construction (createMinimalPerfectHash)
import qualified Data.Map as Map

tuples = [
   (1000, 1)
 , (5555, 2)
 , (9876, 3)
 ]

lookup_table = createMinimalPerfectHash $ Map.fromList tuples

Generation of C code based on the arrays in lookup_table is left as an exercise to the reader. Algorithm documentation in the Data.PerfectHash.Hashing and Data.PerfectHash.Lookup modules will be helpful.

Demo

See the hash-perfectly-strings-demo and hash-perfectly-ints-demo, as well as the test suite, for working examples.

$ stack build
$ stack exec hash-perfectly-strings-demo

Modules

[Index] [Quick Jump]

Downloads

Maintainer's Corner

Package maintainers

For package maintainers and hackage trustees

Candidates

  • No Candidates
Versions [RSS] 0.1.0.0, 0.1.0.1, 0.1.0.2, 0.1.0.3, 0.1.0.4, 0.2.0.0, 0.2.0.1, 0.2.0.2, 0.2.0.3, 0.2.0.4, 0.2.0.5, 0.2.0.6, 1.0.0 (info)
Change log changelog.md
Dependencies base (>=4.5 && <5), binary, bytestring, containers, data-default, data-ordlist, directory, filepath, hashable, optparse-applicative, perfect-hash-generator, random, sorted-list, text, unordered-containers, vector [details]
License Apache-2.0
Author Karl Ostmo <kostmo@gmail.com>
Maintainer Karl Ostmo <kostmo@gmail.com>
Category Data Structures, Embedded
Home page https://github.com/kostmo/perfect-hash-generator#readme
Bug tracker https://github.com/kostmo/perfect-hash-generator/issues
Source repo head: git clone https://github.com/kostmo/perfect-hash-generator
Uploaded by kostmo at 2022-06-27T16:46:09Z
Distributions LTSHaskell:1.0.0, NixOS:1.0.0, Stackage:1.0.0
Executables hash-perfectly-strings-demo, hash-perfectly-ints-demo
Downloads 4696 total (26 in the last 30 days)
Rating 2.25 (votes: 2) [estimated by Bayesian average]
Your Rating
  • λ
  • λ
  • λ
Status Docs available [build log]
Last success reported on 2022-06-27 [all 1 reports]