fuzzystrmatch-pg: Determine string similarities and distance

[ algorithms, data, library, mit, text ] [ Propose Tags ] [ Report a vulnerability ]

Haskell implementation of PostgreSQL fuzzystrmatch extension


[Skip to Readme]

Downloads

Maintainer's Corner

Package maintainers

For package maintainers and hackage trustees

Candidates

  • No Candidates
Versions [RSS] 0.1.0.0
Change log CHANGELOG.md
Dependencies base (>=4.9 && <4.23), text (>=1.2.2 && <2.2), vector (>=0.11 && <0.14) [details]
Tested with ghc ==9.4.8, ghc ==9.6.6, ghc ==9.8.4, ghc ==9.10.1, ghc ==9.12.2
License MIT
Author Taimoor Zaeem
Maintainer Taimoor Zaeem <taimoorzaeem@gmail.com>
Uploaded by taimoorzaeem at 2026-04-17T06:34:55Z
Category Data, Text, Algorithms
Home page https://github.com/taimoorzaeem/fuzzystrmatch-pg
Bug tracker https://github.com/taimoorzaeem/fuzzystrmatch-pg/issues
Source repo head: git clone https://github.com/taimoorzaeem/fuzzystrmatch-pg
Distributions LTSHaskell:0.1.0.0, Stackage:0.1.0.0
Downloads 5 total (1 in the last 30 days)
Rating (no votes yet) [estimated by Bayesian average]
Your Rating
  • λ
  • λ
  • λ
Status Docs uploaded by user
Build status unknown [no reports yet]

Readme for fuzzystrmatch-pg-0.1.0.0

[back to package description]

fuzzystrmatch-pg

Build

Haskell implementation of PostgreSQL extension/module fuzzystrmatch.

Roadmap

  • Levenshtein - Implement Levenshtein distance functions

Quick Start

import Data.FuzzyStrMatch (levenshtein)
import Data.Text

kitten  = "kitten"  :: Text
sitting = "sitting" :: Text

ghci> levenshtein kitten sitting
3

ghci> levenshteinLessEqual kitten sitting 3
Just 3

-- Bounded version which exits early, hence much faster
ghci> levenshteinLessEqual kitten sitting 2
Nothing

Benchmark

Benchmarking with criterion library:

let source = "aaaaaaaaaaaaaaaaaaaaaaaaaaaa"
    target = "abababababababababababababababababababababababababababab"
    maxD   = 3

defaultMain
  [
    bench "levenshtein" $ nf (levenshtein source) target
  , bench "levenshteinLessEqual" $ nf (levenshteinLessEqual source target) maxD
  ]
benchmarking levenshtein
time                 217.0 μs   (214.5 μs .. 219.0 μs)
                     0.999 R²   (0.999 R² .. 1.000 R²)
mean                 212.9 μs   (211.4 μs .. 214.6 μs)
std dev              5.201 μs   (4.374 μs .. 6.315 μs)
variance introduced by outliers: 19% (moderately inflated)

benchmarking levenshteinLessEqual
time                 45.01 μs   (44.79 μs .. 45.29 μs)
                     1.000 R²   (1.000 R² .. 1.000 R²)
mean                 45.09 μs   (44.86 μs .. 45.49 μs)
std dev              936.6 ns   (617.6 ns .. 1.526 μs)
variance introduced by outliers: 17% (moderately inflated)

We believe that the difference would be much more on longer strings.