edit-distance-linear: Efficient implementation of the Levenshtein edit distance in linear memory.

[ algorithms, bsd3, library ] [ Propose Tags ] [ Report a vulnerability ]

Please see the README on GitHub at https://github.com/0xd34df00d/edit-distance-linear#readme

[Skip to Readme]

Modules

[Index] [Quick Jump]

Text
- EditDistance
  - Text.EditDistance.Linear

Flags

Manual Flags

Name	Description	Default
llvm	Use the LLVM code generator (strongly recommended)	Disabled
with-executable	Build the test executable (a fast and dirty way of benchmarking, for the package development only)	Disabled

Use -f <flag> to enable a flag, or -f -<flag> to disable that flag. More info

Downloads

edit-distance-linear-0.2.0.2.tar.gz [browse] (Cabal source package)
Package description (as included in the package)

Maintainer's Corner

Package maintainers

0xd34df00d

For package maintainers and hackage trustees

edit package information

Candidates

No Candidates

Versions [RSS]	0.2.0.1, 0.2.0.2
Change log	ChangeLog.md
Dependencies	array, base (>=4.7 && <5), bytestring, edit-distance-linear [details]
License	BSD-3-Clause
Copyright	2019 Georg Rudoy
Author	Georg Rudoy
Maintainer	0xd34df00d@gmail.com
Uploaded	by 0xd34df00d at 2019-12-08T22:44:39Z
Category	Algorithms
Home page	https://github.com/0xd34df00d/edit-distance-linear#readme
Bug tracker	https://github.com/0xd34df00d/edit-distance-linear/issues
Source repo	head: git clone https://github.com/0xd34df00d/edit-distance-linear
Distributions	NixOS:0.2.0.2
Executables	edit-distance-linear-exe
Downloads	883 total (5 in the last 30 days)
Rating	(no votes yet) [estimated by Bayesian average]
Your Rating	λ λ λ
Status	Docs available [build log] Last success reported on 2019-12-08 [all 1 reports]

Readme for edit-distance-linear-0.2.0.2

[back to package description]

edit-distance-linear

The pure Haskell implementation of the Levenshtein edit distance, with linear space complexity.

Comparison

There are already several other existing implementations, but the goals and design decisions vary. In particular, this package is intended to be used to:

compare long strings (think tens of thousands of characters), driving the implementation to live in the ST monad and aim at linear space complexity to lower GC pressure;
not care about Unicode, thus accepting ByteStrings and comparing them byte-by-byte rather than character-by-character (or glyph-by-glyph, or whatever is the right notion of an edit for Unicode).

Among the alternatives:

text-metrics — uses a similar algorithm, but cares about Unicode, making it 4-5 times slower.
edit-distance — uses a very different algorithm (which we might implement here one day with huge potential benefits), which tends to consume more memory (I'm not up for estimating its space asymptotics, though).