persistent-vector: A persistent sequence based on array mapped tries

[ bsd3, data, library ]

This package provides persistent vectors based on array mapped tries. The implementation follows the persistent vectors used in Clojure, but exposes a Haskell-style API modeled after Data.Sequence from the containers library.

Technically, the element-wise operations are O(log(n)), but the underlying tree cannot be more than 7 or 8 levels deep, so they are effectively constant time.

One change from the Clojure implementation is that this version supports O(1) slicing, though it does cheat a little: a slice retains references to elements of the original vector that can no longer be indexed through it. These extra references (and the space they occupy) can be reclaimed by shrinking the slice. This seems like a reasonable tradeoff and, I believe, mirrors the behavior of the vector library.
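The slicing tradeoff can be illustrated with a toy model. The sketch below is not the package's representation or API: the `Slice` type and every function name in it are hypothetical, and a plain list stands in for the shared trie. It only shows why slicing can be O(1) (just the offset and length change) and why a slice keeps the whole backing store alive until it is shrunk.

```haskell
-- Toy model only: a list stands in for the shared trie, and all
-- names here are hypothetical, not part of persistent-vector.
data Slice a = Slice
  { backing :: [a]  -- shared storage, never copied by sliceS
  , offset  :: Int  -- index of the first visible element
  , size    :: Int  -- number of visible elements
  }

fromListS :: [a] -> Slice a
fromListS xs = Slice xs 0 (length xs)

-- Slicing only adjusts the offset and size; the backing store is shared.
sliceS :: Int -> Int -> Slice a -> Slice a
sliceS start len (Slice b off n) =
  let start' = max 0 (min start n)
      len'   = max 0 (min len (n - start'))
  in Slice b (off + start') len'

-- Bounds-checked indexing relative to the slice.
indexS :: Slice a -> Int -> Maybe a
indexS (Slice b off n) i
  | i < 0 || i >= n = Nothing
  | otherwise       = Just (b !! (off + i))

-- Shrinking copies the visible elements and drops the hidden ones.
shrinkS :: Slice a -> Slice a
shrinkS (Slice b off n) = Slice (take n (drop off b)) 0 n

-- How many elements the slice keeps alive, visible or not.
retained :: Slice a -> Int
retained = length . backing
```

In this model, `sliceS 2 3 (fromListS [0..9])` exposes three elements but still retains all ten; applying `shrinkS` to it leaves only the three visible ones.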

Highlights:

Effectively O(1) append, indexing, and updates; O(1) length and slicing
Reasonably compact representation


Modules


Data
- Vector
  - Data.Vector.Persistent

Downloads

persistent-vector-0.2.0.tar.gz [browse] (Cabal source package)
Package description (as included in the package)

Maintainer's Corner

Package maintainers

TristanRavitch

Candidates

0.2.0

Versions [RSS]	0.1.0.0, 0.1.0.1, 0.1.0.3, 0.1.1, 0.2.0
Change log	ChangeLog.md
Dependencies	base (>=4 && <5), deepseq (>=1 && <1.5), semigroups (>=0.18 && <0.19), transformers (>=0.3 && <0.6) [details]
Tested with	ghc ==7.10.2, ghc ==8.0.2, ghc ==8.2.2, ghc ==8.4.4, ghc ==8.6.5, ghc ==8.8.4, ghc ==8.10.2
License	BSD-3-Clause
Author	Tristan Ravitch
Maintainer	tristan@ravit.ch
Category	Data
Home page	https://github.com/travitch/persistent-vector
Bug tracker	https://github.com/travitch/persistent-vector/issues
Source repo	head: git clone git://github.com/travitch/persistent-vector.git
Uploaded	by TristanRavitch at 2020-10-29T06:21:47Z
Distributions
Reverse Dependencies	3 direct, 47 indirect [details]
Downloads	3623 total (5 in the last 30 days)
Rating	2.0 (votes: 1) [estimated by Bayesian average]
Status	Docs available [build log] Last success reported on 2020-10-29 [all 1 reports]

Readme for persistent-vector-0.2.0


Persistent Vector

A library providing persistent (purely functional) vectors for Haskell based on array mapped tries.

Description

These persistent vectors are modeled on the persistent vector used by Clojure, with an API modeled after Data.Sequence from the containers library. The data structure is spine strict, so it is not useful for incremental consumption. If you need that, stick to lists. It is still lazy in the elements.
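To make "spine strict but lazy in the elements" concrete, here is a minimal sketch using a toy list type rather than the package's trie. The `SList` type and its functions are hypothetical and exist only for illustration: the tail field is strict, so the whole structure is built as soon as any of it is demanded, while the elements stay unevaluated until someone looks at them.

```haskell
-- Toy type, not the package's vector: the tail (spine) is strict,
-- the element field is lazy.
data SList a = Nil | Cons a !(SList a)

-- Building forces the entire spine, so an infinite input list
-- would never finish constructing.
fromListSpine :: [a] -> SList a
fromListSpine = foldr Cons Nil

-- Walks the spine without ever forcing an element.
lengthSpine :: SList a -> Int
lengthSpine = go 0
  where
    go acc Nil           = acc
    go acc (Cons _ rest) = let acc' = acc + 1 in acc' `seq` go acc' rest

-- Forces only the first element, and only if the caller inspects it.
headSpine :: SList a -> Maybe a
headSpine Nil        = Nothing
headSpine (Cons x _) = Just x
```

With this type, `lengthSpine (fromListSpine [undefined, undefined])` evaluates to 2 because no element is ever forced, whereas `fromListSpine [1 ..]` does not terminate, which is the sense in which such a structure cannot be consumed incrementally.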

While per-element operations are O(log(n)), the internal tree can never be more than 7 or 8 levels deep. Thus, they are effectively constant time.
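The depth bound can be checked with a little arithmetic. The text above does not state the branching factor, so the sketch below assumes 32-way nodes, as in Clojure's persistent vectors; the function `trieDepth` is a hypothetical helper, not part of the package, and it ignores details such as a separate tail buffer.

```haskell
-- Hypothetical helper: the number of trie levels needed to index n
-- elements, i.e. the smallest d >= 1 with branching^d >= n.
-- Assumes branching >= 2; Integer avoids overflow for large n.
trieDepth :: Integer -> Integer -> Int
trieDepth branching n =
  length (takeWhile (< n) (iterate (* branching) branching)) + 1

-- Under the assumed branching factor of 32:
--   trieDepth 32 33         == 2
--   trieDepth 32 (2 ^ 32)   == 7
--   trieDepth 32 (2 ^ 40)   == 8
```

Seven levels of 32-way nodes cover 32^7 = 2^35 elements, so under this assumption any vector that fits in memory today stays within the 7 or 8 levels quoted above.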

This implementation adds O(1) slicing for vectors, which I do not believe Clojure supports. The implementation cheats, though: slices can retain references to objects that can no longer be indexed.

Performance

Performance is an important consideration for a data structure like this. The package contains a criterion benchmark suite that attempts to compare the performance of persistent vectors against a variety of existing persistent data structures. As an overview of the results I have observed:

Traversing and building lists is faster than the same operations with persistent vectors.
(Strict) left folds over persistent vectors are faster than left folds over Sequences. Right folds over Sequences are faster than right folds over vectors.
Indexing persistent vectors is faster than indexing sequences and IntMaps (and, of course, lists).
Appending to vectors is slightly faster than appending to a Sequence. It is much faster than appending to an IntMap.
Updating an element at an index in a vector is slower than updating an index in a Sequence (but still faster than an IntMap).

Overall, it seems like persistent vectors are efficient at most tasks. If you only need a (strict) left fold, they are efficient for traversal. Indexing and construction are very fast, but Sequences are superior for element-wise updates.

Implementation

TODO

More of the Data.Sequence API
More efficient Eq and Ord instances. This is tricky in the presence of slicing. There are faster implementations for unsliced inputs.
Implement something to make parallel reductions simple (maybe something like vector-strategies)
Implement cons. Cons can cheaply use the space that is hidden by the offset. It can also make a variant of pushTail (pushHead) that allocates fragments of preceding sub-trees. Each cons call will modify the offset of its result vector.
