PrimitiveArray-0.10.1.0: Efficient multidimensional arrays

Data.PrimitiveArray.Sparse.IntBinSearch

Description

This solution to holding a sparse set of elements for dynamic programming. The underlying representation requires O (log n) access time for each read or write, where n is the number of elements to be stored. It uses an experimental "bucketing" system to provide better lower and upper bounds than otherwise possible.

TODO ADPfusion / FillTyLvl handles actually filling the tables. In case all BigOrder tables are dense and of the same dimensional extent, we are fine. However if at least one table is dense, while others are sparse, we will have write to nothing, which should not crash. In case of all-sparse tables for a BigOrder, we have to calculate the union of all indices. This all is currently not happening...

This version requires working fromLinearIndex but is potentially faster.

Synopsis

Documentation

data Sparse w v sh e Source #

This is a sparse matrix, where only a subset of indices have data associated.

Constructors

 Sparse FieldssparseUpperBound :: !(LimitType sh)The upper bound for the DP matrix. Not the upper bound of indexes in use, but the theoretical upper bound.sparseData :: !(v e)Vector with actually existing data.sparseIndices :: !(Vector Int)Linearly encoded sparse indicesmanhattanStart :: !(Vector Int)Provides left/right boundaries into sparseIndices to speed up index search. Should be one larger than the largest index to look up, to always provides a good excluded bound.

Instances

Instances details

type Unboxed w sh e = Sparse w Vector sh e Source #

type Storable w sh e = Sparse w Vector sh e Source #

type Boxed w sh e = Sparse w Vector sh e Source #

Helper functions.

manhattanIndex :: (SparseBucket sh, Index sh) => LimitType sh -> Vector Int -> Vector Int -> sh -> Maybe Int Source #

Find the index with manhattan helper

TODO consider using binary search instead of a linear scan here! e.g.: k = VAS.binarySearchByBounds (==)

NOTE running times with 100x100 DP problem NeedlemanWunsch full findIndex of sixs: 0,050,000 cells/sec using manhattan buckets, findIndex: 5,000,000 cells/sec using binarySearch on slices: 11,000,000 cells/sec

On a 1000x1000 DP NeedlemanWunsch problem, binary search on slices is at 6,500,000 cells/sec.

mergeIndexVectors :: (Eq sh, Ord sh, Vector w sh) => w sh -> w sh -> w sh Source #

Given two index vectors of the same shape, will return the correctly ordered vector of the union of indices.

TODO This requires that Ord (Shape O) uses the Down instance of Ord! We need to fix this in the Index modules.

TODO Rewrite to allow fusion without intermediate vectors using uncons. This will make it possible to chain applications. stream should be fine for this.