vp-tree-0.1.0.1: Vantage Point Trees
Safe HaskellNone
LanguageHaskell2010

Data.VPTree

Description

This library provides an implementation of Vantage Point Trees [1], a data structure useful for indexing data points that exist in some metric space.

The current implementation is not particolarly optimized and assumes the data resides entirely in memory but it seems to work decently well for index sizes in the 10's of thousands.

Usage

  • build : construct an index from a dataset and a distance function
  • range : find points in the index that lie within a given distance from the query

Additionally, small trees can be rendered to screen with draw for debugging purposes.

References

1) P. N. Yianilos - Data structures and algorithms for nearest neighbor search in general metric spaces - http://web.cs.iastate.edu/~honavar/nndatastructures.pdf

Synopsis

Documentation

data VPTree d a Source #

Vantage point trees

Instances

Instances details
Foldable (VPTree d) Source # 
Instance details

Defined in Data.VPTree.Internal

Methods

fold :: Monoid m => VPTree d m -> m #

foldMap :: Monoid m => (a -> m) -> VPTree d a -> m #

foldMap' :: Monoid m => (a -> m) -> VPTree d a -> m #

foldr :: (a -> b -> b) -> b -> VPTree d a -> b #

foldr' :: (a -> b -> b) -> b -> VPTree d a -> b #

foldl :: (b -> a -> b) -> b -> VPTree d a -> b #

foldl' :: (b -> a -> b) -> b -> VPTree d a -> b #

foldr1 :: (a -> a -> a) -> VPTree d a -> a #

foldl1 :: (a -> a -> a) -> VPTree d a -> a #

toList :: VPTree d a -> [a] #

null :: VPTree d a -> Bool #

length :: VPTree d a -> Int #

elem :: Eq a => a -> VPTree d a -> Bool #

maximum :: Ord a => VPTree d a -> a #

minimum :: Ord a => VPTree d a -> a #

sum :: Num a => VPTree d a -> a #

product :: Num a => VPTree d a -> a #

(Eq d, Eq a) => Eq (VPTree d a) Source # 
Instance details

Defined in Data.VPTree.Internal

Methods

(==) :: VPTree d a -> VPTree d a -> Bool #

(/=) :: VPTree d a -> VPTree d a -> Bool #

(Show d, Show a) => Show (VPTree d a) Source # 
Instance details

Defined in Data.VPTree.Internal

Methods

showsPrec :: Int -> VPTree d a -> ShowS #

show :: VPTree d a -> String #

showList :: [VPTree d a] -> ShowS #

Generic (VPTree d a) Source # 
Instance details

Defined in Data.VPTree.Internal

Associated Types

type Rep (VPTree d a) :: Type -> Type #

Methods

from :: VPTree d a -> Rep (VPTree d a) x #

to :: Rep (VPTree d a) x -> VPTree d a #

(NFData d, NFData a) => NFData (VPTree d a) Source # 
Instance details

Defined in Data.VPTree.Internal

Methods

rnf :: VPTree d a -> () #

type Rep (VPTree d a) Source # 
Instance details

Defined in Data.VPTree.Internal

type Rep (VPTree d a) = D1 ('MetaData "VPTree" "Data.VPTree.Internal" "vp-tree-0.1.0.1-EcwP6uYkoBw2ZyJEMHHmtu" 'False) (C1 ('MetaCons "VPT" 'PrefixI 'True) (S1 ('MetaSel ('Just "vpTree") 'NoSourceUnpackedness 'NoSourceStrictness 'DecidedLazy) (Rec0 (VT d a)) :*: S1 ('MetaSel ('Just "vptDistFun") 'NoSourceUnpackedness 'NoSourceStrictness 'DecidedLazy) (Rec0 (a -> a -> d))))

Construction

build Source #

Arguments

:: (RealFrac p, Floating d, Ord d, Eq a) 
=> (a -> a -> d)

distance function

-> p

proportion of remaining dataset to sample at each level, \(0 < p <= 1 \)

-> Vector a

dataset used for constructing the index

-> VPTree d a 

Build a VPTree

The supplied distance function d must satisfy the definition of a metric, i.e.

  • identity of indiscernible elements : \( d(x, y) = 0 \leftrightarrow x \equiv y \)
  • symmetry : \( d(x, y) = d(y, x) \)
  • triangle inequality : \( d(x, y) + d(y, z) >= d(x, z) \)

The current implementation makes multiple passes over the whole dataset, which is why the entire indexing dataset must be present in memory (packed as a Vector).

Implementation detail : construction of a VP-tree requires a randomized algorithm, but we run that in the ST monad so the result is pure.

Query

range Source #

Arguments

:: (Num p, Ord p) 
=> VPTree p a 
-> p

proximity threshold

-> a

query point

-> [(p, a)] 

Range query : find all points in the tree closer to the query point than a given threshold

Utilities

Rendering trees

draw :: (Show a, PrintfArg d) => VPTree d a -> IO () Source #

Render a tree to stdout

Useful for debugging

This should be called only for small trees, otherwise the printed result quickly overflows the screen and becomes hard to read.

NB : prints distance information rounded to two decimal digits