kdt-0.1.0: Fast and flexible k-d trees for various types of point queries.

Safe HaskellNone
LanguageHaskell2010

Data.KdTree.Static

Contents

Synopsis

Introduction

Let's say you have a large set of 3D points called data points, and you'd like to be able to quickly perform point queries on the data points. One example of a point query is the nearest neighbor query: given a set of data points points and a query point p, which point in points is closest to p?

We can efficiently solve the nearest neighbor query (along with many other types of point queries) if we appropriately organize the data points. One such method of organization is called the k-d tree algorithm, which is implemented in this module.

Usage

Let's say you have a list of 3D data points, and each point is of type Point3d:

data Point3d = Point3d { x :: Double
                       , y :: Double
                       , z :: Double
                       } deriving Show

We call a point's individual values axis values (i.e., x, y, and z in the case of Point3d).

In order to generate a k-d tree of Point3d's, we need to define a PointAsListFn that expresses the point's axis values as a list:

point3dAsList :: Point3d -> [Double]
point3dAsList (Point3d x y z) = [x, y, z]

Now we can build a KdTree structure from a list of data points and perform a nearest neighbor query as follows:

>>> let dataPoints = [(Point3d 0.0 0.0 0.0), (Point3d 1.0 1.0 1.0)]

>>> let kdt = buildKdTree point3dAsList dataPoints

>>> let queryPoint = Point3d 0.1 0.1 0.1

>>> nearestNeighbor kdt queryPoint
Point3d {x = 0.0, y = 0.0, z = 0.0}

Variants

Dynamic k-d trees

The KdTree structure is meant for static sets of data points. If you need to insert points into an existing k-d tree, check out Data.KdTree.Dynamic.KdTree.

k-d maps

If you need to associate additional data with each point in the tree (i.e., points are keys associated with values), check out Data.KdMap.Static.KdMap and Data.KdMap.Dynamic.KdMap for static and dynamic variants of this functionality. Please do not try to fake this functionality with a KdTree by augmenting your point type with the extra data; you're gonna have a bad time.

Advanced

Custom distance functions

You may have noticed in the previous use case that we never specified what "nearest" means for our points. By default, buildKdTree uses a Euclidean distance function that is sufficient in most cases. However, point queries are typically faster on a KdTree built with a user-specified custom distance function. Let's generate a KdTree using a custom distance function.

One idiosyncrasy about KdTree is that custom distance functions are actually specified as squared distance functions (SquaredDistanceFn). This means that your custom distance function must return the square of the actual distance between two points. This is for efficiency: regular distance functions often require expensive square root computations, whereas in our case, the squared distance works fine and doesn't require computing any square roots. Here's an example of a squared distance function for Point3d:

point3dSquaredDistance :: Point3d -> Point3d -> Double
point3dSquaredDistance (Point3d x1 y1 z1) (Point3d x2 y2 z2) =
  let dx = x1 - x2
      dy = y1 - y2
      dz = z1 - z2
  in  dx * dx + dy * dy + dz * dz

We can build a KdTree using our custom distance function as follows:

>>> let kdt = buildKdTreeWithDistFn point3dAsList point3dSquaredDistance points

Axis value types

In the above examples, we used a point type with axis values of type Double. We can in fact use axis values of any type that is an instance of the Real typeclass. This means you can use points that are composed of Doubles, Ints, Floats, and so on:

data Point2i = Point2i Int Int

point2iAsList :: Point2i -> [Int]
point2iAsList (Point2i x y) = [x, y]

kdt :: [Point2i] -> KdTree Int Point2i
kdt dataPoints = buildKdTree point2iAsList dataPoints

Reference

Types

type PointAsListFn a p = p -> [a] Source

Converts a point of type p with axis values of type a into a list of axis values [a].

type SquaredDistanceFn a p = p -> p -> a Source

Returns the squared distance between two points of type p with axis values of type a.

data KdTree a p Source

A k-d tree structure that stores points of type p with axis values of type a.

Instances

Foldable (KdTree a) 
Generic (KdTree a p) 
(NFData a, NFData p) => NFData (KdTree a p) 
type Rep (KdTree a p) 

k-d tree construction

buildKdTree Source

Arguments

:: Real a 
=> PointAsListFn a p 
-> [p]

non-empty list of data points to be stored in the k-d tree

-> KdTree a p 

Builds a KdTree from a list of data points using a default squared distance function defaultDistSqrFn.

Average complexity: O(n * log(n)) for n data points.

Worst case time complexity: O(n^2) for n data points.

Worst case space complexity: O(n) for n data points.

Throws an error if given an empty list of data points.

buildKdTreeWithDistFn :: Real a => PointAsListFn a p -> SquaredDistanceFn a p -> [p] -> KdTree a p Source

Builds a KdTree from a list of data points using a user-specified squared distance function.

Average time complexity: O(n * log(n)) for n data points.

Worst case time complexity: O(n^2) for n data points.

Worst case space complexity: O(n) for n data points.

Throws an error if given an empty list of data points.

Query

nearestNeighbor :: Real a => KdTree a p -> p -> p Source

Given a KdTree and a query point, returns the nearest point in the KdTree to the query point.

Average time complexity: O(log(n)) for n data points.

Worst case time complexity: O(n) for n data points.

pointsInRadius Source

Arguments

:: Real a 
=> KdTree a p 
-> a

radius

-> p

query point

-> [p]

list of points in tree with given radius of query point

Given a KdTree, a query point, and a radius, returns all points in the KdTree that are within the given radius of the query point.

Points are not returned in any particular order.

Worst case time complexity: O(n) for n data points and a radius that subsumes all points in the structure.

kNearestNeighbors :: Real a => KdTree a p -> Int -> p -> [p] Source

Given a KdTree, a query point, and a number k, returns the k nearest points in the KdTree to the query point.

Neighbors are returned in order of increasing distance from query point.

Average time complexity: log(k) * log(n) for k nearest neighbors on a structure with n data points.

Worst case time complexity: n * log(k) for k nearest neighbors on a structure with n data points.

points :: KdTree a p -> [p] Source

Returns a list of all the points in the KdTree.

Time complexity: O(n) for n data points.

pointsInRange Source

Arguments

:: Real a 
=> KdTree a p 
-> p

lower bounds of range

-> p

upper bounds of range

-> [p]

all points within given range

Finds all points in a KdTree with points within a given range, where the range is specified as a set of lower and upper bounds.

Points are not returned in any particular order.

Worst case time complexity: O(n) for n data points and a range that spans all the points.

size :: KdTree a p -> Int Source

Returns the number of elements in the KdTree.

Time complexity: O(1)

Utilities

defaultDistSqrFn :: Num a => PointAsListFn a p -> SquaredDistanceFn a p Source

A default implementation of squared distance given two points and a PointAsListFn.