Copyright	(c) 2015 Kai Zhang
License	MIT
Maintainer	kai@kzhang.org
Stability	experimental
Portability	portable
Safe Haskell	None
Language	Haskell2010

AI.Clustering.KMeans

Contents

Initialization methods
Useful functions
References

Description

K-means clustering

Documentation

data KMeans

Result of running k-means clustering

Constructors

KMeans
Fields
_clusters :: Vector Int	A vector of integers (0 to k-1) indicating the cluster to which each point is allocated.
_centers :: Matrix Double	A matrix of cluster centers.

Instances

Show KMeans

kmeans :: (PrimMonad m, Matrix mat Vector Double) => Gen (PrimState m) -> Method -> Int -> mat Vector Double -> m KMeans

Perform K-means clustering

kmeansBy

Arguments

:: (PrimMonad m, Vector v a)
=> Gen (PrimState m)
-> Method
-> Int	number of clusters
-> v a	data stored in rows
-> (a -> Vector Double)
-> m KMeans

K-means clustering of arbitrary items, each converted to a Vector Double by the supplied function

kmeansWith

Arguments

:: Vector v a
=> Matrix Double	initial set of k centroids
-> v a	each row represents a point
-> (a -> Vector Double)
-> KMeans

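K-means clustering starting from a given set of initial centroids, with each item converted to a Vector Double by the supplied function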
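To illustrate what iterating from a fixed set of initial centroids involves, here is a minimal sketch of the standard Lloyd iteration on plain lists. It uses only base and does not call this module; the names Point, sqDist, nearest, centroid, step and lloyd are hypothetical, and treating Lloyd's method as the algorithm behind kmeansWith is an assumption.

```haskell
import Data.List (minimumBy, transpose)
import Data.Ord (comparing)

-- Hypothetical stand-in for a data row; not a type from this module.
type Point = [Double]

-- Squared Euclidean distance between two points of equal dimension.
sqDist :: Point -> Point -> Double
sqDist p q = sum [ (x - y) ^ (2 :: Int) | (x, y) <- zip p q ]

-- Index of the centroid closest to a point (ties go to the lowest index).
nearest :: [Point] -> Point -> Int
nearest cs p = fst (minimumBy (comparing snd) (zip [0 ..] (map (sqDist p) cs)))

-- Coordinate-wise mean of a non-empty list of points.
centroid :: [Point] -> Point
centroid ps = map (/ fromIntegral (length ps)) (map sum (transpose ps))

-- One Lloyd step: label every point, then recompute every centroid.
-- A centroid with no assigned points is left where it was.
step :: [Point] -> [Point] -> ([Int], [Point])
step cs ps = (labels, zipWith update [0 ..] cs)
  where
    labels = map (nearest cs) ps
    update i c = case [ p | (l, p) <- zip labels ps, l == i ] of
      []      -> c
      members -> centroid members

-- Repeat the step until the centroids stop changing.
lloyd :: [Point] -> [Point] -> ([Int], [Point])
lloyd cs ps
  | cs' == cs = (labels, cs)
  | otherwise = lloyd cs' ps
  where
    (labels, cs') = step cs ps
```

Calling lloyd with the initial centroids first and the data second mirrors the argument order of kmeansWith, with the labels and final centroids playing the roles of _clusters and _centers.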

Initialization methods

data Method

Methods for choosing the initial centroids

Constructors

Forgy	The Forgy method randomly chooses k unique observations from the data set and uses these as the initial means.
KMeansPP	The k-means++ seeding algorithm (Arthur and Vassilvitskii, 2007).
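The seeding rule in k-means++ picks each new center from the data with probability proportional to its squared distance from the nearest center already chosen. The sketch below shows that weighting step on plain lists, using only base. The names Point, sqDist, weights and pickWeighted are hypothetical, and the uniform draw u is passed in as an argument rather than taken from a Gen, so this is not how the module itself is implemented.

```haskell
-- Hypothetical stand-in for a data row; not a type from this module.
type Point = [Double]

-- Squared Euclidean distance between two points of equal dimension.
sqDist :: Point -> Point -> Double
sqDist p q = sum [ (x - y) ^ (2 :: Int) | (x, y) <- zip p q ]

-- Squared distance from every point to its nearest chosen centre.
-- Assumes at least one centre has already been chosen.
weights :: [Point] -> [Point] -> [Double]
weights centres = map (\p -> minimum (map (sqDist p) centres))

-- Pick an index with probability proportional to its weight, given a
-- uniform draw u in [0, 1). Assumes a non-empty list with a positive sum.
pickWeighted :: [Double] -> Double -> Int
pickWeighted ws u = length (takeWhile (<= target) (init (scanl1 (+) ws)))
  where
    target = u * sum ws
```

A point that is already a center has weight zero and so is never picked again, which is what keeps the chosen seeds spread out.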

Useful functions

decode :: KMeans -> [a] -> [[a]]

Assign data to clusters based on KMeans result
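As a rough picture of what assigning data to clusters means, the sketch below groups a list of items by a parallel list of cluster labels, using only base. The function groupByLabel and its explicit k argument are hypothetical, and returning the groups in cluster-index order is an assumption about decode, not something this page states.

```haskell
-- Group items by cluster label; cluster i becomes the i-th inner list.
-- k is the number of clusters; labels are assumed to lie in 0 .. k-1
-- and to be listed in the same order as the items.
groupByLabel :: Int -> [Int] -> [a] -> [[a]]
groupByLabel k labels xs =
  [ [ x | (l, x) <- zip labels xs, l == i ] | i <- [0 .. k - 1] ]
```

Here the label list plays the role of the _clusters field, so the items must be supplied in the same order as the rows that were clustered.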

withinSS :: KMeans -> Matrix Double -> [Double]

Compute within-cluster sum of squares
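The within-cluster sum of squares for a cluster is the sum of squared Euclidean distances from each member to that cluster's center. A minimal list-based sketch, using only base, follows. The names Point, sqDist and withinSumSq are hypothetical, and returning one value per cluster in cluster-index order is an assumption based on the [Double] result type.

```haskell
-- Hypothetical stand-in for a data row; not a type from this module.
type Point = [Double]

-- Squared Euclidean distance between two points of equal dimension.
sqDist :: Point -> Point -> Double
sqDist p q = sum [ (x - y) ^ (2 :: Int) | (x, y) <- zip p q ]

-- One sum per cluster: squared distances from each member to its centre.
-- Labels are assumed to index into the list of centres.
withinSumSq :: [Point] -> [Int] -> [Point] -> [Double]
withinSumSq centres labels ps =
  [ sum [ sqDist c p | (l, p) <- zip labels ps, l == i ]
  | (i, c) <- zip [0 ..] centres ]
```

Summing the resulting list gives the total within-cluster sum of squares, a common figure for comparing runs with different seeds or values of k.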

References

Arthur, D. and Vassilvitskii, S. (2007). k-means++: the advantages of careful seeding. Proceedings of the eighteenth annual ACM-SIAM symposium on Discrete algorithms. Society for Industrial and Applied Mathematics, Philadelphia, PA, USA, pp. 1027–1035.