clustering-0.2.0: High performance clustering algorithms

Copyright: (c) 2015 Kai Zhang
License: MIT
Maintainer: kai@kzhang.org
Stability: experimental
Portability: portable
Safe Haskell: None
Language: Haskell2010

AI.Clustering.KMeans

Description

K-means clustering.

Documentation

data KMeans

Result of running k-means.

Constructors

KMeans 

Fields

_clusters :: Vector Int

A vector of integers in the range 0 to k-1 giving the cluster to which each point is assigned.

_centers :: Matrix Double

A matrix of cluster centers.

kmeans :: (PrimMonad m, Matrix mat Vector Double) => Gen (PrimState m) -> Method -> Int -> mat Vector Double -> m KMeans

Perform K-means clustering

kmeansBy

Arguments

:: (PrimMonad m, Vector v a) 
=> Gen (PrimState m) 
-> Method 
-> Int

number of clusters

-> v a

data stored in rows

-> (a -> Vector Double) 
-> m KMeans 

K-means clustering of arbitrary elements; the function argument maps each element to its feature vector.

kmeansWith

Arguments

:: Vector v a 
=> Matrix Double

initial set of k centroids

-> v a

each row represents a point

-> (a -> Vector Double) 
-> KMeans 

K-means clustering starting from a given set of initial centroids.
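
To make the role of the initial centroids concrete, here is a minimal sketch of the standard assign-and-update (Lloyd) iteration over plain lists. It is not this package's implementation: the names Point, sqDist, nearest, mean, step and lloyd are hypothetical, points are given directly rather than through a feature function, and keeping an empty cluster's centroid unchanged is an assumption of the sketch.

```haskell
import Data.List (minimumBy, transpose)
import Data.Ord (comparing)

-- Hypothetical stand-in for a row of the data matrix.
type Point = [Double]

-- Squared Euclidean distance between two points of equal dimension.
sqDist :: Point -> Point -> Double
sqDist p q = sum [ (x - y) ^ (2 :: Int) | (x, y) <- zip p q ]

-- Index of the centroid closest to a point.
nearest :: [Point] -> Point -> Int
nearest centers p =
  fst (minimumBy (comparing snd) (zip [0 ..] (map (sqDist p) centers)))

-- Coordinate-wise mean of a non-empty list of points.
mean :: [Point] -> Point
mean ps = map (/ fromIntegral (length ps)) (map sum (transpose ps))

-- One assignment step followed by one update step.
-- Assumption of this sketch: a centroid with no members is left as it is.
step :: [Point] -> [Point] -> ([Int], [Point])
step centers pts = (labels, centers')
  where
    labels = map (nearest centers) pts
    centers' =
      [ if null members then c else mean members
      | (i, c) <- zip [0 ..] centers
      , let members = [ p | (l, p) <- zip labels pts, l == i ]
      ]

-- Repeat until the centroids no longer move.
lloyd :: [Point] -> [Point] -> ([Int], [Point])
lloyd centers pts
  | centers' == centers = (labels, centers)
  | otherwise = lloyd centers' pts
  where
    (labels, centers') = step centers pts
```

For example, lloyd [[0,0],[10,10]] [[0,0],[0,1],[10,10],[10,11]] yields labels in the same 0 to k-1 convention as _clusters, together with the final centroids.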

Initialization methods

data Method

Different initialization methods

Constructors

Forgy

The Forgy method randomly chooses k unique observations from the data set and uses these as the initial means.

KMeansPP

K-means++ algorithm (Arthur and Vassilvitskii, 2007).
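
The two initialization strategies can be sketched over plain lists as follows. This is an illustration under stated assumptions, not the package's code: forgyFrom, pickWeighted and seedPP are hypothetical names, and randomness is supplied by the caller (a finite list of candidate indices for Forgy, a list of uniform draws in [0, 1) for K-means++) so that the sketch stays pure and depends only on base.

```haskell
import Data.List (nub)

-- Hypothetical stand-in for a row of the data matrix.
type Point = [Double]

-- Squared Euclidean distance between two points of equal dimension.
sqDist :: Point -> Point -> Double
sqDist p q = sum [ (x - y) ^ (2 :: Int) | (x, y) <- zip p q ]

-- Forgy: take the first k distinct, in-range indices from a finite list
-- of candidate indices and use those observations as initial means.
forgyFrom :: [Int] -> Int -> [a] -> [a]
forgyFrom idxs k pts = map (pts !!) (take k (nub (filter inRange idxs)))
  where
    inRange i = i >= 0 && i < length pts

-- Pick the element whose cumulative weight first exceeds u * total,
-- for a uniform draw u in [0, 1).
pickWeighted :: Double -> [(Double, a)] -> a
pickWeighted u wxs = go (u * sum (map fst wxs)) wxs
  where
    go _ [(_, x)] = x
    go t ((w, x) : rest)
      | t < w = x
      | otherwise = go (t - w) rest
    go _ [] = error "pickWeighted: empty input"

-- K-means++ seeding: the first center is chosen uniformly, each later one
-- with probability proportional to its squared distance from the nearest
-- center chosen so far.
seedPP :: [Double] -> Int -> [Point] -> [Point]
seedPP us k pts
  | k <= 0 || null pts = []
  | otherwise = case us of
      [] -> []
      (u0 : rest) ->
        reverse (go rest [pts !! floor (u0 * fromIntegral (length pts))])
  where
    go draws cs
      | length cs >= k = cs
      | otherwise = case draws of
          [] -> cs
          (u : more) -> go more (pickWeighted u weighted : cs)
      where
        weighted = [ (minimum (map (sqDist p) cs), p) | p <- pts ]
```

Points already chosen as centers carry weight zero, so the weighted draw favours observations far from every existing center.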

Useful functions

decode :: KMeans -> [a] -> [[a]]

Assign data to clusters based on KMeans result
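
As an illustration of the grouping that the type of decode suggests, the sketch below partitions a list by a parallel list of cluster labels. groupByLabel is a hypothetical helper working on a plain label list instead of a KMeans value, and placing cluster i at position i of the result is an assumption of the sketch rather than documented behaviour.

```haskell
-- Group items by cluster label. Position i of the result holds the items
-- whose label is i, in their original order. k is the number of clusters.
groupByLabel :: Int -> [Int] -> [a] -> [[a]]
groupByLabel k labels xs =
  [ [ x | (l, x) <- zip labels xs, l == i ] | i <- [0 .. k - 1] ]
```

With labels [0,1,0,1] and the items "abcd", this gives one list per cluster, each containing the items that carry that label.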

withinSS :: KMeans -> Matrix Double -> [Double]

Compute within-cluster sum of squares
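
For reference, the within-cluster sum of squares of a cluster is the sum of squared Euclidean distances from its members to its center. The sketch below computes one value per cluster over plain lists. withinSSList, Point and sqDist are hypothetical names, and the function takes centers and labels directly, so it only approximates the shape of withinSS.

```haskell
-- Hypothetical stand-in for a row of the data matrix.
type Point = [Double]

-- Squared Euclidean distance between two points of equal dimension.
sqDist :: Point -> Point -> Double
sqDist p q = sum [ (x - y) ^ (2 :: Int) | (x, y) <- zip p q ]

-- One sum of squares per cluster, in the order of the centers.
-- A cluster with no members contributes 0.
withinSSList :: [Point] -> [Int] -> [Point] -> [Double]
withinSSList centers labels pts =
  [ sum [ sqDist c p | (l, p) <- zip labels pts, l == i ]
  | (i, c) <- zip [0 ..] centers
  ]
```

Summing the resulting list gives the total within-cluster sum of squares, which is the quantity that k-means iterations reduce.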

References

Arthur, D. and Vassilvitskii, S. (2007). k-means++: the advantages of careful seeding. Proceedings of the Eighteenth Annual ACM-SIAM Symposium on Discrete Algorithms. Society for Industrial and Applied Mathematics, Philadelphia, PA, USA, pp. 1027–1035.