k-Means Analysis 

ClustanGraphics now offers two k-means procedures.

Cluster k-Means has been optimized for speed and is suitable for very large data sets.  For example, we have classified a million cases on a standard PC.  It handles mixed data types with missing values, supports contiguity constraints, and allows differential case weights and differential variable weights.  The criterion functions that can be optimized include the Euclidean Sum of Squares and a criterion based on rho, the Pearson product-moment correlation between a case and a cluster mean.

Euclidean Sum of Squares is the recommended criterion function because an exact relocation test has been implemented for it, so k-means is guaranteed to converge if allowed sufficient iterations.  For example, see our data mining case study of a million cases, clustered in minutes on a PC, or read our k-means technical critique.
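
To illustrate why an exact relocation test leads to convergence, here is a minimal Python sketch. It is not the ClustanGraphics code; the function names `ess` and `relocate_kmeans` are hypothetical, and NumPy is assumed. It uses the standard result that moving a case from cluster p (size n_p) to cluster q (size n_q) reduces ESS exactly when n_q/(n_q+1) times its squared distance to the mean of q is less than n_p/(n_p-1) times its squared distance to the mean of p. Because every accepted move strictly lowers ESS and there are finitely many partitions, the loop must stop.

```python
import numpy as np

def ess(X, labels, k):
    """Euclidean Sum of Squares: squared distances of cases to their cluster means."""
    total = 0.0
    for c in range(k):
        members = X[labels == c]
        if len(members):
            total += ((members - members.mean(axis=0)) ** 2).sum()
    return total

def relocate_kmeans(X, labels, k, max_iter=100):
    """Move one case at a time, and only when the move strictly reduces ESS."""
    labels = labels.copy()
    counts = np.bincount(labels, minlength=k).astype(float)
    sums = np.zeros((k, X.shape[1]))
    np.add.at(sums, labels, X)              # per-cluster coordinate sums
    for _ in range(max_iter):
        moved = False
        for i, x in enumerate(X):
            p = labels[i]
            if counts[p] <= 1:
                continue                    # never empty a cluster
            means = sums / np.maximum(counts, 1.0)[:, None]
            d2 = ((means - x) ** 2).sum(axis=1)
            # ESS removed by taking x out of p, and ESS added by putting x into q
            loss = counts[p] / (counts[p] - 1) * d2[p]
            gain = counts / (counts + 1) * d2
            gain[p] = np.inf
            q = int(np.argmin(gain))
            if gain[q] < loss - 1e-12:      # exact test: net change in ESS is negative
                sums[p] -= x
                sums[q] += x
                counts[p] -= 1
                counts[q] += 1
                labels[i] = q
                moved = True
        if not moved:
            break                           # no case can be moved: converged
    return labels
```

A typical call would pass an integer array of starting cluster labels, for example `relocate_kmeans(X, start_labels, k=3)`, and compare `ess(X, start_labels, 3)` with the value after relocation.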

If you are involved in data mining or analyzing large social surveys, remember that our k-means analysis and hierarchical cluster analysis can handle different types of variables, such as occur in survey questionnaires and database records.  You are not likely to find similar flexibility in other clustering or neural network software.
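
One common way to compare a case with a cluster mean when some values are missing, and when variables carry different weights, is sketched below in Python. This is an assumption for illustration only, not the documented ClustanGraphics formula: the function name `weighted_sq_distance` is hypothetical, missing values are assumed to be coded as NaN, and the distance is averaged over the variables present in both vectors.

```python
import numpy as np

def weighted_sq_distance(x, mean, var_weights):
    """Weighted mean squared difference over variables present in both vectors."""
    present = ~np.isnan(x) & ~np.isnan(mean)    # skip variables with a missing value
    w = var_weights[present]
    if w.sum() == 0:
        return float("nan")                     # nothing comparable
    diff2 = (x[present] - mean[present]) ** 2
    return float((w * diff2).sum() / w.sum())

case = np.array([1.0, np.nan, 3.0])             # second variable is missing
cluster_mean = np.array([0.0, 2.0, 1.0])
weights = np.array([1.0, 1.0, 2.0])             # third variable counts double
print(weighted_sq_distance(case, cluster_mean, weights))  # 3.0
```

Dividing by the sum of the weights actually used keeps distances comparable between cases with different numbers of missing values.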

FocalPoint Clustering performs a number of random trials to optimize ESS and finds several "top solutions" from the same data.  It was developed specifically for use in market segmentation, and offers several unique features.  There is a separate FocalPoint Clustering User Guide.
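
The general idea of running random trials and keeping several top solutions can be sketched as follows. This is a generic multi-start k-means in Python, not the FocalPoint algorithm itself, whose details are in its own User Guide. The names `kmeans_once` and `top_solutions` and the parameters `trials` and `keep` are hypothetical, and a plain assign-and-update iteration is used in place of an exact relocation test.

```python
import numpy as np

def kmeans_once(X, k, rng, max_iter=100):
    """One k-means run started from k randomly chosen cases; returns (ESS, labels)."""
    means = X[rng.choice(len(X), size=k, replace=False)].copy()
    labels = np.full(len(X), -1)
    for _ in range(max_iter):
        d2 = ((X[:, None, :] - means[None, :, :]) ** 2).sum(axis=2)
        new_labels = d2.argmin(axis=1)
        if np.array_equal(new_labels, labels):
            break                               # assignments stable
        labels = new_labels
        for c in range(k):
            if np.any(labels == c):
                means[c] = X[labels == c].mean(axis=0)
    ess = float(((X - means[labels]) ** 2).sum())
    return ess, labels

def top_solutions(X, k, trials=20, keep=3, seed=0):
    """Run several random trials and return the distinct partitions with lowest ESS."""
    rng = np.random.default_rng(seed)
    found = {}
    for _ in range(trials):
        ess, labels = kmeans_once(X, k, rng)
        # Relabel clusters by first appearance so equal partitions compare equal
        order = {}
        canon = tuple(order.setdefault(int(l), len(order)) for l in labels)
        found.setdefault(canon, ess)
    ranked = sorted(found.items(), key=lambda item: item[1])
    return [(ess, labels) for labels, ess in ranked[:keep]]
```

Each entry returned by `top_solutions(X, k=4)` is an `(ESS, labels)` pair, ordered from lowest ESS upward, so distinct partitions reached from different starting conditions can be compared side by side.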

k-Means Tree produces a tree that summarizes a k-means cluster model, either in full or for k clusters.

See our technical critique of k-means clustering.  It compares our implementation with other software, shows how ours is guaranteed to converge, and explains why deleting outliers is important and why different starting conditions can produce different final classifications, some of which may be sub-optimal.

To find out more, ORDER ClustanGraphics on-line now.