Constrained Clustering

For certain clustering applications it may be necessary to restrict the range of possible cluster solutions.  Particularly when clustering spatial locations, pixels in an image or zones on a map, we may require each cluster to be an unfragmented, contiguous spatial area.  Applications include determining traffic zones for modelling transportation policies, defining marketing territories, socio-economic or legislative districts, ecological boundaries in remotely-sensed images, land use, census analysis and the sequencing of contiguous genomic regions.

The following small example illustrates contiguity constraints applied to 25 cases, obtained by clicking View/Contiguities in ClustanGraphics5.  Please bear in mind that this can apply to 100,000+ units.  The first line indicates that case 1 will only be included within a cluster that also contains at least one of the cases 14, 25, 3, 23, 10, or 2.  These are the six cases which are contiguous to case 1.  Note that cases 1, 2 and 3 are mutually contiguous (mutual contiguity is a desirable, though not mandatory, requirement).  Case 4 is most probably at a corner of a map, or otherwise remote, as it is contiguous only to case 10.
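As an illustration only, contiguity lists of this kind can be held as a mapping from each case to the set of cases it is contiguous to.  The sketch below is not ClustanGraphics code; the names contiguities, may_join and make_mutual are hypothetical, and the data are limited to the facts stated above (the full 25-case table is not reproduced, so the entries for cases 2 and 3 are partial).

```python
# Partial contiguity lists: only what the text states, not the full 25-case table.
contiguities = {
    1: {14, 25, 3, 23, 10, 2},  # the six cases contiguous to case 1
    2: {1, 3},                  # partial: cases 1, 2 and 3 are mutually contiguous
    3: {1, 2},                  # partial
    4: {10},                    # case 4 is contiguous only to case 10
}


def may_join(case, cluster, contig):
    """True if `case` is contiguous to at least one member of `cluster`."""
    return bool(contig.get(case, set()) & set(cluster))


def make_mutual(contig):
    """Return a copy in which b in contig[a] always implies a in contig[b]."""
    result = {case: set(nbrs) for case, nbrs in contig.items()}
    for case, nbrs in contig.items():
        for nbr in nbrs:
            result.setdefault(nbr, set()).add(case)
    return result
```

Here may_join(1, {2, 7}, contiguities) is true because case 2 is contiguous to case 1, while may_join(4, {1, 2, 3}, contiguities) is false because case 4 can only join a cluster that contains case 10.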

Another application of constrained clustering occurs where the cases are ordered, for example by time, stratigraphy or position in a transect or chromosome, and the classification is required to respect the ordering.  To achieve this, the contiguous neighbours of each case would be those cases (usually 2) which immediately precede and immediately follow it in the ordering.
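For ordered cases, the contiguity lists follow directly from the ordering.  A minimal sketch, assuming the cases are numbered 1 to n in their order (the function name ordered_contiguities is hypothetical):

```python
def ordered_contiguities(n):
    """Contiguity lists for n ordered cases numbered 1..n.

    Each case is contiguous to its immediate predecessor and successor;
    the first and last cases have only one neighbour each.
    """
    return {
        i: [j for j in (i - 1, i + 1) if 1 <= j <= n]
        for i in range(1, n + 1)
    }
```

With these lists, any cluster formed under the constraints is an unbroken run of consecutive cases.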

There are no strict rules for specifying contiguities.  A case may have as many, or as few, contiguous neighbours as are deemed appropriate.  For example, a case can have no contiguous neighbours, but this is not very useful as such cases would never be clustered while contiguity constraints are in effect.  However, contiguous cases should normally be mutually contiguous, as in cases 1, 2 and 3 in the above example.

With ClustanGraphics, contiguity constraints can be read and saved with the data and cluster model.  They can also be computed from nearest neighbours, as in the following example for the Mammals case study.  The result, shown below, indicates three discontiguous clusters having no neighbours in common at the 3-cluster level.
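The text does not say how ClustanGraphics derives contiguities from nearest neighbours.  One plausible approach, sketched here purely as an assumption, is to link each case to its k nearest neighbours by Euclidean distance and then add the reverse links so that the result is mutual.  The function name knn_contiguities and the parameter k are hypothetical.

```python
import math


def knn_contiguities(points, k):
    """Link each case to its k nearest neighbours, then make the links mutual.

    points: list of coordinate tuples, indexed from 0.
    Returns a dict mapping each case index to a set of neighbour indices.
    """
    n = len(points)
    contig = {i: set() for i in range(n)}
    for i in range(n):
        others = sorted(
            (j for j in range(n) if j != i),
            key=lambda j: math.dist(points[i], points[j]),
        )
        for j in others[:k]:
            contig[i].add(j)
            contig[j].add(i)  # reverse link, so contiguity is mutual
    return contig
```

With a small k, well-separated groups of cases end up with no neighbours in common, which is the situation in which clusters remain discontiguous.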

In ClustanGraphics5, contiguity constraints can be applied to hierarchical cluster analysis (11 methods), k-means analysis and direct data clustering.  When contiguity constraints are used with direct data clustering, hierarchical cluster analysis with contiguous clusters can be obtained for images of 100,000 pixels, or more.  In most applications, contiguity constraints actually speed up the clustering process because the number of cluster fusions or relocations to be evaluated is restricted.
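To show why the constraints reduce the work, here is a minimal sketch of contiguity-constrained agglomeration.  It is not one of the ClustanGraphics methods: it assumes centroid distance as the fusion criterion, and the name constrained_agglomerate is hypothetical.  At each step only pairs of contiguous clusters are evaluated, rather than all pairs.

```python
import math


def constrained_agglomerate(points, contig, n_clusters):
    """Fuse the closest pair of contiguous clusters until n_clusters remain.

    points: list of coordinate tuples, indexed from 0.
    contig: dict mapping a case index to an iterable of contiguous case indices.
    Stops early if no contiguous pair of clusters is left.
    """
    clusters = {i: [i] for i in range(len(points))}
    # Cluster-level contiguity, made mutual and without self-links.
    links = {i: set() for i in clusters}
    for i, nbrs in contig.items():
        for j in nbrs:
            if i != j:
                links[i].add(j)
                links[j].add(i)

    def centroid(members):
        dims = zip(*(points[m] for m in members))
        return [sum(d) / len(members) for d in dims]

    while len(clusters) > n_clusters:
        # Only contiguous pairs are candidates for fusion.
        candidates = [
            (math.dist(centroid(clusters[a]), centroid(clusters[b])), a, b)
            for a in clusters for b in links[a] if a < b
        ]
        if not candidates:
            break
        _, a, b = min(candidates)
        clusters[a].extend(clusters.pop(b))
        # Neighbours of the absorbed cluster b become neighbours of a.
        for c in links.pop(b):
            links[c].discard(b)
            if c != a:
                links[c].add(a)
                links[a].add(c)
    return sorted(sorted(members) for members in clusters.values())
```

A case with no contiguous neighbours is never fused: for example, constrained_agglomerate([(0,), (0.1,), (5,)], {0: {2}, 2: {0}}, 1) joins cases 0 and 2 and leaves case 1 on its own, even though case 1 is much closer to case 0.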

Data which are arranged on a grid, such as pixels in a remotely-sensed image, can be clustered subject to spatial contiguity constraints.  Our software will also compute the spatial contiguities from pixel co-ordinates.  Please ask for further details.
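For gridded data the contiguities can be derived from the row and column positions alone.  The sketch below assumes that two pixels are contiguous when they share an edge (four-neighbour adjacency); whether Clustan's software uses this rule or also includes diagonal neighbours is not stated, and the name grid_contiguities is hypothetical.

```python
def grid_contiguities(rows, cols):
    """Edge-sharing (4-neighbour) contiguity lists for a rows x cols pixel grid.

    Pixels are identified by (row, col) tuples, both counted from 0.
    """
    contig = {}
    for r in range(rows):
        for c in range(cols):
            nbrs = []
            for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                rr, cc = r + dr, c + dc
                if 0 <= rr < rows and 0 <= cc < cols:
                    nbrs.append((rr, cc))
            contig[(r, c)] = nbrs
    return contig
```

Corner pixels receive two neighbours, edge pixels three and interior pixels four, and the resulting lists are mutual by construction.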

Constrained clustering was introduced in ClustanGraphics 4.11.

Clustan - A Class Act © 1998 Clustan Ltd