In our Cars case study, a distance matrix was computed from standardized variables. It has the following values:
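As a rough illustration of how such a matrix can be computed, the sketch below standardizes each variable to z-scores and then takes Euclidean distances between cases. The choice of z-scores and Euclidean distance, the helper names `standardize` and `distance_matrix`, and the four-car data set are all assumptions made for the example; they are not taken from the Cars case study.

```python
import math
from statistics import mean, pstdev

def standardize(rows):
    """Convert each column to z-scores (mean 0, population std. dev. 1)."""
    cols = list(zip(*rows))
    mus = [mean(c) for c in cols]
    sds = [pstdev(c) for c in cols]  # assumes no column is constant
    return [[(x - m) / s for x, m, s in zip(row, mus, sds)] for row in rows]

def distance_matrix(rows):
    """Full symmetric matrix of Euclidean distances between rows."""
    n = len(rows)
    return [[math.dist(rows[i], rows[j]) for j in range(n)] for i in range(n)]

# Hypothetical data: (engine size in litres, power in bhp) for four cars.
cars = [[1.0, 60.0], [1.2, 75.0], [3.0, 220.0], [3.2, 240.0]]
D = distance_matrix(standardize(cars))

for row in D:
    print(" ".join(f"{v:6.3f}" for v in row))
```

Standardizing first keeps a variable measured in large units (here, power) from dominating the distances.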
We obtained a hierarchical cluster analysis of this distance matrix using Increase in Sum of Squares (Ward's Method), which produced the following tree:
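For readers unfamiliar with the criterion, here is a minimal sketch of agglomeration by increase in sum of squares. It works directly on coordinates rather than on a distance matrix, and uses the standard result that fusing clusters of sizes n_a and n_b with centroids c_a and c_b raises the error sum of squares by n_a * n_b / (n_a + n_b) * ||c_a - c_b||^2. The function name `ward_merges` and the four sample points are hypothetical; this is not the ClustanGraphics implementation.

```python
def ward_merges(points):
    """Agglomerate points by Ward's criterion.

    Returns a list of (cluster_a, cluster_b, increase_in_sum_of_squares).
    Original points have ids 0..n-1; each new cluster gets the next free id.
    """
    clusters = {i: (1, list(p)) for i, p in enumerate(points)}  # id -> (size, centroid)
    next_id = len(points)
    merges = []
    while len(clusters) > 1:
        best = None
        ids = sorted(clusters)
        for x in range(len(ids)):
            for y in range(x + 1, len(ids)):
                a, b = ids[x], ids[y]
                (na, ca), (nb, cb) = clusters[a], clusters[b]
                sq = sum((u - v) ** 2 for u, v in zip(ca, cb))
                inc = na * nb / (na + nb) * sq  # increase in sum of squares
                if best is None or inc < best[0]:
                    best = (inc, a, b)
        inc, a, b = best
        (na, ca), (nb, cb) = clusters.pop(a), clusters.pop(b)
        centroid = [(na * u + nb * v) / (na + nb) for u, v in zip(ca, cb)]
        clusters[next_id] = (na + nb, centroid)
        merges.append((a, b, inc))
        next_id += 1
    return merges

# Hypothetical data: two tight pairs, far apart.
points = [[0.0, 0.0], [0.0, 1.0], [5.0, 0.0], [5.0, 1.0]]
merges = ward_merges(points)
for a, b, inc in merges:
    print(f"fuse {a} and {b}: increase = {inc:.2f}")
```

The exhaustive pair search is quadratic at every step, which is fine for a handful of cases but not for large data sets.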
ClustanGraphics can display a shaded representation of the proximity matrix, with the rows re-ordered to correspond to the tree:
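The idea of re-ordering rows to follow the tree can be sketched in a few lines: permute rows and columns by the left-to-right order of the cases in the tree, then map each value to a glyph so that small distances print heavily. The names `reorder` and `shade`, the glyph ramp, and the sample matrix are hypothetical and only approximate what a graphical shaded display does.

```python
def reorder(D, order):
    """Permute rows and columns of a square matrix into the given case order."""
    return [[D[i][j] for j in order] for i in order]

def shade(D, glyphs=" .:*#"):
    """Render each cell as one character; smaller distances get heavier glyphs."""
    top = max(max(row) for row in D) or 1.0  # avoid dividing by zero
    last = len(glyphs) - 1
    return ["".join(glyphs[last - round(last * v / top)] for v in row) for row in D]

# Hypothetical distances between four cases lying on a line.
pos = [0.0, 1.0, 5.0, 4.0]
D = [[abs(a - b) for b in pos] for a in pos]

R = reorder(D, [0, 1, 3, 2])  # case order read off a tree, left to right
lines = shade(R)
print("\n".join(lines))
```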
However, this proximity matrix and the corresponding tree may not be optimally ordered. There are many ways to display a tree, because the order of the cases within any of the clusters can be reversed. Thus a tree of n cases, which is formed by n-1 binary fusions, can be drawn in 2^(n-1) different orders. ClustanGraphics can examine these different orderings and find a new order which concentrates the strongest proximities close to the diagonal. The objective is to make the rank order of the proximities in every row correspond as closely as possible to the "perfect" rank order, as follows:
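To make the counting argument concrete, the sketch below enumerates every order obtainable by reversing clusters in a small tree and picks the one with the lowest cost. It is a brute-force illustration only: the tree is written as nested tuples, the names `orderings` and `adjacent_cost` are hypothetical, and the cost (the sum of distances between neighbouring cases) is a simple stand-in criterion, not the rank-order objective or the serial re-ordering procedure used by ClustanGraphics.

```python
from itertools import product

def orderings(tree):
    """Yield every leaf order reachable by reversing clusters of a binary tree."""
    if not isinstance(tree, tuple):
        yield [tree]
        return
    lefts = list(orderings(tree[0]))
    rights = list(orderings(tree[1]))
    for left, right in product(lefts, rights):
        yield left + right  # cluster as drawn
        yield right + left  # cluster reversed

def adjacent_cost(order, D):
    """Stand-in criterion: total distance between neighbouring cases."""
    return sum(D[a][b] for a, b in zip(order, order[1:]))

# Hypothetical example: four cases on a line, tree fuses (0,1) and (2,3).
pos = [0.0, 1.0, 5.0, 4.0]
D = [[abs(a - b) for b in pos] for a in pos]
tree = ((0, 1), (2, 3))

candidates = list(orderings(tree))
best = min(candidates, key=lambda order: adjacent_cost(order, D))

print(len(candidates), "orderings")  # 2 ** (4 - 1) for four cases
print("initial cost:", adjacent_cost([0, 1, 2, 3], D))
print("best order:", best, "cost:", adjacent_cost(best, D))
```

Because the number of orders doubles with every additional case, exhaustive enumeration is practical only for very small trees; larger problems need an iterative or dynamic-programming search.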
The result after 4 iterations of the serial re-ordering procedure in ClustanGraphics is shown above. Note that the heaviest shading is now much more concentrated close to the diagonal.
The tree from Ward's Method which corresponds to this optimal order has now been re-arranged as follows:
The re-ordered tree can now be more easily interpreted than the original tree.
The algorithm used to optimally re-order a hierarchical cluster analysis was first presented at the Second World Conference of the IASC, Pasadena 1997, and was published in:
Wishart, D. (1997), ClustanGraphics: Interactive Graphics for Cluster Analysis, Computing Science and Statistics, 29, 48-51.
A revised version was later presented at GfKl '98, and published in:
Wishart, D. (1999), ClustanGraphics3: Interactive Graphics for Cluster Analysis, in Classification in the Information Age, Gaul, W. and Locarek-Junge, H. (Eds), Springer 1999, 268-275.
The paper can be downloaded here as a zip file, by kind permission of Springer-Verlag (70 Kb).