In the example above, which applies the upper tail rule to the Mammals Case Study, Best Cut indicates that the 2-cluster and 3-cluster partitions are significant departures from the distribution of fusion values. When you click OK, the tree is shaded for the largest of these numbers of clusters, in this case the 3-cluster section.
Upper Tail Rule
The upper tail rule treats the fusion values as a series and computes their mean and standard deviation. Each fusion value is then expressed as a standard deviate on this distribution (assumed normal), that is, as a t-statistic measuring its standardised deviation from the mean. The first fusion value whose t-value exceeds the 5% level is selected as "significant". The null hypothesis is therefore that the kth fusion value comes from the normal distribution of fusion values.
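The calculation described for the upper tail rule can be sketched in a few lines of Python. This is an illustrative sketch only, not the ClustanGraphics implementation: the function names are hypothetical, the sample standard deviation is assumed, and the "5% level" is assumed here to mean the one-sided 5% point of the standard normal distribution (about 1.645).

```python
from statistics import NormalDist, mean, stdev


def upper_tail_deviates(fusion_values):
    """Standardised deviate of each fusion value from the series mean."""
    m = mean(fusion_values)
    s = stdev(fusion_values)  # sample standard deviation (an assumption)
    if s == 0:
        raise ValueError("fusion values are all equal; deviates are undefined")
    return [(v - m) / s for v in fusion_values]


def first_significant_fusion(fusion_values, level=0.05):
    """Return the 1-based index of the first fusion whose deviate exceeds
    the one-sided critical value, or None if no fusion does.

    Assumption: the critical value is taken from the standard normal
    distribution; the text does not state the exact reference distribution.
    """
    critical = NormalDist().inv_cdf(1.0 - level)
    for k, z in enumerate(upper_tail_deviates(fusion_values), start=1):
        if z > critical:
            return k
    return None


# Made-up fusion values in the order the fusions occurred.
example = [1.0, 1.2, 1.5, 1.7, 2.0, 2.1, 2.4, 9.0]
print(first_significant_fusion(example))
```

In this made-up series only the last fusion value lies far enough above the mean to be flagged. How a flagged fusion maps to a number of clusters depends on how the fusion sequence is indexed, which is left to the caller here.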
Moving Average Quality Control Rule
For each fusion k, the moving average quality control rule fits a linear trend to the first k-1 fusion values and then computes an expected value for the kth fusion value from the trend.
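The trend-fitting step of the moving average quality control rule can be sketched as follows. This is a minimal sketch under stated assumptions, not the ClustanGraphics implementation: the function names are hypothetical, the trend is assumed to be an ordinary least-squares line fitted against the fusion index 1, 2, ..., k-1, and only the expected value and the departure from it are computed, since the criterion for judging that departure is not given in this section.

```python
def expected_from_trend(fusion_values, k):
    """Expected value of the kth fusion (1-based) from a least-squares line
    fitted to the first k-1 fusion values.

    Assumption: the fusion index 1..k-1 is the explanatory variable.
    """
    if k < 3 or k > len(fusion_values):
        raise ValueError("k must satisfy 3 <= k <= number of fusion values")
    xs = list(range(1, k))        # fusion indices 1 .. k-1
    ys = fusion_values[:k - 1]    # the first k-1 fusion values
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept + slope * k  # extrapolate the trend to fusion k


def departure_from_trend(fusion_values, k):
    """Observed kth fusion value minus the value expected from the trend."""
    return fusion_values[k - 1] - expected_from_trend(fusion_values, k)


# Made-up fusion values: a steady trend followed by a jump at the 4th fusion.
example = [1.0, 2.0, 3.0, 10.0]
print(expected_from_trend(example, 4), departure_from_trend(example, 4))
```

At least two earlier fusion values are needed to fit a line, so the sketch starts at k = 3. A large positive departure indicates a fusion value well above what the earlier fusions would lead one to expect.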
The original paper containing these rules was published by Prof. Dick Mojena in the Computer Journal in 1977. Further tests were carried out by Mojena and Wishart and presented at COMPSTAT 1980. Their main conclusion was that the upper tail rule and the moving average quality control rule performed creditably in tests, whereas a third rule, double exponential smoothing, performed less well.
Although the paper presented at COMPSTAT 1980 reported tests using Ward's Method, in ClustanGraphics the tests are available for any hierarchical clustering fusion sequence.
Mojena, R (1977) Hierarchical grouping methods and stopping rules: an evaluation, Computer Journal, v. 20, 353-363.
Mojena, R, and Wishart, D (1980) Stopping rules for Ward's clustering method, COMPSTAT 1980 Proceedings, Physica-Verlag, 426-432.
A discussion also appears in: Wishart, D (1987), Clustan User Manual under Rules, pp 156-159.