This is a worked example of Gower's Similarity Coefficient, taken from Cluster Analysis,
Third Edition, by Brian S. Everitt (Arnold, London), pp. 45-46.
Everitt illustrates the coefficient using the following data for five psychiatrically ill patients:
Case     | Weight | Anxiety | Depression | Hallucination | Age
Patient1 | 120    | 1       | 1          | 1             | 1
Patient2 | 150    | 2       | 2          | 1             | 2
Patient3 | 110    | 3       | 2          | 2             | 3
Patient4 | 145    | 1       | 1          | 2             | 3
Patient5 | 120    | 1       | 1          | 2             | 1
The above data can easily be read by ClustanGraphics. Select the values in the table and copy them to an Excel file, then click File/New/Data in ClustanGraphics and choose Excel Spreadsheet as the file format. This reads the file together with the headings and case labels.

Next, select Edit/Data Types and change the type specifications of Anxiety and Age to nominal. This is arguably an incorrect definition, since these two variables appear to be ordinal; however, we specify nominal to be consistent with the type definitions in Everitt's example. Click OK to accept the changed data type specifications. It is not necessary to transform the Weight variable, because scaling by range is a standard part of Gower's coefficient.

Now select Prox/Compute, noting that ClustanGraphics has recognized that the variables comprise mixed data types. Select Gower's Coefficient from the list of similarity and dissimilarity coefficients available for mixed data types. When you press OK the proximity matrix is computed. You may also wish to cluster the data hierarchically at this stage.

To check the values of Gower's coefficient, click View/Prox. Unfortunately, there are two errors in the similarity matrix shown on page 46 of Everitt's book: coefficients s_{25} and s_{45} are wrongly reported. You can easily check by hand that ClustanGraphics has computed the correct Gower similarity coefficients.
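
The hand check can also be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the ClustanGraphics implementation: Weight is scored as 1 - |difference| / range, Anxiety and Age are scored as nominal (1 for a match, 0 otherwise), and Depression and Hallucination are assumed to be presence/absence variables in which code 2 means "present" and a joint absence is given zero weight. The text above does not state the coding or the treatment of the two binary variables, so the results depend on those assumptions and need not agree with ClustanGraphics or with Everitt. The names `PATIENTS`, `KINDS`, `PRESENT` and `gower_similarity` are hypothetical.

```python
# Data from the table above: (Weight, Anxiety, Depression, Hallucination, Age)
PATIENTS = {
    "Patient1": (120, 1, 1, 1, 1),
    "Patient2": (150, 2, 2, 1, 2),
    "Patient3": (110, 3, 2, 2, 3),
    "Patient4": (145, 1, 1, 2, 3),
    "Patient5": (120, 1, 1, 2, 1),
}

# Anxiety and Age are nominal, as specified in the walkthrough.
# ASSUMPTION: Depression and Hallucination are presence/absence variables.
KINDS = ("numeric", "nominal", "binary", "binary", "nominal")

# ASSUMPTION: code 2 means "present" and code 1 means "absent".
PRESENT = 2


def variable_ranges(rows, kinds):
    """Return max - min for each numeric variable, None for the others."""
    columns = list(zip(*rows))
    return [
        (max(col) - min(col)) if kind == "numeric" else None
        for col, kind in zip(columns, kinds)
    ]


def gower_similarity(a, b, kinds, ranges, present=PRESENT):
    """Weighted mean of per-variable scores between cases a and b."""
    total_score = 0.0
    total_weight = 0.0
    for x, y, kind, rng in zip(a, b, kinds, ranges):
        if kind == "numeric":
            # Scaling by range; a constant variable counts as a full match.
            score = 1.0 - abs(x - y) / rng if rng else 1.0
            weight = 1.0
        elif kind == "nominal":
            score, weight = float(x == y), 1.0
        else:
            # Binary: a joint absence is not counted (assumed treatment).
            if x != present and y != present:
                score, weight = 0.0, 0.0
            else:
                score, weight = float(x == y), 1.0
        total_score += weight * score
        total_weight += weight
    return total_score / total_weight if total_weight else 0.0


RANGES = variable_ranges(list(PATIENTS.values()), KINDS)

if __name__ == "__main__":
    names = list(PATIENTS)
    # Print the lower triangle of the similarity matrix, diagonal included.
    for i, row_name in enumerate(names):
        cells = [
            f"{gower_similarity(PATIENTS[row_name], PATIENTS[col_name], KINDS, RANGES):.4f}"
            for col_name in names[: i + 1]
        ]
        print(row_name, " ".join(cells))
```

To try the other common convention, in which the two binary variables are scored as simple matches, replace both "binary" entries in `KINDS` with "nominal" and compare the output with View/Prox.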