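Gower's General Similarity Coefficient s_{ij} compares two cases i and j, and is defined as follows:

$$ s_{ij} = \frac{\sum_k w_{ijk} \, s_{ijk}}{\sum_k w_{ijk}} $$

where:
s_{ijk} denotes the contribution to the similarity provided by the kth variable, and
w_{ijk} is the weight given to the comparison on the kth variable.

It should be noted that the effect of the denominator \sum_k w_{ijk} is to divide the sum of the similarity scores by the number of variables compared or, if variable weights have been specified, by the sum of their weights.

Gower defines the value of s_{ijk} for ordinal and continuous variables as follows:

$$ s_{ijk} = 1 - \frac{|x_{ik} - x_{jk}|}{r_k} $$

where: r_k is the range of values for the kth variable. For continuous variables s_{ijk} therefore ranges from 1, for identical values, to 0, for the two extreme values of the variable.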
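As an illustration, the range-scaled component for continuous or ordinal variables and the weighted average that forms the coefficient can be sketched in Python. This is a minimal sketch, not the ClustanGraphics implementation: the function names, the two variables, and their ranges are hypothetical and chosen only to make the arithmetic easy to follow.

```python
def range_similarity(x_i, x_j, r_k):
    """Component s_ijk = 1 - |x_ik - x_jk| / r_k for one continuous or ordinal variable."""
    if r_k <= 0:
        raise ValueError("the range r_k must be positive")
    return 1.0 - abs(x_i - x_j) / r_k


def gower_similarity(scores, weights):
    """Coefficient s_ij = sum_k(w_ijk * s_ijk) / sum_k(w_ijk)."""
    total_weight = sum(weights)
    if total_weight == 0:
        raise ValueError("no valid comparisons between the two cases")
    return sum(w * s for w, s in zip(weights, scores)) / total_weight


# Hypothetical data: two cases measured on two continuous variables
# whose ranges over the whole data set are assumed to be 60 and 100.
ranges = [60.0, 100.0]
case_i = [30.0, 40.0]
case_j = [45.0, 90.0]

scores = [range_similarity(a, b, r) for a, b, r in zip(case_i, case_j, ranges)]
s_ij = gower_similarity(scores, [1, 1])  # both comparisons are valid

print(scores)  # [0.75, 0.5]
print(s_ij)    # 0.625
```

In practice the range r_k would be computed from the data set being clustered rather than supplied by hand.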
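For a binary variable (or dichotomous character), Gower defines the component of similarity and the weight according to the table below, where + denotes that attribute k is "present" and - denotes that attribute k is "absent".

Case i     +   +   -   -
Case j     +   -   +   -
s_{ijk}    1   0   0   0
w_{ijk}    1   1   1   0

Thus s_{ijk} = 1 only when attribute k is present in both cases, and w_{ijk} = 0 when it is absent from both. If all your variables are binary, then Gower's General Similarity Coefficient is equivalent to Jaccard's Similarity Coefficient A/(A+B+C), where A counts the attributes present in both cases and B and C count those present in only one, since the negative matches scored in cell D are ignored.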
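The equivalence with Jaccard's coefficient for all-binary data can be checked with a short Python sketch. It assumes Gower's standard scoring for dichotomous characters (a score of 1 for a joint presence, a weight of 0 for a joint absence); the function names and the two example cases are hypothetical.

```python
def binary_component(present_i, present_j):
    """Return (s_ijk, w_ijk) for one binary attribute."""
    if present_i and present_j:
        return 1, 1  # positive match (cell A)
    if present_i or present_j:
        return 0, 1  # mismatch (cells B and C)
    return 0, 0      # negative match (cell D): excluded from the comparison


def gower_binary(case_i, case_j):
    """Gower's coefficient when every variable is binary."""
    pairs = [binary_component(a, b) for a, b in zip(case_i, case_j)]
    total_weight = sum(w for _, w in pairs)
    if total_weight == 0:
        raise ValueError("every attribute is absent in both cases")
    return sum(s * w for s, w in pairs) / total_weight


def jaccard(case_i, case_j):
    """Jaccard's coefficient A / (A + B + C)."""
    a = sum(1 for x, y in zip(case_i, case_j) if x and y)
    b_plus_c = sum(1 for x, y in zip(case_i, case_j) if bool(x) != bool(y))
    if a + b_plus_c == 0:
        raise ValueError("every attribute is absent in both cases")
    return a / (a + b_plus_c)


# Hypothetical cases scored on six attributes (1 = present, 0 = absent).
case_i = [1, 1, 0, 0, 1, 0]
case_j = [1, 0, 1, 0, 1, 0]

print(gower_binary(case_i, case_j))  # 0.5
print(jaccard(case_i, case_j))       # 0.5
```

Here A = 2, B + C = 2, and the two joint absences (cell D) contribute to neither the numerator nor the denominator.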
The value of s_{ijk} for nominal variables is 1 if x_{ik} = x_{jk}, or 0 if x_{ik} ≠ x_{jk}. Thus s_{ijk} = 1 if cases i and j have the same "state" for attribute k, or 0 if they have different "states", and w_{ijk} = 1 if both cases have observed states for attribute k.
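The nominal rule can be sketched in Python as follows. This is an illustrative sketch only: representing an unobserved state by `None`, the function names, and the example attributes are all assumptions made for the example.

```python
def nominal_component(state_i, state_j):
    """Return (s_ijk, w_ijk) for one nominal attribute.

    A state of None stands for "not observed" (an assumption of this sketch).
    """
    if state_i is None or state_j is None:
        return 0, 0  # comparison is not valid, so it carries no weight
    return (1 if state_i == state_j else 0), 1


def nominal_similarity(case_i, case_j):
    """Average the valid nominal comparisons between two cases."""
    pairs = [nominal_component(a, b) for a, b in zip(case_i, case_j)]
    total_weight = sum(w for _, w in pairs)
    if total_weight == 0:
        raise ValueError("no attribute is observed in both cases")
    return sum(s * w for s, w in pairs) / total_weight


# Hypothetical cases described by three nominal attributes.
case_i = ["red", "married", None]
case_j = ["red", "single", "urban"]

print(nominal_similarity(case_i, case_j))  # 0.5
```

The third attribute is unobserved for case i, so only two comparisons are valid, and one of them is a match.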
The weight w_{ijk} for the comparison on the kth variable is usually 1 or 0. However, if you assign differential weights to your variables in ClustanGraphics, then w_{ijk} is either the weight of the kth variable or 0, depending upon whether the comparison is valid or not. This allows larger weights to be given to important variables, or for another type of external scaling of the variables to be specified.

If the weight of any variable is zero, then the variable is effectively ignored for the calculation of proximities. Such variables are "masked" for clustering, but available for cluster profiling, to assist in the interpretation of a resulting cluster analysis.
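The effect of differential weights and of masking can be illustrated with a small Python sketch. It is not ClustanGraphics code: the function name, the component scores, and the weights are hypothetical, and the per-variable scores s_ijk are assumed to have been computed already.

```python
def weighted_gower(scores, valid, variable_weights):
    """Gower's coefficient with a user-assigned weight for each variable.

    scores           -- component similarities s_ijk, one per variable
    valid            -- True where the comparison on that variable is valid
    variable_weights -- weight assigned to each variable (0 masks it)
    """
    # w_ijk is the variable's weight if the comparison is valid, otherwise 0.
    w = [vw if ok else 0.0 for vw, ok in zip(variable_weights, valid)]
    total_weight = sum(w)
    if total_weight == 0:
        raise ValueError("no weighted, valid comparisons between the two cases")
    return sum(wk * sk for wk, sk in zip(w, scores)) / total_weight


# Hypothetical component scores for three variables, all validly compared.
scores = [1.0, 0.5, 0.0]
valid = [True, True, True]

equal = weighted_gower(scores, valid, [1.0, 1.0, 1.0])
masked = weighted_gower(scores, valid, [2.0, 1.0, 0.0])

print(equal)   # 0.5
print(masked)  # 2.5 / 3, roughly 0.833
```

Doubling the weight of the first variable and masking the third raises the similarity, because the disagreement on the third variable no longer enters either the numerator or the denominator.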
However, the clustering options available using Gower are restricted to those applicable to

Our implementation of Gower's General Similarity Coefficient is another example of the great flexibility provided in Clustan software. Mixed data types frequently occur in social surveys and databases, but you are unlikely to find that other software for cluster analysis or neural networks adequately caters for such practical diversity. Gower's General Similarity Coefficient has been available in Clustan since 1984, and in ClustanGraphics since release 5 in 2001. A worked example of Gower's coefficient with psychiatric data is given here.