Wilks' lambda
Wilks' lambda is a statistic used in particular by Discriminant Factor Analysis as a measure of the separation of the class centers. When the classes are multinormal with identical means and identical covariance matrices, the distribution of Wilks' lambda is known, and it can therefore be used for testing the equality of the population means.
Wilks' lambda thus plays the same role in the multivariate domain as Fisher's F does in (univariate) ANOVA.
Definition of Wilks' lambda
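Wilks' lambda is defined as the proportion of the total inertia that is not explained by the response variable (the variable that identifies the classes) in the classical scheme of variance decomposition. It is therefore the ratio of:
* The intra-class inertia, and
* The total inertia,
where, in the multivariate case, each inertia is measured by the determinant of the corresponding scatter matrix (within-class and total, respectively).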
Note the difference with ANOVA's F statistic, which is the ratio of the Explained Sum of Squares to the Residual (unexplained) Sum of Squares, each divided by its degrees of freedom.
Wilks' lambda is therefore a number between 0 and 1.
If only a small fraction of the total inertia is left unexplained by the existence of groups, then these groups are well separated, and their means are significantly different. Hence:
* A small (close to 0) value of Wilks' lambda means that the groups are well separated.
* A large (close to 1) value of Wilks' lambda means that the groups are poorly separated (lower image of the above illustration).
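The definition and its interpretation can be sketched numerically. The following is a minimal sketch, assuming the usual multivariate form in which Wilks' lambda is det(W) / det(T), with W the within-class scatter matrix and T the total scatter matrix. The function name `wilks_lambda` and the simulated data are illustrative assumptions, not part of the original text.

```python
import numpy as np

def wilks_lambda(X, y):
    """Wilks' lambda = det(W) / det(T) for data X (n x p) and class labels y."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    centered = X - X.mean(axis=0)
    T = centered.T @ centered              # total scatter matrix
    W = np.zeros_like(T)                   # within-class scatter matrix
    for label in np.unique(y):
        Xg = X[y == label]
        dg = Xg - Xg.mean(axis=0)          # deviations from the class center
        W += dg.T @ dg
    return np.linalg.det(W) / np.linalg.det(T)

# Illustrative data: three classes of 50 points in two dimensions.
rng = np.random.default_rng(0)
y = np.repeat([0, 1, 2], 50)
noise = rng.normal(size=(150, 2))
far_centers = np.array([[0.0, 0.0], [6.0, 0.0], [0.0, 6.0]])[y]

lam_separated = wilks_lambda(noise + far_centers, y)   # distinct class centers
lam_overlapping = wilks_lambda(noise, y)               # identical class centers
print(lam_separated, lam_overlapping)
```

With well-separated centers the value should fall close to 0, while with identical centers it should stay close to 1.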
Under the following assumptions:
* The variables are jointly normally (multinormally) distributed within each class,
* Classes have identical covariance matrices,
* Classes have identical means,
the distribution of Wilks' lambda is known, but it is very complex. Fortunately, various mathematical transformations turn Wilks' lambda into other statistics that are (at least approximately) either:
* Chi-square distributed, or
* F distributed.
It is then possible to test the null hypothesis according to which the class means are identical, provided that the other conditions are satisfied.
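As an illustration of such a transformation, here is a minimal sketch of Bartlett's chi-square approximation, one commonly used choice (the text above does not say which transformation a given software package applies). It assumes n observations, p variables and g classes; the function name `bartlett_chi2_test` and the input values are hypothetical.

```python
import math
from scipy.stats import chi2

def bartlett_chi2_test(wilks_lambda, n, p, g):
    """Bartlett's chi-square approximation for Wilks' lambda.

    n: total number of observations, p: number of variables, g: number of classes.
    Returns the test statistic, its degrees of freedom and the p-value.
    """
    statistic = -(n - 1 - (p + g) / 2) * math.log(wilks_lambda)
    df = p * (g - 1)
    p_value = chi2.sf(statistic, df)       # upper-tail probability
    return statistic, df, p_value

# Hypothetical example: lambda = 0.90 observed on 150 points, 2 variables, 3 classes.
stat, df, p_value = bartlett_chi2_test(0.90, n=150, p=2, g=3)
print(stat, df, p_value)
```

A small p-value leads to rejecting the hypothesis of identical class means, under the assumptions listed above.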
-----
Software often displays only the p-value of the test statistic rather than the value of Wilks' lambda.
-----
Software sometimes displays the value of Wilks' lambda for each individual independent variable. These values may then be regarded as measuring the discriminant power of the corresponding variable.
Wilks' lambda may also be used for variable selection in Discriminant Analysis. It is possible to build a statistic that is approximately F distributed, and which is a function of the Wilks' lambdas pertaining to:
* A given subset of variables,
* And that same subset to which a new variable has been added.
An F test is then used for identifying which new variable will most increase the group separation. This variable is then added to the model.
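One selection step can be sketched as follows. This is a minimal sketch, assuming the classical "F to enter" statistic built from the ratio of the two Wilks' lambdas, with g - 1 and n - g - p degrees of freedom, where p is the number of variables already in the model. The helper names `wilks_lambda` and `f_to_enter` and the simulated data are illustrative assumptions.

```python
import numpy as np
from scipy.stats import f as f_dist

def wilks_lambda(X, y):
    """Wilks' lambda = det(W) / det(T) for data X (n x p) and class labels y."""
    centered = X - X.mean(axis=0)
    T = centered.T @ centered
    W = np.zeros_like(T)
    for label in np.unique(y):
        dg = X[y == label] - X[y == label].mean(axis=0)
        W += dg.T @ dg
    return np.linalg.det(W) / np.linalg.det(T)

def f_to_enter(X, y, selected, candidate):
    """F statistic and p-value for adding `candidate` to the `selected` columns."""
    n, g, p = X.shape[0], len(np.unique(y)), len(selected)
    lam_new = wilks_lambda(X[:, selected + [candidate]], y)
    # With no variable selected yet, the reference lambda is 1 by convention.
    lam_old = wilks_lambda(X[:, selected], y) if selected else 1.0
    partial = lam_new / lam_old            # partial Wilks' lambda
    df1, df2 = g - 1, n - g - p
    F = (1 - partial) / partial * df2 / df1
    return F, f_dist.sf(F, df1, df2)

# Illustrative data: column 0 and column 2 carry class information, column 1 is noise.
rng = np.random.default_rng(1)
y = np.repeat([0, 1, 2], 40)
X = rng.normal(size=(120, 3))
X[:, 0] += 2.0 * y
X[:, 2] += 3.0 * (y == 1)

selected = [0]
candidates = [1, 2]
results = {j: f_to_enter(X, y, selected, j) for j in candidates}
best = max(candidates, key=lambda j: results[j][0])   # largest F enters the model
print(best, results[best])
```

Repeating this step until no candidate reaches a chosen significance threshold gives a simple forward selection procedure.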
The conditions of application of Wilks' test are rather stringent.
* Classes must be multinormal.
* Their covariance matrices must be identical. This condition is usually checked with Box's M test.