Further Reading

Swinscow (1983) deals with the underlying assumptions of statistics and all the basic standard tests. It is an easy-to-understand text, especially for such a complex and jargon-laden field. However, the book is aimed at medical statistics, and it takes some imagination to make the examples correspond to linguistic problems.

Other books that are more linguistics-oriented (although not necessarily corpus-oriented) are Kenny (1982) and Woods, Fletcher and Hughes (1986).

On specific issues, Church et al. (1991) discuss mutual information in detail, while loglinear modelling is tackled by de Haan and van Hout (1986). Alt (1990) provides a relatively accessible introduction to factor analysis and cluster analysis.