Glossary


collocations: Collocations are characteristic co-occurrence patterns of words. For example, "Christmas" may collocate with "tree", "angel", and "presents".
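As a rough illustration of the idea, the sketch below counts which words appear within a fixed window around a chosen node word. The function name `collocates`, the `window` parameter, and the toy sentence are all hypothetical choices made for this example; they are not taken from any corpus or tool.

```python
from collections import Counter

def collocates(tokens, node, window=2):
    """Count words occurring within `window` tokens either side of `node`."""
    counts = Counter()
    for i, tok in enumerate(tokens):
        if tok != node:
            continue
        lo = max(0, i - window)
        hi = min(len(tokens), i + window + 1)
        for j in range(lo, hi):
            if j != i:  # skip the node word itself
                counts[tokens[j]] += 1
    return counts

# Hypothetical toy text, invented for illustration only.
text = "the christmas tree stood by the christmas presents and the christmas tree lights"
tokens = text.split()
top = collocates(tokens, "christmas", window=1)
print(top.most_common())
```

Raw counts like these tend to favour very frequent words such as "the", so in practice co-occurrence counts are usually weighed against how often each word occurs overall before a pairing is called a collocation.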

cross-tabulation: Put simply, this is just a table showing the frequencies for each variable across each sample. For example, the following table gives a cross-tabulation of modal verbs across 4 genres of text (labelled A, B, C, and D).

Modal Verb   Genre
             A      B      C      D
can          210    148    59     89
could        120    49     36     23
may          100    86     15     46
might        24     29     13     4
must         43     34     12     28
ought        3      4      0      1
shall        12     4      0      10
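A cross-tabulation of this kind can be built by counting how often each variable occurs in each sample. The following is a minimal sketch, assuming the data arrive as one (word, genre) pair per occurrence; the `observations` list and the `cross_tabulate` helper are hypothetical and do not reproduce the figures in the table above.

```python
from collections import Counter

# Hypothetical observations: one (modal verb, genre) pair per occurrence.
observations = [
    ("can", "A"), ("can", "A"), ("can", "B"),
    ("may", "A"), ("may", "C"),
    ("shall", "D"),
]

def cross_tabulate(pairs):
    """Return ({word: {genre: frequency}}, sorted list of genres)."""
    counts = Counter(pairs)
    rows = sorted({word for word, _ in pairs})
    cols = sorted({genre for _, genre in pairs})
    # Counter returns 0 for combinations that were never observed.
    table = {w: {g: counts[(w, g)] for g in cols} for w in rows}
    return table, cols

table, genres = cross_tabulate(observations)

# Print the table in plain columns.
print("verb".ljust(8) + "".join(g.rjust(4) for g in genres))
for word, row in table.items():
    print(word.ljust(8) + "".join(str(row[g]).rjust(4) for g in genres))
```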

intercorrelation matrix: This is calculated from a cross-tabulation (see above) and shows how statistically similar all pairs of variables are in their distributions across the various samples. The table below shows the intercorrelations between can, could, may, might, must, ought and shall taken from the table above.

Pearson product-moment correlation coefficient

Word     can      could    may      might    must     ought    shall
can      1        0.544    0.798    0.765    0.796    0.717    0.118
could    0.544    1        0.186    0.782    0.807    0.528    0.026
may      0.798    0.186    1        0.521    0.637    0.554    0.601
might    0.765    0.782    0.521    1        0.795    0.587    0.032
must     0.796    0.807    0.637    0.795    1        0.816    0.306
ought    0.717    0.528    0.554    0.587    0.816    1        0.078
shall    0.118    0.026    0.601    0.032    0.306    0.078    1

The closer the score is to 1, the stronger the correlation between the two variables. The correlation of can with can is 1, as the two distributions are identical. Some variables show a greater similarity in their distributions than others: for instance, can shows a greater similarity to may (0.798) than it does to shall (0.118).
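To make the calculation concrete, here is a minimal sketch of how a matrix of Pearson product-moment coefficients might be computed from rows of frequencies. The names `pearson` and `intercorrelation_matrix` and the frequencies in `freqs` are hypothetical, invented for illustration; they are not the data behind the matrix above.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson product-moment correlation of two equal-length sequences.

    Assumes neither sequence is constant (otherwise the denominator is zero).
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

def intercorrelation_matrix(table):
    """Correlate every row of `table` with every other row."""
    names = list(table)
    return {a: {b: pearson(table[a], table[b]) for b in names} for a in names}

# Hypothetical frequencies of three words across four samples.
freqs = {
    "w1": [10, 20, 30, 40],
    "w2": [12, 18, 33, 37],
    "w3": [40, 30, 20, 10],
}
matrix = intercorrelation_matrix(freqs)
for name, row in matrix.items():
    print(name, [round(r, 3) for r in row.values()])
```

The coefficient ranges from -1 to 1, so a pair such as w1 and w3 here, whose frequencies move in exactly opposite directions, scores -1 rather than 0.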

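non-parametric test: All statistical tests of significance belong to one of two distinct groups - parametric and non-parametric. Parametric tests assume that the data follow a particular distribution, typically the normal distribution, whereas non-parametric tests make no such assumption.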

normal distribution: A variable follows a normal distribution if it is continuous and if its frequency graph follows the characteristic symmetrical, bell-shaped form in which the mean, median and mode coincide (see graph on the left).
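The point about the mean, median and mode coinciding can be checked on a small symmetric frequency table. The sketch below is only an approximation, since a true normal distribution is continuous; both frequency tables are hypothetical and chosen purely for illustration.

```python
from statistics import mean, median, mode

def expand(frequencies):
    """Turn a {value: frequency} table into a flat list of observations."""
    return [value for value, freq in frequencies.items() for _ in range(freq)]

# Hypothetical symmetric, bell-shaped table: value -> frequency.
symmetric = expand({1: 1, 2: 4, 3: 6, 4: 4, 5: 1})

# Hypothetical skewed table for contrast.
skewed = expand({1: 6, 2: 4, 3: 3, 4: 2, 5: 1})

for label, sample in (("symmetric", symmetric), ("skewed", skewed)):
    print(label, mean(sample), median(sample), mode(sample))
```

In the symmetric sample all three measures fall on the same value, while in the skewed sample they drift apart, which is one informal sign that a variable is not normally distributed.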

Type I and Type II errors: Although we can be confident that the results of a significance test are accurate, there is always a small chance that the decision made might be wrong. There are two ways that this can occur: