Textual and extra-textual information

The most basic type of additional information is that which tells us what text or texts we are looking at. A computer file name may give us a clue to what the file contains, but in many cases filenames can only provide us with a tiny amount of information.

Information about the nature of the text can often consist of much more than a title and an author. A document header can record many such information fields about a text.

Together, these information fields give the document a header that retrieval programs can use to search and sort on particular variables. For example, we might only be interested in looking at texts in a corpus that were written by women, so we could ask a computer program to retrieve texts where the author's gender variable is equal to "FEMALE".
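
As a rough illustration of retrieval on a header variable, the following Python sketch filters a tiny in-memory corpus. The field names ("title", "author_gender"), the sample texts, and the helper functions retrieve and sort_by are all hypothetical, chosen only for this example; real corpora define their own header schemes and retrieval tools.

```python
from dataclasses import dataclass


@dataclass
class CorpusText:
    header: dict[str, str]  # header fields, e.g. title, author gender
    body: str               # the text itself


def retrieve(texts, variable, value):
    """Return the texts whose header variable equals the given value."""
    return [t for t in texts if t.header.get(variable) == value]


def sort_by(texts, variable):
    """Return the texts ordered by one header variable (missing values first)."""
    return sorted(texts, key=lambda t: t.header.get(variable, ""))


# Hypothetical sample corpus; the field names are assumptions for this sketch.
corpus = [
    CorpusText({"title": "Sample C", "author_gender": "FEMALE"}, "third text"),
    CorpusText({"title": "Sample B", "author_gender": "MALE"}, "second text"),
    CorpusText({"title": "Sample A", "author_gender": "FEMALE"}, "first text"),
]

# Retrieve only texts written by women, then sort them by title.
women = sort_by(retrieve(corpus, "author_gender", "FEMALE"), "title")
titles = [t.header["title"] for t in women]
print(titles)
```

Because the selection works on the header alone, the body of each text never has to be read in order to decide whether the text belongs in the result.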