*correlation* in data sets, and introduced the method of *hypothesis testing* for identifying whether features observed in samples in fact arise by chance.

For paired series of numerical data we can use the *correlation coefficient*, and for qualitative data the *χ²* statistic. The lecture included examples of these applied to last year’s Inf1-DA exam results, bigram frequencies in the British National Corpus, and possible gender bias in student admissions to Berkeley in 1973.
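As a rough illustration of the two statistics named above, here is a minimal Python sketch. It assumes the standard Pearson form of the correlation coefficient and the usual contingency-table form of χ², which may differ in detail from the lecture's presentation. The function names `pearson_r` and `chi_squared` and all the numbers are hypothetical toy values, not the exam, corpus, or admissions data from the lecture.

```python
import math


def pearson_r(xs, ys):
    """Correlation coefficient of two paired numeric series (assumed Pearson form)."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two series of equal length, at least 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of products of deviations from the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / math.sqrt(var_x * var_y)


def chi_squared(table):
    """Chi-squared statistic for a contingency table of observed counts.

    Expected counts are those implied by independence of rows and columns:
    (row total * column total) / grand total.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / total
            stat += (observed - expected) ** 2 / expected
    return stat


if __name__ == "__main__":
    # Hypothetical paired marks for five students (toy data).
    coursework = [52, 61, 70, 45, 88]
    exam = [48, 65, 72, 50, 80]
    print("r =", pearson_r(coursework, exam))

    # Hypothetical 2x2 table: rows are two groups, columns are two outcomes.
    observed = [[10, 20], [20, 10]]
    print("chi^2 =", chi_squared(observed))
```

The sketch only computes the statistics. Deciding whether a value is significant still requires comparing it against a critical value for the appropriate degrees of freedom, which is the hypothesis-testing step described above.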

