Lecture 19: chi² Testing on Categorical Data

Today’s lecture covered more on hypothesis testing, presenting the χ² test and working through three examples: student Inf1-DA exam results in 2011, bigram frequency in the British National Corpus, and possible gender bias in student admissions to Berkeley in 1973.

The χ² test is a tool for assessing potential correlations in categorical data, where it is not possible to apply the correlation coefficient measures used on quantitative data.

Use of χ² follows the standard pattern for statistical tests: we have a null hypothesis that there is no correlation between different possible categories of data, and the test gives the probability p that, if this were true, we would observe data at least as far from expectation as that actually seen. If p is very low, we reject the null hypothesis and take the statistical result as evidence suggesting a correlation.
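As a rough illustration of this pattern, here is a minimal sketch of the Pearson χ² test of independence on a 2×2 table of counts. It sums (observed − expected)² / expected over the four cells, where the expected counts are those implied by the null hypothesis. The function name `chi_squared_2x2` and the counts are hypothetical, chosen only for illustration; they are not taken from the lecture's examples. The p-value shortcut via `math.erfc` is valid only for one degree of freedom, which is the 2×2 case, and no continuity correction is applied.

```python
import math


def chi_squared_2x2(table):
    """Return (statistic, p_value) for a 2x2 table of observed counts.

    Illustrative helper, not a library function. Assumes every expected
    count is non-zero.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)

    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if the row and column categories are independent.
            expected = row_totals[i] * col_totals[j] / total
            statistic += (observed - expected) ** 2 / expected

    # For 1 degree of freedom, P(chi-squared >= x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(statistic / 2))
    return statistic, p_value


# Hypothetical counts: rows are two groups, columns are two outcomes.
observed = [[30, 10],
            [20, 40]]

statistic, p_value = chi_squared_2x2(observed)
print(f"chi-squared = {statistic:.3f}, p = {p_value:.6f}")

if p_value < 0.01:
    print("Reject the null hypothesis: evidence of a correlation.")
else:
    print("Insufficient evidence to reject the null hypothesis.")
```

For larger tables the degrees of freedom are (rows − 1) × (columns − 1), and the p-value would normally come from a statistics library or a table of critical values rather than this shortcut.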

Of course, as usual, correlation does not imply causation, but a correlation may lead us to investigate possible mechanisms of causation, which might in turn give rise to predictions that can be repeatedly tested.

This repetition is key to establishing concrete scientific results. Over the last year, concern about the Replication Crisis in the psychological sciences has risen following large collaborative attempts to replicate a range of published results in the psychology literature, many of which failed.

Link: Slides for Lecture 19

Homework

The final two lectures, this Friday and next Tuesday, will review the course and address exam preparation. Please fill out the online poll to indicate particular areas you would like covered.

I’m especially interested in specific past exam questions you would like me to review. Please look at past papers and suggest individual questions in the comments section of the poll.

Link: Doodle poll on review topics; Past Inf1-DA exam papers

References

Visualisation
  • Wikipedia on Anscombe’s Quartet.

  • F. J. Anscombe. Graphs in Statistical Analysis, The American Statistician, 27(1):17–21, February 1973.

    Short, readable article advocating the importance of graphing your data before making judgements on it. This is the source of the quartet.

Reproducibility

Berkeley Admissions
  • P. J. Bickel, E. A. Hammel, and J. W. O’Connell. Sex Bias in Graduate Admissions: Data from Berkeley. Science, 187(4175):398–404, 1975. DOI: 10.1126/science.187.4175.398.

    This closely analyses the admissions data to conclude that it does point at serious issues of discrimination, although not quite in the places first indicated.

    Link: Full text via EASE login

    As the authors put it: “The bias in the aggregated data stems not from any pattern of discrimination on the part of admissions committees, which seem quite fair on the whole, but apparently from prior screening at earlier levels of the educational system. Women are shunted by their socialization and education toward fields of graduate study that are generally more crowded, less productive of completed degrees, and less well funded, and that frequently offer poorer professional employment prospects.”

  • Wikipedia page on Simpson’s Paradox, of which the Berkeley admissions data are a well-known example.
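
To see how aggregated figures can point the opposite way from the per-department figures, here is a small sketch of Simpson's Paradox. The department names and all counts are invented for illustration and are not the real Berkeley data. In each hypothetical department women are admitted at a higher rate than men, yet the combined rate for women is lower, because most women apply to the department that admits few applicants.

```python
# Hypothetical (admitted, applied) counts per department and group.
# These are NOT the real Berkeley figures.
applications = {
    "A": {"men": (48, 80), "women": (14, 20)},   # department with a high admission rate
    "B": {"men": (4, 20), "women": (24, 80)},    # department with a low admission rate
}


def department_rate(department, group):
    """Admission rate for one group within one department."""
    admitted, applied = applications[department][group]
    return admitted / applied


def aggregate_rate(group):
    """Admission rate for one group with all departments pooled."""
    admitted = sum(dept[group][0] for dept in applications.values())
    applied = sum(dept[group][1] for dept in applications.values())
    return admitted / applied


for department in applications:
    men = department_rate(department, "men")
    women = department_rate(department, "women")
    print(f"Department {department}: men {men:.0%}, women {women:.0%}")

print(f"All departments: men {aggregate_rate('men'):.0%}, "
      f"women {aggregate_rate('women'):.0%}")
```

Running a χ² test on the pooled counts alone would suggest a bias against women, while testing each department separately would not, which is why the choice of categories matters as much as the test itself.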