Slides : Recording

This final lecture went through two more questions from last year’s Inf1-DA exams. The slides give a guide to solutions and in the recording there is also information on how to mark and grade your own answers.


I’ve posted notes for this week’s tutorial, on Statistical Analysis, to the tutorial web page.

A number of tutors have offered to run an additional exam preparation tutorial, using a mock paper from an earlier year. To participate, you work through that paper over the next couple of weeks and send your written script to a tutor; the tutor marks it and returns it for discussion and feedback in a tutorial during revision week at the end of April.

Tutors are still setting this up: I’ll announce details when that’s done and you will be able to sign up if you are interested.

*Link:* Tutorial Exercises

All Course Surveys

Please spend time giving written feedback on Inf1-DA in the online survey. This is organised centrally by the University, with all results sent to the individual course organiser and to the Director of Teaching. Submissions are anonymous. I read every comment individually, and for Informatics courses we post your advice online to help other students choosing courses for the future.

This is the final set of tutorial exercises. These involve analysing and looking for correlations in data collected from a survey among last year’s Inf1-DA students.

*Link:* Tutorial Exercises

I’ve posted some notes on solutions to the practice exam questions from this week’s tutorial exercises. These are based on the sheets you had at the tutorials, but without the detailed mark counts and with improvements following feedback from students and tutors. In particular I’ve tried to be more precise and informative about:

- The choice of tables to use when modelling the ER diagram for Q1(c);
- Different ways to write the XPath query for Q2(c)(i) on lines spoken by Rosencrantz.
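
As a rough illustration of how one query can be written in more than one way, here is a minimal sketch in Python. The XML layout (`play`, `speech`, `speaker`, `line`) and the placeholder line text are assumptions made up for this example; they are not the exam's actual document structure, and this is not the model answer for Q2(c)(i).

```python
import xml.etree.ElementTree as ET

# Hypothetical play markup, invented for illustration only.
PLAY = """
<play>
  <speech><speaker>Rosencrantz</speaker><line>A1</line><line>A2</line></speech>
  <speech><speaker>Guildenstern</speaker><line>B1</line></speech>
  <speech><speaker>Rosencrantz</speaker><line>A3</line></speech>
</play>
"""

root = ET.fromstring(PLAY)

# Way 1: a single path with a predicate on the speaker child element.
lines_a = [e.text for e in root.findall(".//speech[speaker='Rosencrantz']/line")]

# Way 2: select the speeches first, then collect their lines in a second step.
lines_b = [
    line.text
    for speech in root.iter("speech")
    if speech.findtext("speaker") == "Rosencrantz"
    for line in speech.findall("line")
]

print(lines_a)
print(lines_b)
```

Both approaches select the same elements here. A full XPath processor would also accept a form that starts from the lines, such as `//line[../speaker='Rosencrantz']`, but Python's `ElementTree` only supports a subset of XPath and does not handle that predicate.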

The final set of tutorial exercises, for Week 11, will be online later today.

*Link:* Tutorial Exercises

Slides : Recording

This lecture followed on from Friday’s in looking at the use of hypothesis testing to detect correlations in data. The first section examined the *χ² test* for working with qualitative data, using two demonstration examples: possible correlation between coursework submission and exam grades; and the discovery of *collocations* in large text corpora.
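
For readers who want to see the arithmetic, the following is a minimal sketch of the χ² statistic for a 2×2 contingency table. The counts are invented for illustration and are not the figures used in the lecture; the function name `chi_squared` is likewise just a label chosen for this example. The value 3.841 is the standard 5% critical value for one degree of freedom.

```python
def chi_squared(observed):
    """Return the chi-squared statistic for a contingency table (list of rows)."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(observed):
        for j, obs in enumerate(row):
            # Expected count if row and column categories were independent.
            expected = row_totals[i] * col_totals[j] / total
            stat += (obs - expected) ** 2 / expected
    return stat


# Hypothetical counts (assumption, not real course data):
#                 passed  failed
table = [[30, 10],   # submitted coursework
         [10, 20]]   # did not submit

CRITICAL_5_PERCENT_DF1 = 3.841  # one degree of freedom for a 2x2 table

stat = chi_squared(table)
print(f"chi-squared = {stat:.3f}")
print("reject independence at 5%:", stat > CRITICAL_5_PERCENT_DF1)
```

With these made-up counts the statistic comes out at about 12.15, well above the critical value, so the null hypothesis of independence would be rejected at the 5% level.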

The second part of the lecture looked in more detail at some of the risks in misapplying statistical tests. Hypothesis testing can be a tremendously sensitive and powerful tool for discovering new science and identifying the connections between events. However, when used poorly it becomes misleading and unhelpful. The lecture covered a range of concerns about these risks: confusing correlation with causation; what *p*-values can tell us and what they can’t; when statistical “significance” is really about being statistically *detectable*; *p*-hacking, data dredging, outcome switching; and the current *replication crisis* in some experimental sciences. There is also hope and success, though: in the discovery of robust results through meta-analysis; the active discussions around reproducibility and predictive power in scientific research; and the many projects to record trials, replicate results, and improve publication of both negative and positive outcomes.
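
One way to see why data dredging is risky is a short calculation, sketched here under the simplifying assumption that the tests are independent and every null hypothesis is true. The function name is hypothetical and chosen only for this example.

```python
def chance_of_false_positive(num_tests, alpha=0.05):
    """Probability that at least one of num_tests independent tests
    comes out 'significant' at level alpha when no real effect exists."""
    return 1 - (1 - alpha) ** num_tests


for k in (1, 5, 20, 100):
    print(f"{k:3d} tests: {chance_of_false_positive(k):.2f}")
```

A single test at the 5% level gives a false positive one time in twenty, but with twenty such tests the chance of at least one spurious "significant" result is already about 0.64.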


Slides

Today’s lecture presented the idea of *correlation* in data sets: observing correlations through scatter plots; measuring them with the *correlation coefficient*; and using *hypothesis testing* to judge whether the measured value is evidence of a real correlation rather than chance coincidence. Each step gives a more precise and sensitive way to detect correlation.

Remember, though: *correlation does not imply causation*. More on that next time.
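
The following minimal sketch computes the correlation coefficient for a small data set and then asks how often a value at least that large would arise by chance. The data are invented for illustration. The chance calculation uses an exact permutation test, which is an alternative chosen here because it needs no statistical tables; it is not necessarily the procedure presented in the lecture.

```python
from itertools import permutations
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def permutation_p_value(xs, ys):
    """One-sided p-value: the fraction of all reorderings of ys whose
    correlation with xs is at least as large as the observed one."""
    observed = pearson_r(xs, ys)
    count = 0
    total = 0
    for shuffled in permutations(ys):
        total += 1
        # Small tolerance guards against floating-point rounding.
        if pearson_r(xs, shuffled) >= observed - 1e-12:
            count += 1
    return count / total


# Hypothetical paired measurements (assumption, not real course data).
xs = [1, 2, 3, 4, 5, 6]
ys = [2, 1, 4, 3, 6, 5]

r = pearson_r(xs, ys)
p = permutation_p_value(xs, ys)
print(f"r = {r:.3f}, one-sided permutation p-value = {p:.4f}")
```

Enumerating every permutation is only practical for very small samples such as this one (720 orderings); for larger data sets one would sample random permutations or use tabulated critical values instead.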


Notes and solutions for the *Information Retrieval* tutorial are now online.

I have rearranged the content of the remaining two tutorials. Because of the strike action there is no written assignment this year; normally this would be a practice exam paper reviewed in Week 11. Instead, the tutorial exercise for next week is two practice exam questions, and in the tutorial itself you will assess your solutions with the tutor using the original examiners’ marking guidelines.

The final tutorial exercises will be in the following week on the topic of *Statistical Analysis*. By that time I will have addressed the necessary syllabus content in lectures.

*Link:* Tutorial exercises