All posts by Ian Stark

Tutorial Notes: Statistical Analysis

Notes on solutions for this week’s tutorial are now online. Tutors are marking coursework submissions, and will return them with feedback in next week’s tutorial.

Thanks to everyone who contributed to the revision topics poll. There was strongest interest in material from the earlier parts of the course, so I’ll pick some past questions in that area to go over in lectures.

Link: Tutorial notes

Lecture 18: Hypothesis Testing and Correlation

Where the last lecture was about summary statistics for a single set of data, we now address multi-dimensional data with several linked sets of values among which we might look for correlations. This leads into several more sophisticated questions which are key to the effective application of statistics: how do we identify potential effects like correlation; how do we know when we have found evidence for an effect; and what might this tell us about any causal connections?
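As a rough illustration of what looking for correlation between two linked sets of values can involve, here is a minimal Python sketch of the Pearson correlation coefficient. The choice of Pearson's coefficient is an assumption (the lecture may use a different measure), and the function name `pearson_r` and the sample values are hypothetical, not the course survey data.

```python
import math


def pearson_r(xs, ys):
    """Sample Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length, at least 2 values")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations, and sums of squared deviations.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when one variable is constant")
    return sxy / math.sqrt(sxx * syy)


# Hypothetical illustrative values only (assumption: hours per night / per week).
sleep_hours = [6.0, 7.5, 8.0, 5.5, 7.0]
exercise_hours = [2.0, 4.0, 5.0, 1.0, 3.0]
r = pearson_r(sleep_hours, exercise_hours)
print(f"r = {r:.3f}")
```

A value of r close to +1 or -1 suggests a strong linear relationship and a value near 0 suggests little or none. On its own it does not say whether the effect is statistically significant, and it says nothing about causation.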

Tutorial Exercises: Update

The sheet of exercises for Tutorial 8 posted last night was unclear about whether Question 2 was measuring sleep or exercise hours. This was my error. It’s now corrected, and there’s a revised version online in the usual place. My apologies, and thanks to the student who raised this on Piazza and to Fabian for fixing it.

Tutorial Exercises: Statistical Analysis

Exercises for Tutorial 8 are now online, as well as notes on solutions for Tutorial 7.

This week you will apply statistical tests to the survey data gathered in the first lecture of the course, looking for possible connections between sleep, exercise, and choice of operating system. This uses techniques to be presented in this afternoon’s lecture and, awkwardly, next Tuesday’s. To support this, I’ve prepared and posted both sets of lecture slides in advance.

Thanks to everyone who submitted their coursework assignment yesterday. These are now going out to individual tutors, who will mark them and give you feedback on your work in the Week 11 tutorial.

Links: Tutorial exercises; Lecture slides

Tutorial Exercises: Information Retrieval

Exercises for Tutorial 7 are now online, together with notes on solutions for Tutorial 6.

Question 1 you can do immediately; Question 2 requires material covered in Friday’s lecture.

The exercises for this tutorial are shorter than those in previous weeks. That’s because you will also be working on the coursework assignment. This tutorial is an opportunity for you to talk about that and ask your tutor questions. Please plan for this: come to the tutorial prepared to discuss your progress on the assignment.

Link: Tutorial exercises

Lecture 15: Information Retrieval

Following the rectangular tables of relational databases and the triangular trees of semistructured data, the remaining Inf1-DA lectures will address the representation and analysis of more unstructured data. Today’s lecture provided a brief introduction to the classic information retrieval task of searching a large collection of documents to find those that match a simple query.
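The retrieval task described here can be sketched in a few lines of Python. This is only a minimal illustration, assuming that a document "matches" when it contains every term of the query. The lecture may define matching differently, and the names `tokenize` and `search` and the sample documents are hypothetical.

```python
import re


def tokenize(text):
    """Lowercase the text and return its set of alphanumeric terms."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def search(documents, query):
    """Return ids of documents that contain every term in the query."""
    terms = tokenize(query)
    if not terms:
        return []  # an empty query matches nothing in this sketch
    return [doc_id for doc_id, text in documents.items()
            if terms <= tokenize(text)]


# Hypothetical toy collection, for illustration only.
docs = {
    "d1": "Relational databases store tables",
    "d2": "Trees represent semistructured data",
    "d3": "Tables and trees of data",
}
print(search(docs, "trees data"))
```

Scanning every document for each query is fine for a toy collection. A large collection would normally call for a precomputed index, which this sketch leaves out.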

Lecture 14: Example Corpora Applications

Corpora are widely used for computational research into language, and for engineering natural-language computer systems. In linguistics, they make it possible to do real experimental science: to formulate hypotheses about the structure of languages, or changes in language between different places, times or people; and then test these on data. In building applications that handle text or speech, corpora provide the mass quantities of raw material used for machine learning and other algorithms.