Lecture 15: Information Retrieval

Following the rectangular tables of relational databases and the triangular trees of semistructured data, the remaining Inf1-DA lectures will address the representation and analysis of more unstructured data. Today’s lecture provided a brief introduction to the classic information retrieval task of searching a large collection of documents to find those that match a simple query.

The focus here is not on specific algorithms or data representations, but on specifying the problem, how to recognise when you have a solution, and how to rate the performance of different competing solutions. In this case that means distinguishing between precision and recall in information retrieval; considering how each might be important in different problem domains; and the use of blends like the F-score to combine both measures.
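To make these measures concrete, here is a minimal Python sketch using the standard definitions: precision is the fraction of retrieved documents that are relevant, recall is the fraction of relevant documents that were retrieved, and the F-score is a weighted harmonic mean of the two. The function name, the document identifiers, and the `beta` weighting parameter are illustrative assumptions, not taken from the lecture slides.

```python
def precision_recall_f(retrieved, relevant, beta=1.0):
    """Return (precision, recall, F-score) for one query.

    retrieved: documents the system returned (hypothetical IDs)
    relevant:  documents that are actually relevant to the query
    beta:      weighting; beta > 1 favours recall, beta < 1 favours precision
    """
    retrieved = set(retrieved)
    relevant = set(relevant)
    true_positives = len(retrieved & relevant)

    # Guard against empty sets to avoid division by zero.
    precision = true_positives / len(retrieved) if retrieved else 0.0
    recall = true_positives / len(relevant) if relevant else 0.0

    if precision == 0.0 and recall == 0.0:
        return precision, recall, 0.0

    b2 = beta * beta
    f_score = (1 + b2) * precision * recall / (b2 * precision + recall)
    return precision, recall, f_score


# Made-up example: 4 documents returned, 5 actually relevant, 2 in common.
retrieved_docs = {"d1", "d2", "d3", "d4"}
relevant_docs = {"d2", "d4", "d5", "d6", "d7"}
p, r, f1 = precision_recall_f(retrieved_docs, relevant_docs)
print(f"precision={p:.2f} recall={r:.2f} F1={f1:.2f}")
```

In a domain where missing a relevant document is costly, such as patent search, one might set `beta` above 1 to weight recall more heavily; where users only look at the first few results, a value below 1 favours precision.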

The lecture finished with material on IBM’s Watson system performing general question-answering on Jeopardy!; searching for patents; analysing the vibe of a brewery; and recommending recipes with the Cognitive Cooking Truck.

Link: Slides for Lecture 15


The links below are all about Watson. I am thoroughly impressed by this project and its synthesis of many ideas from across informatics, from natural language processing and knowledge representation through to information retrieval and parallel algorithms. To find out more about the technology behind Watson, download IBM’s 16-page Red Book on the topic.

IBM DeepQA Research; IBM Watson; multiple Watson videos.
Watson publicity video from the lecture, and the brief practice round.
Patent Fox
Watson applied to a classic information retrieval task: Patent Fox. This was the entry from University of California, Berkeley into a competition for Watson applications.
Brewery Master
See blog article and video about this from Watson in the Wild.
The Cognitive Cooking Truck, and more from the Institute of Culinary Education about the food truck.