Lecture 20: Type-Checking for SQLizability

18 March 2010

Review of what LINQ can do in C# to lower the impedance mismatch between general-purpose programming languages and database query languages. Advantages and limitations of this approach: in particular, the possibility of runtime failure, the limits of abstraction, and the role of reflection.

One possible route to improving this: introduction of a type and effect system to track behaviour that cannot be represented in SQL. This allows arbitrary use of parametrization and higher-order functions, while providing compile-time guarantees of on-database execution. Outline of proof via a strongly-normalizing rewrite system. Several examples of how this more deeply embeds querying into a functional language, allowing modular construction of code and opening up queries to conventional compiler rewriting as well as SQL database-specific optimisations.
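
To make the idea of tracking non-SQL behaviour concrete, here is a minimal sketch in Java. It is not the system from the lecture: a real type and effect system does this check statically, at compile time, whereas this sketch walks an expression tree at run time. It only illustrates how an effect propagates from subexpressions to the whole query body. All names (Sqlizability, HostCall, readSensor and so on) are hypothetical.

```java
// Hypothetical mini-language for query bodies; every name is illustrative.
final class Sqlizability {

    interface Expr {
        // True if evaluating this expression may do something
        // that cannot be represented in SQL.
        boolean hasHostEffect();
    }

    // A reference to a table column: expressible in SQL.
    static final class Column implements Expr {
        final String name;
        Column(String name) { this.name = name; }
        public boolean hasHostEffect() { return false; }
    }

    // An integer literal: expressible in SQL.
    static final class Const implements Expr {
        final int value;
        Const(int value) { this.value = value; }
        public boolean hasHostEffect() { return false; }
    }

    // A comparison: its effect is the union of its operands' effects.
    static final class Less implements Expr {
        final Expr left;
        final Expr right;
        Less(Expr left, Expr right) { this.left = left; this.right = right; }
        public boolean hasHostEffect() {
            return left.hasHostEffect() || right.hasHostEffect();
        }
    }

    // A call back into the host program: assumed not expressible in SQL.
    static final class HostCall implements Expr {
        final String function;
        final Expr argument;
        HostCall(String function, Expr argument) {
            this.function = function;
            this.argument = argument;
        }
        public boolean hasHostEffect() { return true; }
    }

    // A query body may run on the database only if it is effect-free.
    static boolean isSqlizable(Expr body) {
        return !body.hasHostEffect();
    }

    public static void main(String[] args) {
        Expr onDatabase = new Less(new Column("age"), new Const(18));
        Expr mixed = new Less(new HostCall("readSensor", new Const(0)), new Const(18));
        System.out.println(isSqlizable(onDatabase)); // true
        System.out.println(isSqlizable(mixed));      // false
    }
}
```

The point of doing this in the type system rather than at run time is that the second query would be rejected before the program ever runs.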

This lecture is based on the work of Ezra Cooper, in particular the following paper:

You can follow his research blog for more on this.

Homework

Please fill out and submit an anonymous feedback questionnaire. Thank you.


2010 Turing Lecture

15 March 2010

Embracing Uncertainty: The New Machine Intelligence

Christopher Bishop
Microsoft Research Cambridge
5.00pm / 5.30pm Thursday 18 March 2010
Appleton Tower

Computers are traditionally viewed as logical machines which follow precise, deterministic instructions.

The real world in which they operate, however, is full of complexity, ambiguity, and uncertainty. In this year’s Turing Lecture, Professor Chris Bishop discusses the field of machine learning, and shows how uncertainty can be modelled and quantified using probabilities.

He looks at the recent developments in probabilistic modelling which have greatly expanded the variety and scale of machine learning applications, and he explores the future potential for this technology.

In honour and recognition of Alan Turing’s contribution in the field of computing, the IET and the BCS established the Turing Lecture in 1999. It is a world-leading event, in which an acknowledged expert in the field presents a topic from current research in computer science.

Professor Bishop is Chief Research Scientist at the Microsoft Research Laboratory in Cambridge, and also holds a Chair in Computer Science in The University of Edinburgh School of Informatics. He presented the 2008 Royal Institution Christmas Lectures Hi-Tech Trek — The Quest for the Ultimate Computer.

He’s an excellent speaker, and this looks to be an interesting talk about recent advances in and applications of machine learning. There is a reception at 5pm, with the lecture at 5.30pm, and a ticket-only event afterwards. The lecture is free, but the IET ask for registration; which in turn means you need to create an account at the IET website; which means handing over address, phone number, eye colour, etc. Sorry about that.

Links: Registration; Video of this lecture in London; The British Computer Society on the Turing Lecture; The Institution of Engineering and Technology on the Turing Lecture.


Lecture 19: Heterogeneous Metaprogramming in F#

15 March 2010

General overview of metaprogramming, with examples in different languages, from C macros through Java reflection to MetaOCaml. Brief summary of the F# language, its history, features, and upcoming release in VS 2010.

Metaprogramming in F#, and how it can be combined with LINQ for database queries, runtime code-generation, and outsourcing computation. How to run Conway’s Life on a GPU without changing your code. This is based on the following paper:

To find out more about this, try reading the series of articles about accelerating data-parallel code in F# on Tomáš Petříček’s blog.

Finally, a job ad to work with the F# team.

Links: Slides; F# Developer Network; F# at Microsoft Research; Visual F# Developer Library; Don Syme as Geek of the Week.


Lecture 18: Bridging Query and Programming Languages

11 March 2010

How the LINQ framework for Language-Integrated Query aims to reduce the impedance mismatch between programming languages and query languages.

General background on Microsoft’s .NET Framework: it’s a large platform for program development and, as part of this, has some interesting programming language features. In particular, it supports working in multiple languages that exchange strongly typed data and code at a high level.

Review of the standard SQL-query-as-a-string technique in Java and (almost identically) in C#: its advantages and limitations, and what LINQ does to lift some of those limitations. There is convenient SQL-style syntax, but that’s a distraction: the key advance is to connect the semantics of the two language domains. This then brings in type checking, smart IDEs, compiler optimisations, automatic query bundling, abstraction of query constructors, query constructor constructors, user-extensible query libraries, and so on.

How this requires (and contributes) a sackful of additional language features, most taken from existing research languages, which themselves have wider application; LINQ as a Trojan horse.

The end result is that a LINQ programmer can write a simple boolean test in C#, or any .NET language, and use it to filter all kinds of data: an array in C#, a table in SQL, or a tree in XML. All being well, LINQ will inspect the semantics of the underlying expression and convert it to the right domain.
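
As a rough illustration of converting one test to more than one domain, here is a minimal sketch in Java rather than C#. It does not use LINQ or any .NET API: the GreaterThan class and the people table are hypothetical, and the test is represented as a small data structure by hand, where LINQ obtains an expression tree from the compiler. The same object is then run against an in-memory array and rendered as a SQL WHERE clause.

```java
import java.util.ArrayList;
import java.util.List;

// A boolean test of the form "field > bound", held as data rather than
// as opaque code. All names are hypothetical; this is not the LINQ API.
final class GreaterThan {
    final String field;
    final int bound;

    GreaterThan(String field, int bound) {
        this.field = field;
        this.bound = bound;
    }

    // Domain 1: evaluate the test directly on an in-memory value.
    boolean test(int value) {
        return value > bound;
    }

    // Domain 2: render the same test as the body of a SQL WHERE clause.
    String toSql() {
        return field + " > " + bound;
    }

    // Keep the array elements that pass the test, in their original order.
    static List<Integer> filter(int[] values, GreaterThan predicate) {
        List<Integer> kept = new ArrayList<>();
        for (int v : values) {
            if (predicate.test(v)) {
                kept.add(v);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        GreaterThan over17 = new GreaterThan("age", 17);
        System.out.println(filter(new int[] {12, 18, 40}, over17));
        // prints [18, 40]
        System.out.println("SELECT * FROM people WHERE " + over17.toSql());
        // prints SELECT * FROM people WHERE age > 17
    }
}
```

Because the test is data, a library can inspect it and choose the target domain, which is what an ordinary compiled boolean method does not allow.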

Next week: what else it can do, when it doesn’t work, and how that might be fixed.

Link: Slides

What’s in a Name?

During the lecture I referred to the work of Mike Just and others on knowledge-based authentication. There is a discussion of this on the Light Blue Touchpaper security blog, and you can also read the recently published paper and some slides from a recent presentation.

Update: Also appears on BBC News, Telegraph, etc.


Lecture 17: Using SQL from Java

8 March 2010

Back to regular lectures after last week’s guest speakers.

The next set of lectures is about ways to integrate access to domain-specific languages into general-purpose programming languages. This issue of cross-language integration arises in a variety of settings, and often the power and features of a language become diluted or lost when it is mixed with another.

As a running example we’ll use database access through SQL, as driven from a general-purpose language like Java.

This lecture covered some of the context for cross-language working, notably the security risks of HTML/JavaScript/SQL injection, both malicious and inadvertent. Various examples: SQL for SkyServer; Google Buzz XSS vulnerability.

Basic SQL access from Java, via JDBC/ODBC. Structured SQL is treated as flat strings, with some additional structure available through prepared statements.
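
The difference between the two styles can be sketched as follows. This is a minimal illustration, not code from the lecture: the students table and the StudentQueries class are hypothetical, and countMatches needs a live JDBC Connection from some driver, which is assumed rather than shown. The main method only prints the concatenated strings, to show how input can change the structure of a flat-string query.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;

// Hypothetical example: the table, columns and class name are illustrative.
final class StudentQueries {

    // Flat-string style: the query is assembled by concatenation, so the
    // input can alter the structure of the SQL, not just supply a value.
    static String unsafeQuery(String name) {
        return "SELECT id FROM students WHERE name = '" + name + "'";
    }

    // Prepared style: the shape of the query is fixed in advance, and '?'
    // marks a slot that is later filled with a value treated purely as data.
    static final String SAFE_QUERY = "SELECT id FROM students WHERE name = ?";

    // Counts matching rows using a prepared statement.
    // Assumes the caller supplies an open JDBC connection.
    static int countMatches(Connection conn, String name) throws SQLException {
        try (PreparedStatement stmt = conn.prepareStatement(SAFE_QUERY)) {
            stmt.setString(1, name); // JDBC parameters are numbered from 1
            try (ResultSet rows = stmt.executeQuery()) {
                int count = 0;
                while (rows.next()) {
                    count++;
                }
                return count;
            }
        }
    }

    public static void main(String[] args) {
        System.out.println(unsafeQuery("Alice"));
        // prints SELECT id FROM students WHERE name = 'Alice'
        System.out.println(unsafeQuery("x' OR '1'='1"));
        // prints SELECT id FROM students WHERE name = 'x' OR '1'='1'
    }
}
```

In the second call the WHERE clause has gained an extra condition that is always true, which is the injection problem; with the prepared statement the same input would simply be compared against the name column.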

Link: Slides.

Homework

Have a look at these two tutorials on database access in Java and C#.

You don’t need to work through every detail, but the key is to see how these languages provide control of SQL.

Twitter have a Scala library called Querulous for connecting to databases:

Look at the basic usage examples to see what Scala language features they use to simplify construction of correct SQL.