Lecture 7: Using SQL from Java

Advances in Programming Languages - Lecture 7 - Using SQL from JavaThis lecture begins a series on integrating domain-specific languages with general-purpose languages: today, SQL within Java.

SQL is a domain-specific language for programming queries on relational databases. Queries may be complex, with declarative and imperative components, and are often constructed by programs rather than directly by programmers.

Standard frameworks for creating such queries build them out of plain strings, taking little account of their structure as programs. This can also happen in other settings where a program in one language is used to manipulate code written in another.

This unstructured string manipulation can lead to a range of injection vulnerabilities in such systems. The lecture included a demonstration of this in the SkyServer database, a description of a briefly dramatic Twitter vulnerability, and some recent vulnerabilities in a WordPress plugin and Apple’s Mac OS X server.

The standard technology for managing SQL queries in Java is through Java Database Connectivity (JDBC) libraries. While JDBC is sophisticated in its handling of multiple alternate database backends — local, remote, and multi-vendor — building queries themselves typically falls back on string concatenation.

Use of prepared statements raises the level of abstraction a little: queries with placeholders are filled with suitable values. This is certainly better, but even this argument-passing is still a manual rebuild of a mechanism that is already ubiquitous and expertly managed in the host language. Can we do better?

SQL queries are programs in a rich high-level language; when we treat them as unstructured text we lose much of what we know about programming.

Link: Slides for Lecture 7

Homework

The preliminary report for your coursework assignment is due in at 3pm this Friday, 10 October. If you have questions about this please email me or drop in during my office hour, 1030–1130 on Wednesday.

Link: Preliminary report requirements

References

The examples I showed were all from SkyServer Data Release 7; the most recent version, Data Release 10, appears to have blocked the HTML injection loophole I demonstrated.

The Javascript injection into Twitter, and its self-tweeting tweet, was reported in the press and also by Twitter themselves.

The other SQL injection vulnerability reports were two I picked from standard repositories as being recent examples in familiar software. These are listed in many locations, and I took screenshots of a couple of standard reports. The “CVE” codes are unique identifiers in the “Common Vulnerabilities and Exposures” system.