Lecture 10: Structuring XML

Every well-formed XML document is neatly arranged as a tree, with names for element nodes and all their attributes. This is enough for basic tools to correctly transmit and process XML; but for many applications it is useful to add more precise domain-specific constraints that we expect documents to satisfy. For this we have XML schema languages: specialised languages for describing types of XML document. This lecture covered one in particular, the Document Type Definition language DTD.

A DTD is a little like a type from a programming language: we can check that a value has a certain type, and a function may require arguments of a certain type; similarly we can validate an XML document against a schema, and some processing operation may require as input an XML document matching a certain schema. However, a single XML document may routinely match more than one schema — there is no concept of “the” schema for a document — and XML schema languages often appear more complex than familiar type systems.
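The difference between a well-formed document and a valid one can be seen with a parser that does not validate. The sketch below is only an illustration: the `country` DTD and the three sample documents are made up for this note, not taken from the lecture. It relies on Python's standard `xml.etree.ElementTree`, which reads a DOCTYPE declaration but does not check the document against it, so a document that breaks the DTD is still accepted as long as it is well-formed.

```python
import xml.etree.ElementTree as ET

# Hypothetical DTD, written as an internal subset so the example is self-contained.
DTD = """<!DOCTYPE country [
  <!ELEMENT country (name, population)>
  <!ELEMENT name (#PCDATA)>
  <!ELEMENT population (#PCDATA)>
  <!ATTLIST country code CDATA #REQUIRED>
]>
"""

# Well-formed and consistent with the DTD above (illustrative values only).
valid_doc = DTD + '<country code="XX"><name>Example</name><population>1</population></country>'

# Well-formed, but invalid against the DTD: no <population>, no code attribute.
invalid_doc = DTD + '<country><name>Example</name></country>'

# Not even well-formed: the <name> element is never closed.
broken_doc = DTD + '<country><name>Example</country>'


def is_well_formed(text):
    """Return True if the non-validating standard-library parser accepts the text."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


for label, doc in [("valid", valid_doc), ("invalid", invalid_doc), ("broken", broken_doc)]:
    print(label, is_well_formed(doc))
```

Only the third document is rejected here. Telling the first two apart needs a validating parser; the third-party `lxml` library, for example, offers DTD validation, but that is outside what this sketch shows.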

This lecture set out the details and usage of XML DTDs, and also how the content of a relational database can be transmitted through XML (and why). There were also announcements about the EUSA Teaching Awards and Innovative Learning Week, with an extended diversion on Unicode and the history of character sets.
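As a rough illustration of sending relational data through XML, the sketch below dumps one table as a tree with a `row` element per tuple and a child element per column. This layout is an assumption made for this note, not necessarily the encoding shown in the lecture, and the `student` table, its contents, and the `table_to_xml` helper are all hypothetical.

```python
import sqlite3
import xml.etree.ElementTree as ET


def table_to_xml(conn, table):
    """Serialise every row of `table` as a <row> element with one child per column.

    Assumes `table` is a trusted name that is also a legal XML element name.
    """
    cursor = conn.execute(f'SELECT * FROM "{table}"')
    columns = [description[0] for description in cursor.description]
    root = ET.Element(table)
    for row in cursor:
        row_element = ET.SubElement(root, "row")
        for column, value in zip(columns, row):
            cell = ET.SubElement(row_element, column)
            # Assumption: a SQL NULL is written as an empty element.
            if value is not None:
                cell.text = str(value)
    return root


# Hypothetical in-memory database with a single small table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE student (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany("INSERT INTO student VALUES (?, ?)", [(1, "Ada"), (2, "Alan")])

student_xml = table_to_xml(conn, "student")
print(ET.tostring(student_xml, encoding="unicode"))
```

A DTD for this layout could then state that `student` contains any number of `row` elements, each holding an `id` followed by a `name`, which gives the receiver a way to check the data it is sent.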

Link: Slides for Lecture 10

Homework

  1. Find out about Postel’s Law: what it says, what that means for computer languages and protocols, and what people think about it.

  2. Look inside the XML of SVG, .docx, and one of the other specialised XML formats. See Tuesday’s lecture for some ideas and instructions.

Miscellany
