
This morning’s lecture moved from the strict rectangles of structured data to the more generous triangles of semistructured data. It gave an overview of what kind of data counts as “semistructured”; the idea of trees as a mathematical model of data; the particular form of trees in the XPath data model; and their textual representation in XML, the Extensible Markup Language.
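To make the link between XML text and trees concrete, here is a minimal sketch using Python's standard `xml.etree.ElementTree` module. The sample document and its element names (`library`, `book`, `title`, `author`) are invented for illustration and do not come from the lecture. Note that ElementTree implements only a limited subset of XPath, and its tree differs in some details from the full XPath data model.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document, invented for this illustration.
XML_TEXT = """
<library>
  <book year="1999">
    <title>First Book</title>
    <author>A. Writer</author>
  </book>
  <book year="2005">
    <title>Second Book</title>
    <author>B. Writer</author>
    <author>C. Writer</author>
  </book>
</library>
""".strip()

# Parsing turns the flat text into a tree of element nodes.
root = ET.fromstring(XML_TEXT)


def show(node, depth=0):
    """Print one line per element, indented by its depth in the tree."""
    text = (node.text or "").strip()
    attrs = dict(node.attrib) if node.attrib else ""
    print("  " * depth + node.tag, attrs, text)
    for child in node:
        show(child, depth + 1)


show(root)

# Path expressions navigate the tree from the root downwards.
titles = [t.text for t in root.findall("./book/title")]

# A predicate in square brackets filters on an attribute value.
authors_2005 = [a.text for a in root.findall("./book[@year='2005']/author")]
```

The indentation printed by `show` mirrors the nesting of elements in the text, which is exactly the parent-child structure of the tree.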

XML also has a large number of domain-specific variants. Each of these is valid XML and uses a standardised set of element types to give a custom language for representing data relevant to a particular field: from musical scores to financial trading.
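SVG, the vector graphics format that appears in the homework, is one such variant. The sketch below suggests what "valid XML" means in practice: a generic XML parser reads an SVG document without knowing anything about graphics. The tiny drawing is invented for illustration; only the SVG namespace URI and the `circle` and `rect` element types are taken from the SVG standard.

```python
import xml.etree.ElementTree as ET

# The namespace URI that identifies SVG element types.
SVG_NS = "http://www.w3.org/2000/svg"

# Hypothetical drawing, invented for this illustration.
SVG_TEXT = """<svg xmlns="http://www.w3.org/2000/svg" width="100" height="100">
  <circle cx="50" cy="50" r="40" fill="red"/>
  <rect x="10" y="10" width="20" height="20"/>
</svg>"""

svg_root = ET.fromstring(SVG_TEXT)

# ElementTree writes namespaced tags as "{namespace}localname",
# so strip the namespace prefix to get the plain element type names.
shapes = [el.tag.replace("{" + SVG_NS + "}", "") for el in svg_root]

# Attributes are ordinary XML attributes, returned as strings.
circle = svg_root.find("{" + SVG_NS + "}circle")
radius = circle.get("r")
```

The parser sees only elements and attributes; it is the standardised vocabulary that gives them meaning as shapes.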

Links: Slides for Lecture 9; Recording of Lecture 9

Homework
1. Read This
2. Do This
  • Find an SVG file and open it in a text editor to study its XML content.

  • Find a Microsoft Office .docx file and look at the XML content inside it. This format (OOXML) is in fact a zipped archive of XML files, so you will need to unzip it first. Depending on your platform, this may require renaming the .docx extension to .zip.

    Link: Wikipedia on Microsoft’s OOXML format
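If you would rather not rename the file, the unzipping step can also be done in code. The following is a minimal sketch using Python's standard `zipfile` module. The helper names `list_xml_parts` and `read_root_tag` are made up for this example, and the default part `word/document.xml` is where a Word document normally keeps its main text.

```python
import zipfile
import xml.etree.ElementTree as ET


def list_xml_parts(docx_path):
    """Return the names of the .xml files stored inside a .docx archive."""
    with zipfile.ZipFile(docx_path) as archive:
        return [name for name in archive.namelist() if name.endswith(".xml")]


def read_root_tag(docx_path, part="word/document.xml"):
    """Parse one XML file from the archive and return its root element tag."""
    with zipfile.ZipFile(docx_path) as archive:
        with archive.open(part) as handle:
            return ET.parse(handle).getroot().tag


# Example use, assuming a file called "report.docx" (a hypothetical name)
# exists in the current directory:
#
#     for name in list_xml_parts("report.docx"):
#         print(name)
#     print(read_root_tag("report.docx"))
```

Both helpers read the archive in place, so the .docx file itself is left unchanged.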

References

To learn more about XML, try any of the following.

Lecture 9: Trees and XML