Every *well-formed* XML document is neatly arranged as a tree, with names for element nodes and all their attributes. This is enough for basic tools to correctly transmit and process XML; but for many applications it is useful to add more precise domain-specific constraints that we expect documents to satisfy. For this we have XML *schema languages*: specialised languages for describing types of XML document. This lecture covered one in particular, the *Document Type Definition* language DTD.

Continue reading Lecture 10: Structuring XML

# Category Archives: Lecture log

# Lecture 9: XML

From the strict rectangles of structured data to the more generous triangles of *semistructured data*. This lecture gave an overview of what might qualify data as semistructured; trees in general as a mathematical model of data; the particular form of trees in the *XPath data model*; and their textual representation in XML, the *Extensible Markup Language*.

Finally, some examples of real XML data: from musical scores to financial trading.
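
To make the tree view concrete, here is a minimal sketch using Python's standard `xml.etree.ElementTree` module. The `library` document and the helper `outline` are invented for illustration and are not taken from the lecture. Note that the second `book` has no `author`: that kind of irregularity is typical of semistructured data.

```python
import xml.etree.ElementTree as ET

# Hypothetical document, invented for illustration only.
XML_TEXT = """
<library>
  <book year="1979">
    <title>Example Title</title>
    <author>A. Writer</author>
  </book>
  <book year="2001">
    <title>Another Title</title>
  </book>
</library>
"""


def outline(element, depth=0):
    """Return one line per element node, indented by its depth in the tree."""
    lines = ["  " * depth + element.tag]
    for child in element:
        lines.extend(outline(child, depth + 1))
    return lines


root = ET.fromstring(XML_TEXT)
tree_lines = outline(root)
print("\n".join(tree_lines))

# ElementTree supports a limited subset of XPath for selecting nodes.
titles = [t.text for t in root.findall("./book/title")]
years = [b.get("year") for b in root.findall("book")]
print(titles, years)
```

The path expressions here are only the small XPath subset that `ElementTree` understands, which is enough to show how nodes are addressed by their position in the tree.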


# Lecture 8: SQL Queries

Today’s lecture was the final one on *Structured Data* and covered a range of database topics: *ACID* properties for transactions; the *NoSQL* movement; nested SQL queries, set operations, and aggregate queries; ultimate physical limits to computation; the wonders of nature captured in *SkyServer*; and the idea of doing scientific research and experiments from inside the database.
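
As a rough illustration of three of these query forms, the sketch below runs an aggregate query, a nested query and a set operation against an in-memory SQLite database through Python's `sqlite3` module. The `takes` table and its contents are made up for this example; they are not from the lecture.

```python
import sqlite3

# Hypothetical table and rows, invented for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE takes (student TEXT, course TEXT, mark INTEGER);
    INSERT INTO takes VALUES
        ('Ada', 'DA', 72), ('Ben', 'DA', 58), ('Cy', 'DA', 65),
        ('Ada', 'OOP', 80), ('Cy', 'OOP', 51);
""")

# Aggregate query: class size and top mark for each course.
per_course = conn.execute("""
    SELECT course, COUNT(*), MAX(mark)
    FROM takes
    GROUP BY course
    ORDER BY course
""").fetchall()

# Nested query: students whose DA mark is above the DA average.
above_avg = conn.execute("""
    SELECT student FROM takes
    WHERE course = 'DA'
      AND mark > (SELECT AVG(mark) FROM takes WHERE course = 'DA')
    ORDER BY student
""").fetchall()

# Set operation: students taking DA but not OOP.
da_not_oop = conn.execute("""
    SELECT student FROM takes WHERE course = 'DA'
    EXCEPT
    SELECT student FROM takes WHERE course = 'OOP'
""").fetchall()

print(per_course, above_avg, da_not_oop)
```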


# Lecture 7: SQL

This lecture introduced the basic structure and format of SQL queries: `SELECT … FROM … WHERE …`. That’s enough to write a huge range of queries, from single summary statistics to large integrated views that bring together multiple tables.
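
Here is a small sketch of both ends of that range, run through Python's `sqlite3` module so that it is self-contained. The `student` and `takes` tables are hypothetical, invented for this example.

```python
import sqlite3

# Hypothetical schema and data, invented for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE student (id INTEGER PRIMARY KEY, name TEXT, year INTEGER);
    CREATE TABLE takes (student_id INTEGER, course TEXT, mark INTEGER);
    INSERT INTO student VALUES (1, 'Ada', 1), (2, 'Ben', 2), (3, 'Cy', 1);
    INSERT INTO takes VALUES
        (1, 'DA', 72), (2, 'DA', 58), (3, 'DA', 65), (1, 'OOP', 80);
""")

# A single summary statistic: the mean mark on one course.
(mean_da,) = conn.execute(
    "SELECT AVG(mark) FROM takes WHERE course = 'DA'"
).fetchone()

# A view that brings two tables together: marks of first-year students.
rows = conn.execute("""
    SELECT student.name, takes.course, takes.mark
    FROM student, takes
    WHERE student.id = takes.student_id
      AND student.year = 1
    ORDER BY student.name, takes.course
""").fetchall()

print(mean_da, rows)
```

Both queries follow the same `SELECT`, `FROM`, `WHERE` shape; the second simply lists two tables and uses the `WHERE` clause to link them.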


# Lecture 6: Tuple Relational Calculus

Another day, another language. This one is the *Tuple Relational Calculus* for specifying queries that describe information to be extracted from the linked tables of a relational database. There’s a separation of roles here: the tuple relational calculus is good for succinctly stating what we want to find out; while relational algebra from the last lecture describes how to combine and sift tables to extract that information from the data. We distinguish *what* information we want from *how* to compute it.
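
One loose way to get a feel for the declarative style is a Python set comprehension, which reads much like a calculus expression: it states the condition a tuple must satisfy and leaves the evaluation order to the language. This is only an analogy, not the formal calculus, and the `Student` and `Takes` relations below are invented for illustration.

```python
from collections import namedtuple

# Hypothetical relations, invented for illustration only.
Student = namedtuple("Student", ["id", "name", "year"])
Takes = namedtuple("Takes", ["student_id", "course"])

students = {Student(1, "Ada", 1), Student(2, "Ben", 2), Student(3, "Cy", 1)}
takes = {Takes(1, "DA"), Takes(2, "DA"), Takes(1, "OOP")}

# Roughly: { s | s in Student and
#                there exists t in Takes with
#                t.student_id = s.id and t.course = "DA" }
da_students = {
    s
    for s in students
    if any(t.student_id == s.id and t.course == "DA" for t in takes)
}

names = sorted(s.name for s in da_students)
print(names)
```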


# Lecture 5: Relational Algebra

Tuesday’s lecture presented a mathematical language for slicing and dicing the structured tables of the relational model: selection, projection, renaming; union, intersection, difference; cross product, join, equijoin and natural join. A key feature of this *relational algebra* is that just six of these operations are enough to capture an extremely wide range of queries and transformations of data. Database implementors work hard to build highly efficient engines to carry out these operations, which can then support many different kinds of user application.
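
A few of these operations can be sketched in plain Python, treating a relation as a list of dictionaries. The functions `select`, `project` and `natural_join` and the sample tables are assumptions made for this example; a real database engine implements them very differently and far more efficiently.

```python
def select(relation, predicate):
    """Selection: keep the tuples that satisfy the predicate."""
    return [row for row in relation if predicate(row)]


def project(relation, attributes):
    """Projection: keep only the named attributes, dropping duplicate tuples."""
    seen, result = set(), []
    for row in relation:
        key = tuple(row[a] for a in attributes)
        if key not in seen:
            seen.add(key)
            result.append(dict(zip(attributes, key)))
    return result


def natural_join(left, right):
    """Natural join: pair up tuples that agree on all shared attribute names."""
    result = []
    for l_row in left:
        for r_row in right:
            shared = l_row.keys() & r_row.keys()
            if all(l_row[a] == r_row[a] for a in shared):
                result.append({**l_row, **r_row})
    return result


# Hypothetical relations, invented for illustration only.
students = [{"sid": 1, "name": "Ada"}, {"sid": 2, "name": "Ben"}]
takes = [
    {"sid": 1, "course": "DA"},
    {"sid": 2, "course": "DA"},
    {"sid": 1, "course": "OOP"},
]

# Names of students taking DA: project(select(students JOIN takes)).
joined = natural_join(students, takes)
da_only = select(joined, lambda row: row["course"] == "DA")
names = project(da_only, ["name"])
print(names)
```

Each function takes relations in and gives a relation back, which is what lets the operations be composed freely into larger queries.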


# Lecture 4: From ER Diagrams to Relational Models

Now that we have both the high-level visual language of ER diagrams and the more formal structures of the relational model, this lecture presented some recipes for translating from the first into the second. This isn’t always an exact match, and for any particular ER diagram we might go back to its original source to decide how to best represent it as a relational model. Even so, this kind of step-by-step staging towards a fully formal representation is an effective route to capturing the subtleties of real-world systems.


# Lecture 3: The Relational Model

This morning’s lecture introduced some refinements to the Entity-Relationship modelling of Lecture 2 and then set out the basics of the *Relational Model* for structured data. Where ER diagrams aim to give a conceptual language for describing things as they are, and have applications well outside databases for general organisation and management, the relational model is explicitly intended as a mathematically precise scheme for the computer-assisted building and querying of large datasets.


# Lecture 2: Entities and Relationships

Today’s lecture introduced the basics of *Entity-Relationship Modelling* and in particular the graphical language of *ER diagrams*. This plays an important role in planning and structuring databases, bridging the gap between informal concept design and the logical precision required for machine implementation.


# Lecture 1: Introduction

This morning’s opening lecture for the course presented a general overview of the topics in *Inf1-DA*, some of the motivations behind them, and information about practical arrangements. You can read all this in the slides which are also available by clicking on the title slide image to the right.
