Category Archives: Theon PostgreSQL Schema Management

Articles related to using and developing Theon, a PostgreSQL Schema Management tool.

Adopting Theon – Part 3.4

The first part of this series discussed why you might use Theon as a Schema Management Tool for your PostgreSQL database. The second part part covered the necessary packages and installation. This final part is an introductory tutorial.

In the first part of this tutorial we created a database and started to manage it in Theon.  In the second part we configured TheonCoupler to load data automatically into the database. In the third part we configured TheonUI to allow access to the database and made a custom desktop. In this fourth and final part we will package and distribute the result.

The Theon toolkit has some commands to build and package a ModelLibrary for distribution. The resulting distribution will then have the toolkit as a pre-requisite.

We need to do some work with Git first, as the ModelLibrary is held as a Git repository.

rm -fr derived
git add factory schemat
git commit --message "first pass"

Now we can build a release (which will have version number 1.1 by default).

ttkm release

Check the result.

ttkm display
* >
  Establishing UnitArchive
  - Checking for UnitArchive in Folder[CWD]
    - Found Unit[packages] in Folder[CWD]
    - This[packages]
* packages[CWD] >
  History of Unit[packages]
  - Release Directory:         CWD::.release
  - Release Directory Index:   CWD::.release/index
  - Release SpecFile:          CWD::packages.spec
  - Current Release Version:   packages-1.1
  - Current Release Revision:  07f1a86d37bc74a6b5a2bd0c5157650c60501b3a

Now build a source and binary package for the current release above and list the results.

ttkm package rebuild
ls package
packages-1.1-1.src.rpm                   packages-theon-library-devel-1.1-1.noarch.rpm
packages-theon-library-1.1-1.noarch.rpm  packages-theon-library-ui-1.1-1.noarch.rpm

Install the packages.

rpm -U package/*.noarch.rpm

Now we can test the installation, first drop our existing live database, change directory and install from distribution.

dropdb packages
cd ~/
ttkm packages install

Feed in the data in the same way but using the coupler configuration in the distribution.

rpm -qa --qf '"%{NAME}","%{VERSION}","%{RELEASE}","%{ARCH}","%{BUILDTIME}"\n'| ttkm packages stream agent rpm refresh

rpm -qa --qf '["%{NAME}","%{ARCH}","%{REQUIRENAME}","%{REQUIREVERSION}"\n]'| ttkm packages stream agent rpm refresh to packdep couple packdep

Finally start up TheonUI to access the content using the configuration and metadata in the distribution.

ttkm packages server ui

Note that the server configuration is built for the distribution using the LiveProfile attached to the ModelLibrary at build time. It is more than likely that  in most distributions that this would ultimately be separately configured.

There is more extensive documentation on Theon, TheonCoupler, TheonToolkit and TheonUI in the The Theon Book.

Adopting Theon – Part 3.3

The first part of this series discussed why you might use Theon as a Schema Management Tool for your PostgreSQL database. The second part part covered the necessary packages and installation. This final part is an introductory tutorial.

In the first part of this tutorial we created a database and started to manage it in Theon.  In the second part we configured TheonCoupler to load data automatically into the database. In this third part we will configure TheonUI to allow access to the database and make a custom desktop. In the fourth and final part we will package and distribute the result.

Clear out existing derived content and generate the TheonUI metadata content and configuration file.

rm -fr derived
ttkm derive ui

Start up a server instance using the just generated configuration file and content.

ttkm server ui using config=derived/ui/default/ui.conf

On a browser running on the local machine go to localhost:8000/ui.html to access our instance of TheonUI. Once there we can access  the data in the package or rpmdata tables as follows.

From the Choose Desktop menu select Browser.
From the Choose Cluster menu select All Clusters.
From the Choose Table menu select package.
Then the content will be accessible.
Choose Use Grid option to see the content in a tabular form.

We can construct a desktop to show content from multiple related tables simultaneously. However at the moment our database is too simplistic for this. So first we will add another table which is a child of package and create a stream to load it with data. All aspects of this have been covered in part one and part two of this tutorial so much of the detail will be glossed over here.

Create a new table related to package and a corresponding data source table, import the new structure into Theon, then undo our changes and upgrade from Theon.

psql packages <<EOF
CREATE TABLE packdep ( name TEXT, arch TEXT, requires TEXT, version TEXT );
CREATE TABLE rpmdeps AS SELECT * FROM packdep;
ALTER TABLE package ADD PRIMARY KEY (name,architecture);
ALTER TABLE packdep ADD FOREIGN KEY (name,arch) REFERENCES package (name,architecture);
EOF
ttkm import
psql packages <<EOF
DROP TABLE packdep;
DROP TABLE rpmdeps;
ALTER TABLE package DROP CONSTRAINT package_pkey;
EOF
ttkm export
ttkm upgrade

Now modify the existing rpm Stream to add a new Couple with a custom source table and feed of data. Start the TheonUI first.

ttkm self server ui

Now enter the necessary data using a locally running browser pointed at localhost:8000/ui.html.

Open the Streaming desktop.
On the first panel (Choose Model) ensure the right model is shown (in this case packages should be the only one) and then open the Stream panel and add one new row into the Couples box:
 Position SourceEntity TargetEntity Name[Info]
 2 rpmdeps packdep rpmpackagedeps
Open the Couple panel, make sure the rpmpackagedeps couple is shown and enter and save the following for the couple (keep quotes):
 Source Origin Value = 'RPM'
 Reload = Pipe
 Then add two new rows in the Bonds box:
 Position Type
 1 Handle
 2 Column

Open the Bond panel and add rows in the Pairs box for each bond type as below:
 For Handle:
 Position SourceColumn TargetColumn
 1 name name
 2 arch arch
 3 requires requires
 For Column:
 Position SourceColumn TargetColumn
 1 version version

For the couple we have defined the source table (rpmdeps) and target table (packdep). We have also defined the handle pair (name:name + architecture:architecture + requires:requires) to tie records together between source and target.

Now exit the server process with Control-C.

Export the modified schema and upgrade the live database structures.

ttkm export
ttkm upgrade

Update the stream configuration.

ttkm derive coupler

Run the stream, load the data and process the new couple to load the target table content. Then check that content.

rpm -qa --qf '["%{NAME}","%{ARCH}","%{REQUIRENAME}","%{REQUIREVERSION}"\n]'| ttkm stream agent rpm refresh to packdep couple packdep
psql packages -c "SELECT arch, COUNT(*) AS total FROM packdep GROUP BY arch;"
  arch  | total 
--------+--------
 (none) |      1
 i486   |     99
 noarch |  17006
 i686   |   2631
 x86_64 |  63068
(5 rows)

Now we have some related tables and data we can construct a desktop in TheonUI that in one panel allows package selection and shows the related package dependencies. Start up TheonUI for the Theon management database.

ttkm self server ui

Now enter the necessary data using a locally running browser pointed at localhost:8000/ui.html.

Open the Reflection desktop.
On the first panel (Choose Model) ensure the right model is shown (in this case packages should be the only one).
Add a new row in the Desktops box:
 Desktop-Label = packages
 Name[Info] = Packages
Open the Panel panel and add one new row:
 Panel-Label = package
 Name[Info] = Package
 ? Base Entity = package
 Selector = Scroll
and save the changes. Then add one new row in the Connectors box:
 Position ?JoiningRelationship Type Label
 1 packdep_name_fkey ForwardNormal Requirements
and save the changes. Then add one new row in the Presenters box:
 Presenter-Label Style Height
 Requirements Grid 16
and save the changes. Then add the following rows in the Fields box:
 Position ?Connector ?Attribute Presenter Width Name[Info]
 1 -NONE- name - 80 Name
 1 -NONE- architecture - 20 Architecture
 1 -NONE- version - 20 Version
 1 -NONE- release - 20 Release
 5 Requirements requires Requirements 80 Name
 6 Requirements version Requirements 20 Version
and save the changes.
Open the Desktop panel and add one new row into the Panels box:
 Position ?Panel Link-Label
 1 package Package

Now exit the server process with Control-C.

Export the modified schema and derive the new UI metadata.

ttkm export
rm -fr derived
ttkm derive ui

Before starting the server tweak the configuration manually so that the initial data fetch size is limited. Then start up a server instance using the just generated configuration file and content.

Add at the top of the "config = {" section in derived/ui/default/ui.conf:
  "limit_rows" : 250,
  "limit_rows_conditionally" : True,

ttkm server ui using config=derived/ui/default/ui.conf

On a browser running on the local machine go to localhost:8000/ui.html to access our instance of TheonUI.

From the Choose Desktop menu select Packages.

This is a very simple demonstration desktop. You can use it to lookup a package by name at the top for example and each pre-requisite package and version will be shown in the lower box. Or you can query for a pre-requisite package name and all the packages dependent on it will be shown.

Adopting Theon – Part 3.2

The first part of this series discussed why you might use Theon as a Schema Management Tool for your PostgreSQL database. The second part part covered the necessary packages and installation. This final part is an introductory tutorial.

In the first part of this tutorial we created a database and started to manage it in Theon.  In this second part we will use TheonCoupler to load data automatically into the database. In the third part we will use TheonUI and create a desktop for the database.  In the fourth and final part we will package and distribute the result.

External data can be brought into (or accessed from) PostgreSQL in many ways. Here we will transfer it in at regular intervals from an external source. The command below will act as our live source of data.

rpm -qa --qf '"%{NAME}","%{VERSION}","%{RELEASE}","%{ARCH}","%{BUILDTIME}"\n'

This returns information on all installed packages in a CSV format. We need to create a table in our database to hold this data. In this example it will be the same as our package table, we can’t use that table directly since it is the sync target table – in most cases the external source data is not a direct alignment for an existing table like here. Though in this case, since it is, we can create it by cheating and using our existing table as a template.

psql packages -c "CREATE TABLE rpmdata AS SELECT * FROM package;"

Since this change was made on the live database we need to import it back into Theon.

ttkm import

Next we need to configure the Stream and Couple that will process the data. This needs to be done via TheonUI. So first create an instance of the server for the Theon management database. The default configuration will suffice.

ttkm self server ui

Then using a locally running browser point it at localhost:8000/ui.html to bring up the user interface. Follow the steps below to add the Stream.

Open the Streaming desktop.
On the first panel (Choose Model) ensure the right model is shown (in this case packages should be the only one) and then:
Add a new row in the Streams box:
 Stream-Label = rpm
 Origin = RPM
 ? Feed Entity = rpmdata
 Source = Pipe
 Format = CSV
 Name[Info] = rpm

Above creates a stream agent called rpm which takes data from a standard unix pipe in CSV format and maintains the rpmdata table as a copy of that content.

Now define a Couple to sync data from the source to our target table. We only need one in this instance.

Open the Stream panel and ensure rpm is shown
and then add one new row into the Couples box:
 Position SourceEntity TargetEntity Name[Info]
 1 rpmdata package rpmpackage

Open the Couple panel and enter and save the following for the couple (keep quotes):
 Source Origin Value = 'RPM'
 Then add two new rows in the Bonds box:
 Position Type
 1 Handle
 2 Column

Open the Bond panel and add rows in the Pairs box for each bond type as below:
 For Handle:
 Position SourceColumn TargetColumn
 1 name name
 2 architecture architecture
 For Column:
 Position SourceColumn TargetColumn
 1 version version
 2 release release
 3 buildtime buildtime

For the couple we have defined the source table (rpmdata) and target table (package). We have also defined the unique handle pair (name:name + architecture:architecture) to tie records together between source and target. Finally we have defined the data column pairs, the values of which will be maintained by the stream process.

Now exit the server process with Control-C.

Export the modified schema and review the new content.

ttkm export
[timc@babel packages]$ ls -R schemat
schemat:
live  packages.xsd  process  relation  stream

schemat/live:
packages.xsd

schemat/process:
_th_coupler_package_check.xsd  _th_rpm_package_couple.xsd

schemat/relation:
archcount.xsd  package.xsd  rpmdata.xsd

schemat/stream:
rpm.xsd

The rpmdata table has been added. The stream configuration has been added. Some stream management processes have been created. There will also be some changes to the target table (package) structure (metadata columns and triggers that have been added to support the content being managed by TheonCoupler).

Now we need to upgrade the live database structures. First remove the temporarily created rpmdata table (see part one for why).

psql packages -c "DROP TABLE rpmdata;"
ttkm upgrade

Create the stream configuration for the toolkit and have a look at it.

ttkm derive coupler
cat derived/coupler/rpm.cpl

Now we can run the stream, load the data and process all associated couples (just the one here) to load the database content. Then check that content using the view.

rpm -qa --qf '"%{NAME}","%{VERSION}","%{RELEASE}","%{ARCH}","%{BUILDTIME}"\n'| ttkm stream agent rpm refresh
psql packages -c "SELECT * FROM archcount;"
  arch  | total 
--------+-------
 (none) |     7
 i686   |   111
 x86_64 |  2614
 i486   |     1
 noarch |  1750
(5 rows)

Rows in package will now be added, deleted and updated automatically based on what is in the source table (rpmdata) and ultimately the content of the live data fed in from the rpm command every time the above is run.

Next creating configuration for TheonUI to support easier user access to the database content.

Adopting Theon – Part 3.1

The first part of this series discussed why you might use Theon as a Schema Management Tool for your PostgreSQL database. The second part part covered the necessary packages and installation. This final part will be an introductory tutorial.

In the first part of this tutorial we will create a database, manage it in Theon, make some changes, and use the factory to template some schema.  In the second part we will use TheonCoupler to load data automatically into the database. In the third part we will use TheonUI and create a desktop for the database. In the fourth and final part we will package and distribute the result.

Before doing anything note that we need a running PostgreSQL installation we have access on to create databases with. In this tutorial we use on one localhost with /disk/scratch as the socket connection directory.

First we will create a simple database with one table. This will be used to hold installed packages.

createdb packages
psql packages <<EOF
CREATE TABLE package (
  name TEXT,
  version TEXT,
  release TEXT,
  architecture TEXT
);
EOF

Create the Theon management database.

ttkm create

Now create a directory to hold our ModelLibrary.

cd /tmp
ttkm packages launch at .
cd packages

Import the database we have just created as the schema. In order to do this we need to create a default LiveProfile with the necessary connection details.

ttkm attach using database=packages host=/disk/scratch
ttkm import

Next test what we imported by exporting the schema.

ttkm export
cat schemat/relation/package.xsd

The above is the XML representation of the table we created. We will properly check now by dropping the original database then, by deriving the necessary SQL, recreate it.

dropdb packages
ttkm install
psql packages -c "\d"
         List of relations
 Schema |   Name    | Type  | Owner 
--------+-----------+-------+-------
 public | _th_theon | table | timc
 public | package   | table | timc
(2 rows)

Note the additional metadata table Theon adds. This is not mandatory(and can be removed) but it is useful for subsequent upgrades as we will see shortly.

Next we will define  a view using the factory. To do this we first create the necessary hierarchy and add a template definition.

mkdir -p factory/templates
cat >factory/templates/archcount.fat   <<EOF
%%include "relation/view.grg"
%%equate relation_ViewName "archcount"
%%equate relation_ViewValueBlock "SQL"
%%block SQL
  SELECT
    architecture AS arch,
    COUNT(*) AS total
  FROM package
  GROUP BY
    arch
%%end
EOF

Then we process the factory templates and load them into the Theon management database.

ttkm gather

Next export the changed schema and check.

ttkm export
cat schemat/relation/archcount.xsd

Now we need to upgrade the installed database to add the view. This is automatic as we have the metadata table, manual machinations would be required otherwise.

ttkm upgrade
psql packages -c "\d"
 List of relations
 Schema | Name | Type | Owner 
--------+-----------+-------+-------
 public | _th_theon | table | timc
 public | archcount | view  | timc
 public | package | table | timc
(2 rows)

The view has been successfully added.

Finally we want to add an extra column in the package table for the build time. There are various ways of doing this, here we will alter the live database and re-import the change so the schema in Theon is kept up-to-date. However, in order to keep consistent metadata, we will drop the column and then re-create the column by upgrade. This is necessary in this case as we are working with a development database which is also the live database.

psql packages -c "ALTER TABLE package ADD COLUMN buildtime INTEGER;"
ttkm import
ttkm export
psql packages -c "ALTER TABLE package DROP COLUMN buildtime;"
ttkm upgrade
psql packages -c "\d package"
       Table "public.package"
    Column    |  Type   | Modifiers 
--------------+---------+-----------
 name         | text    | 
 version      | text    | 
 release      | text    | 
 architecture | text    | 
 buildtime    | integer |

The column has been successfully added back by upgrade.

Now it is time to get some data into the database.

Adopting Theon – Part 2

The first part of this series discussed why you might use Theon as a Schema Management Tool for your PostgreSQL database. This part covers the necessary packages and installation. The final part will be an introductory tutorial.

Theon is available from here directly from the relevant Git repositories. Pre-built packages (RPM’s) are also available from here for Scientific Linux 7, but would probably work as-is on comparable distributions, for example RedHat Enterprise Linux, Centos and Fedora.

Theon contains three components. The base component contains the framework the schema for Theon itself, standard transforms, the factory core templates and all the documentation. The Theon Toolkit component contains the command line interface. The Theon UI contains the generic web browser graphical database interface. All of these are needed to use Theon. There are also some development packages, but these are only necessary to make changes to Theon itself.

Theon requires a couple of custom support tools. Gurgle, which is used as the factory template engine. Schematic, which is used to process the XML-Schema files and apply transforms. Both of these are also available from here, either direct from the Git repositories or as pre-built packages.

There are also a number of pre-requisite third party packages required, such as PostgreSQL, Python and various modules. Theon UI uses YUI2 – the necessary software for this has been re-packaged and is also available directly from here.

If you use Yum everything can be obtained from the Theon Yum Repository. In order to use this first create a suitable Yum repository configuration file as below.

cat >/etc/yum.repos.d/theon.repo <<EOF
[theon]
name=Theon Repository
baseurl=http://groups.inf.ed.ac.uk/theon/repo/rpms/os/el7/x86_64/
gpgcheck=0
enabled=1
EOF

Then install the core packages.

yum install gurgle-postgresql schematic yuidoc yuicompressor yui2 theon theon-library theon-library-book theon-library-ui theon-tk theon-ui

Yum will install other pre-requisite software automatically, assuming it is available for your platform.

If you want to use Theon as a PostgreSQL extension module then you will also need the package below, built for version 9.2.23.

yum install theon-library-extension_9.2.23

The following packages are only necessary for doing Theon development.

yum install theon-library-devel theon-ui-devel

Alternatively you can clone the Git repositories and build from source for your target platform. You will need to first install all pre-requisite software (see the “.spec” file). The builds are done simply as part of the package build and so you will also need rpm/rpmbuild available on your platform, or you will need to extract and script the content of the package “.spec” file.

Gurgle
cd /tmp
git clone git://afsgit.inf.ed.ac.uk/timc/gurgle
cd gurgle
./makerpm
rpmbuild -ba ${RPMROOT}/SPECS/gurgle.spec
rpm -U ${RPMROOT}/RPMS/x86_64/gurgle-postgresql-*.rpm
Schematic
cd /tmp
git clone git://afsgit.inf.ed.ac.uk/timc/schematic
cd schematic
./makerpm
rpmbuild -ba ${RPMROOT}/SPECS/schematic.spec
rpm -U ${RPMROOT}/RPMS/noarch/schematic-*.rpm
Theon Toolkit
cd /tmp
git clone git://afsgit.inf.ed.ac.uk/timc/theon-tk
cd theon-tk
./makerpm
rpm -i ${RPMROOT}/SRPMS/x86_64/theon-tk-1.1-1.src.rpm
rpmbuild -ba ${RPMROOT}/SPECS/theon-tk.spec
rpm -U ${RPMROOT}/RPMS/noarch/theon-tk-*.rpm
 Theon
cd /tmp
git clone git://afsgit.inf.ed.ac.uk/timc/theon
cd theon
./makerpm
rpm -i ${RPMROOT}/SRPMS/x86_64/theon-1.1-1.src.rpm
rpmbuild -ba ${RPMROOT}/SPECS/theon.spec
rpm -U ${RPMROOT}/RPMS/noarch/theon-*.rpm
Theon UI
cd /tmp
git clone git://afsgit.inf.ed.ac.uk/timc/theon-ui
cd theon-ui
./makerpm
rpm -i ${RPMROOT}/SRPMS/x86_64/theon-ui-1.1-1.src.rpm
rpmbuild -ba ${RPMROOT}/SPECS/theon-ui.spec
rpm -U ${RPMROOT}/RPMS/noarch/theon-ui-*.rpm

Theon will be installed at the following paths by default.

/usr/share/theon
/etc/theon
/usr/lib/theon
/usr/lib/python2.7/site-packages/theontk
/usr/lib/python2.7/site-packages/theonui
/usr/bin/theontk*
/usr/bin/ttk*

You should now be ready to go.

Adopting Theon – Part 1

Theon is a Schema Version Management system for PostgreSQL databases. So the first question to ask yourself is why you would want to bother with such a thing at all.

Why use Theon?

  • your data model is in constant flux, needs to be adapted frequently and responsively to meet new requirements
  • your data model is large and complex
  • you need to manage different instances of your data model for production and development servers, for handling continuous release/testing cycles, for deployment at multiple organisations
  • many different people are concurrently working on making changes to your data model
  • your data model needs to be kept in alignment with other outside systems, you need it to be automatically maintained based on the requirements of those systems
  • you want to develop your data model at a higher abstracted layer than physical database tables, columns and relationships
  • you or others need to maintain/upgrade external applications so that they remain in alignment with specific versions of your data model
  • others maintain independent systems/applications using your data model and need a robust/automatic upgrade path

If you want/need to do any/all the above then Theon  may help you out. If not then adopting a Schema Version Management system might still be the right thing to do – after all it is considered sensible to version control source code so why not databases, see Database Refactoring) – but you will probably get less immediate benefit.

There are other Schema Version Management systems that support PostgreSQL: Liquibase, Flyway, sqlHawk, MIGRATEdb, schema-evolution-manager, Alembic, Sqitch, DBSteward. All these vary in functionality, complexity and price. Follow the links and make your evaluation of the pros and cons of each depending on your own specific needs. Architecturally Theon is probably closest to DBSteward.

Why choose Theon?

  • open source under GPLv3
  • unix command line scriptable tool chain
  • extensible
  • change sets are just SQL DDL
  • built in graphical schema editor
  • make changes by direct DDL manipulation on development database instances
  • import and go

Why not?

  • immature
  • evolving
  • only supports a subset of the PostrgreSQL feature set

The second part of this article examines what you need to get started using Theon. The final part will demonstrate managing a database in Theon.