All posts by timc

Python Command Scripts

You would think trying to install and package a Python package and module with a command script would be easy. Turns out this is a twisty minefield on top of shifting quicksand just itching to catch the unwary. Found this article though which was most helpful.

Adopting Theon – Part 1

Theon is a Schema Version Management system for PostgreSQL databases. So the first question to ask yourself is why you would want to bother with such a thing at all.

Why use Theon?

  • your data model is in constant flux, needs to be adapted frequently and responsively to meet new requirements
  • your data model is large and complex
  • you need to manage different instances of your data model for production and development servers, for handling continuous release/testing cycles, for deployment at multiple organisations
  • many different people are concurrently working on making changes to your data model
  • your data model needs to be kept in alignment with other outside systems, you need it to be automatically maintained based on the requirements of those systems
  • you want to develop your data model at a higher abstracted layer than physical database tables, columns and relationships
  • you or others need to maintain/upgrade external applications so that they remain in alignment with specific versions of your data model
  • others maintain independent systems/applications using your data model and need a robust/automatic upgrade path

If you want/need to do any/all the above then Theon  may help you out. If not then adopting a Schema Version Management system might still be the right thing to do – after all it is considered sensible to version control source code so why not databases, see Database Refactoring) – but you will probably get less immediate benefit.

There are other Schema Version Management systems that support PostgreSQL: Liquibase, Flyway, sqlHawk, MIGRATEdb, schema-evolution-manager, Alembic, Sqitch, DBSteward. All these vary in functionality, complexity and price. Follow the links and make your evaluation of the pros and cons of each depending on your own specific needs. Architecturally Theon is probably closest to DBSteward.

Why choose Theon?

  • open source under GPLv3
  • unix command line scriptable tool chain
  • extensible
  • change sets are just SQL DDL
  • built in graphical schema editor
  • make changes by direct DDL manipulation on development database instances
  • import and go

Why not?

  • immature
  • evolving
  • only supports a subset of the PostrgreSQL feature set

The second part of this article examines what you need to get started using Theon. The final part will demonstrate managing a database in Theon.

make comments on criteria available

Another (feature creep) request for the new system. It looks likes the mechanism is there and this will be easy to bolt on subsequently in which case it is unlikely to be done with this particular project.

Returning feedback to students wasn’t part of the  original project description, rather a requirement that has arisen in the meantime. But the fact that the reports are now stored is surely a  big step in providing that feedback. When we agreed at Teaching Committee that information from the reports  should be returned to students as feedback, I pointed out that it would probably not be possible to do it in June 2015, but I hope we will have something in place for June 2016.

Achieving this with the new system is probably just a case of Webmark also producing a
reduced version of the form just including the relevant fields students can see and then a specific index page for access by students that shows them that form.

Comment: That would be perfect! The “relevant fields” are:

  • Individual marking form: Comments on the criteria
  • Agreed mark form: nothing
  • Moderator’s form, if any: Comments on the criteria

I guess that an ideal “specific index page for access by students” would  be to include it on but with a time delay so that  it is only accessible once the examiners’ meetings have passed.

Modeling Part Completed

The documentation for TheonModel (managing the physical database schema and related configuration for TheonCoupler, TheonUI and TheonPortal) has been completed and including the illustrations. It is in principle a first draft, but its been through a number of revisions so its a fairly solid first draft. The command toolchain is clearly defined now so we can start implementing that – all the underlying technology has been done already, or at least proven in principle, so its just tidying and wrapping it up properly here to do. The documentation for TheonCoupler is practically done as well now, should be completed within the next few days. Still got architectural documentation to do on TheonUI and TheonPortal to do (bits and pieces on usage and configuration are there already under TheonModel). Need to write the release/package management documentation as well (although the toolchain for that is at least already defined and coded up). Finally the system installation/configuration documentation needs to be done – although that is a little bit in limbo to see whether we can get a viable standalone LCFG config up and running. The remaining documentation will proceed in parallel with getting the TheonModel toolchain implemented as that is really the primary goal of this project.

Started back on Documentation

So after a long break (I think May was the last time this was touched) got back into doing the documentation. It probably took a day just to orientate to where we were. Anyhow the main purpose of doing the documentation before the technical work is complete is to pin down the workflow and hence what the commands and options will need to be provided by the wrapper toolkit. This so far has been quite successful in identifying any gaps in the technical implementation (due to the attempt to reflect real use cases).

Mostly Done

The bulk of this has now been done (the first four goals).

The “Proposal for Submissions” was followed pretty much as-is. The MSc submissions link will now jump to an MSc variant of the UG4 form but using the same underlying mechanism. Both handling flags can be set and materials uploaded at submission time.

The “Proposal for Access to Submissions” was modified slightly. In the end the existing “projsubs/ug4” and “projsubs/msc” folders were retained (rather than new URLs and folders being created) with the modified indexing code. Instead of passing arguments the indexing behaviour is automatically changed based on whether the current year is the same as the year being processed. If not then the old behaviour is retained. If it is then the new behaviour is followed. This new index format for the current (assessment) year is as described, except that to achieve sorting a variation of the “dotable” cgi script is used and a data file is produced instead (mapped to the cgi script by a rule in .htaccess). Some other minor changes were made also to the existing setup (such as indexing in year descending order). The indexing scripts now generate the liveroot files directly (since they have to change the name and extension) rather than via stdout. They also generate a data file into the upload directory which maps the matriculation number to the submitted folder number – this is used by Webmark (see below).

The “Proposal for Webmark Returns” was also modified slightly. The end result is the same though – a PDF for each filled in form and a data file for each mark is copied into the submission upload directory. Instead of using a cc mail alias and remctl a feature was added to Webmark to allow per-output subdirectories in the final file output path, components of which could be literal or set by the value of a form field. An additional source was added which is the data file generated by the indexing scripts above. This allows a “dynamic” output directory (which is the students upload submission directory) to be set against the additional outputs. By this means Webmark can write the files for the indexing scripts directly into their final location (and safely as it is part of the user submission process, which also ensures that the students upload directory is in place before a Webmark submission can be made). A consequence of this though is that it is no longer possible to fill in a blank form for the mark returns – a student must be selected from the drop-down, in practice it would be an error condition if not all the students are listed in the drop-down.

The “Proposal for Public Access” has not been done yet – but will be another modification to the indexing scripts. In this context they will simply produce a list of projects for all years (including the year in the index) with just the student name and project title (which is a link to the copied PDF as at the moment). This list will only include students with distinctions though (as previously described).

To clean everything up the older msc project submissions were moved from their original “infthesis” upload location (where they were mixed up with PhD and MScRes submissions) into the new “mscprojects” location and the indexing scripts re-run for the default three years and then individually all the way back to 2003. The indexing scripts were also manually re run for all the existing UG4 years.