In November 2014 the remaining end-user vestiges of the legacy school database service, originating in 1996, were migrated into Theon. The areas and workflows transitioned at this time were:
- Staff Records
- Visitor Records
- Post Application Visit Day handling for UG students
- Taught PG student admissions handling
The School InfHR team moved across to Theon in November to manage the Staff and Visitor records. The ITO began using the PAVD and PGT handling processes in earnest this week, although some remedial work remains.
The PAVD/PGT move was the final nail in the coffin for “genrep”, the old terminal-based menu reporting system used by the ITO; it and its host server will shortly be completely decommissioned.
The Staff/Visitor move was a significant shift for InfHR, as the primary source of data for both record types moved to upstream IS services rather than being locally re-keyed. For Staff records the central IS HR/Oracle system is now the golden copy; for Visitor records it is the central IS VRS system. Due to limitations in the HR/Oracle dataset we supplement this data with a feed from the central IS IDMS service. This feed is also used to correct data in the VRS dataset, which has some IS-acknowledged issues. This transition aligns InfHR and our Staff/Visitor records with the ITO/IGS, which have been using central golden-copy data sources (EUCLID/EUGEX) since the introduction of Theon in late 2010.
With all the legacy data now migrated (or replaced) in Theon, we have begun purging and archiving the data from the old database system, continuing to comply with University retention periods and data protection regulations, and decommissioning it.
In order to maintain the integrity of our systems we need to reboot all DICE desktops over the next few days. We understand that reboots can be very inconvenient, so you can be assured that we will only ever schedule reboots which we consider essential.
A delayed reboot has been scheduled for all DICE desktops. Student lab machines will be rebooted overnight. For office machines the delay will be 5 days. Although the reboots are delayed, it would be greatly appreciated if people could manually reboot their machines at their earliest convenience; the delayed reboot would then be cancelled.
If you have any queries they should be submitted via the Support Form.
As described in my November posting, we’re making some changes to our OpenVPN configuration, partly to support new devices, partly to add capacity, and partly because one of the features we currently use is being phased out.
The server-end changes are now all in place, as are the new configuration files needed on the client end to use them. Please fetch these by following the links from the computing.help site. The new files should replace any existing configuration files you have on your machine. You may then have to stop and restart OpenVPN to have these new versions picked up.
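For Linux users, replacing the files can be sketched as below; the filename Informatics.ovpn and its contents are illustrative stand-ins only, since the real names come from the computing.help download links.

```shell
# Illustrative sketch: back up the old client config and install the
# new one so OpenVPN picks it up on restart. Filenames and contents
# here are examples, not the actual downloads.
mkdir -p openvpn-demo && cd openvpn-demo
printf '# old config\n' > Informatics.ovpn        # stand-in for the existing file
printf '# new config\n' > Informatics.ovpn.new    # stand-in for the downloaded file
cp Informatics.ovpn Informatics.ovpn.bak          # keep a backup, just in case
mv Informatics.ovpn.new Informatics.ovpn          # replace with the new version
# then stop and restart your OpenVPN client so it rereads the file,
# e.g. (varies by distribution and init system):
#   sudo systemctl restart openvpn@Informatics
```

Keeping the backup copy means you can easily revert if the new configuration misbehaves before the old endpoints are retired.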
There are platform-specific versions for Windows, Mac, iOS and Android, and generic versions which should work for Linux and *BSD. Please fetch and install platform-specific versions if possible, as these may contain additional settings to give smoother operation in some cases.
The current plan is to phase out the old endpoints and configurations around the end of March (specifically March 31st), when we’ll recover the IP address space for reuse elsewhere. Old client configurations won’t work with the new endpoints without some adjustment. Look out for further announcements and reminders nearer the time.
Please provide feedback through the support form in the usual way.
When the Forum meeting rooms were set up, we configured the wired network in them to use the “office self-managed DHCP” subnet. This seemed appropriate at the time, as most users were expected to be Forum occupants. The ground-floor meeting areas, in contrast, were set up to use a separate “conferences” subnet, outside the Informatics perimeter filters for better isolation of any problems.
We now propose to reconfigure the meeting rooms to match the ground floor, as the rooms are now more used by non-Informatics people. It will also give a more consistent experience, and may make the auditors slightly less unhappy. The Appleton Tower meeting rooms were changed like this some time ago.
Wireless users will be completely unaffected by this. The only visible effect will be that wired users in the meeting rooms will have to use OpenVPN or one of the login servers to access non-public Informatics resources (so matching what wireless users already do).
It’s planned to make the change first thing on Thursday of next week (22nd), before any of the scheduled meetings start.
On the evening of 6th January, one of the disks in the IBM storage array used to provide storage for KVM guests at the Forum went faulty. As with other storage arrays, this should not have been a major issue: one of the hot spares should have been brought online and the array rebuilt using that drive. Unfortunately, the failing drive locked up the whole array, such that it stopped responding on both the SAN (block requests) and the LAN (management interface). Fortunately we have the ability to power-cycle most server equipment from home, so the array was duly power-cycled and brought back into service, with one of the hot spares kicking into life and the array being rebuilt. A small number of KVM guest servers didn't automatically recover from this, so they were rebooted first thing on 7th January.
IBM asked for some logs to be sent to them for analysis, but apparently these did not reveal the reason for the array lockup.
As many of you are aware, the standard versions of various developer tools provided in Scientific Linux 6 (e.g. gcc) have now become quite old. To gain access to newer versions you can now have various software collections added to your system. These are extra packages provided by Red Hat; details are available on the computing help site.
As part of the work necessary to provide access to more of these software collections we have upgraded the devtoolset collection from version 2 to version 3. If you are currently using this collection to get a newer version of gcc you will need to change your scripts after your system applies updates overnight (Wednesday 7th to Thursday 8th January). The devtoolset name has changed, so it is now activated like:
scl enable devtoolset-3 bash
(note the change from devtoolset-2 to devtoolset-3). Apologies for forcing this incompatible change at short notice; we hope that the changes we've made will allow us to avoid this pain in the future. The benefit of this change is that gcc is upgraded to version 4.9.1.
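If you have personal scripts that hard-code the old collection name, a one-line substitution is enough to update them; build.sh below is an illustrative stand-in for such a script.

```shell
# Illustrative: a script that still enables the old collection name...
printf 'scl enable devtoolset-2 bash\n' > build.sh
# ...is updated in place to use the new name:
sed -i 's/devtoolset-2/devtoolset-3/g' build.sh
cat build.sh    # scl enable devtoolset-3 bash
```

The same substitution works on crontabs or Makefiles that invoke scl directly.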
As usual, if you have any queries about this please contact the Computing Team via the support form.
We have now resolved the problems with the new hardware for the staff NX server, so we can reschedule the planned upgrade.
On Tuesday 6th January we plan to replace the staff NX server named central, which hosts staff.nx.inf.ed.ac.uk, with a machine named northern.
All that will happen is that at about 09:00 on Tuesday we will change the DNS aliases to point to the new machine. This change can take some time to propagate so we will not switch off access to central immediately. It will be left running as normal until 12:00 Friday 9th January. This should allow sufficient time for users logged in to finish their existing sessions and move to the new server.
The IP address for the service will change from 22.214.171.124; your NX client may warn you about this change and request verification. For reference, the new RSA host key fingerprint is:
More information regarding the NX service can be found on our help pages. If you encounter any problems accessing the NX service please contact us via the Support Form.
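If you want to check a host key fingerprint yourself, ssh-keygen can print one in the same format your NX/SSH client displays. The example below generates a throwaway key purely to show the commands and output format; it is not northern's real key.

```shell
# Illustrative only: the throwaway RSA key generated here stands in
# for the new server's host key; the fingerprint format matches what
# an NX/SSH client will show you on first connection.
rm -f demo_key demo_key.pub
ssh-keygen -t rsa -b 2048 -N '' -f demo_key -q
ssh-keygen -lf demo_key.pub
# against the live server you would instead fetch the key remotely,
# e.g.:  ssh-keygen -lf <(ssh-keyscan -t rsa staff.nx.inf.ed.ac.uk)
```

Comparing the fingerprint you see against the one published above is the safest way to confirm you are talking to the genuine new server.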
We recently discovered a problem with the cron daemon (cronie) supplied with SL6 and SL7, which meant that users with home directories stored in AFS were prevented from using the service on DICE machines. We have now patched the code and are satisfied that normal service has resumed. The bug was introduced as part of the SL6.5 upgrade which occurred during September and October. It wasn't spotted quickly because, normally, when a cron job fails the user gets an error report via email; due to the nature of this bug, all user cron jobs were silently failing.
Apologies for any inconvenience caused; we have now put better monitoring of this service in place, so we should catch any future problems much more quickly.
For those interested, this problem was introduced when a security hole was fixed. The change in question is recorded in the Red Hat Bugzilla as bug #697485. The change was to drop privileges (i.e. go from root to the user who owns the crontab) before reading the crontab, which is clearly a sensible thing to do. A piece of code used elsewhere in cronie was reused to drop privileges; rather annoyingly, it has an unnecessary secondary function: it insists on being able to change into the user's home directory. With AFS the home directory is usually inaccessible (even to the user who owns the directory) as there are no Kerberos tickets or AFS tokens available at this stage in the session. There are later checks on the ability to access the home directory which can be worked around by setting the HOME environment variable to a directory in the local filesystem, but that doesn't help in this case since the failure occurs before the crontab has been parsed.
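The failure mode can be sketched in shell terms. This is not cronie's actual code; the path below is a stand-in for an AFS home directory that is unreachable without tokens.

```shell
# Sketch of the bug: after dropping privileges, the reused helper also
# tried to change into the user's home directory. With no AFS tokens
# at that point the chdir fails, and the crontab is skipped with no
# error email. "/afs/nonexistent/home" stands in for a token-protected
# AFS home directory.
HOMEDIR=/afs/nonexistent/home
if cd "$HOMEDIR" 2>/dev/null; then
    echo "crontab read and jobs scheduled"
else
    echo "crontab silently skipped"    # what every AFS user was hitting
fi
```

The fix was simply to stop the privilege-drop helper from requiring the chdir, which is unnecessary at that point in the session.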
Earlier this year we conducted a survey of VM/cloud usage. The project has now concluded and a formal report on the project can be found on the Rat Unit wiki. This includes a link to the survey report itself. Copies of all the documentation relating to the project can be found in AFS, in the /afs/inf.ed.ac.uk/group/rat-unit/projects/vm_cloud_survey directory.
It’s now over ten years since we first set up our OpenVPN service, and things have moved on quite a bit since then. So far we have managed to maintain compatibility for existing users of the service, but we would now like to make a couple of enhancements which unfortunately do require incompatible changes:
- We would like to offer a service to users of Android and iOS mobile devices. However, the way we set up the service access keys (back at the beginning, when that was the only way to do it) is not compatible with the way these devices now require things to be done.
- Due to the ever-growing popularity of the service we need to expand the IP address space used so as to avoid unexpected glitches for users.
As this is necessarily an incompatible change, we’ll arrange things as follows:
- We’ll set up new endpoints to provide the new service for testing. (This has actually already been done.)
- We’ll create suitable new configuration files for beta-testers.
- Once we’re happy that things are running as expected, we’ll tidy up, document and advertise the new configurations.
- Some time later (probably around Easter next year) we’ll close down the old-style service.
We’ll also take the chance to remove some now-deprecated options from the configurations, and we’ll add some platform-specific enhancements where these appear to be generally useful.
We do have to turn off the old service in due course, rather than just leaving it running, as this will allow us to recycle the IP address range it uses. Globally-routed IPv4 addresses are now a scarce resource, and we simply can’t justify keeping these for what will be an ever-decreasing number of users.
Look out for announcements regarding the introduction of the new service, specific mobile devices, and in particular the schedule for the retiral of the old service.
Meantime, if anyone would like to beta-test the new service, please get in touch through the support form in the usual way.