Last year I wrote about a couple of network security projects which were in their early stages of development. As the last of these has recently completed, I thought it might be useful to summarise their outcomes.
We have had edge filtering in place for a long time, since we ran Solaris on Suns in fact, configured automatically from our machine configuration system (lcfg). This has proved to be very successful in practice. Our main edge routers typically reject a couple of million bogus packets per day, though this is still rather less than 0.5% of their total throughput. We mostly don’t log this in detail, as there’s just too much of it and most of it isn’t very interesting, but we do have a couple of externally-visible machines which log more extensively. These show several thousand scans per day, mostly for various Microsoft services, against individual IP addresses which have not been in use for several years.
The first of the projects I mentioned was “Scanning for Compromised Machines”. After some investigation of our own, we learned that the University would be buying in to the ESISS scanning tool. We now have this in use, regularly scanning all machines (managed and self-managed) with open firewall holes. This has proved to be reasonably successful, and has thrown up a number of cases for further investigation. Where these are with self-managed machines, we follow up with the machine’s manager to have any vulnerabilities closed down.
The other project was a pilot Intrusion Detection System. This was a useful exercise, and the experience gained will certainly be helpful if we do later implement this as a full service, though overall the result was rather less useful than the “Scanning” project for reasons which are listed in more detail in the report. In summary, though, the reports it produces are rather noisy due to our heterogeneous environment, and the rules we use are a couple of weeks or so behind the leading edge so we tend to hear about (and patch!) vulnerabilities through other routes before they start to show in the reports. We’ll leave the pilot system running, so long as it doesn’t interfere with the proper functioning of our network, but there would still be quite a bit of work required to bring it up to production standard, and that effort just isn’t available at the moment as a result of the SL7 upgrades and the Appleton Tower decant.
New Virtual DICE VM images are available for download. A lot of software has been updated, but the big news is that the VMs now include Java. A careful review of the licence conditions suggested that we were after all allowed to do this!
Virtual DICE is the School’s DICE Linux, but running in a virtual machine which you can control. It uses VirtualBox so can run on any supported machine (with enough disk space and memory). To find out more read the Virtual DICE help pages.
The Java software is now also available for the previous Virtual DICE release (hostnames knibbergen and knijff). However the software update may well fail due to lack of disk space on the VM. If this happens, just install the latest Virtual DICE. (The new Virtual DICE VMs have more disk space, and have Java already installed.)
As ever, please contact computing support with any problems you encounter. Thanks.
In November 2014 the remaining end user vestiges of the legacy school database service originating in 1996 were migrated into Theon. The areas and workflows that were transitioned at this time were:
- Staff Records
- Visitor Records
- Post Application Visit Day handling for UG students
- Taught PG student admissions handling
The School InfHR team were moved across onto Theon in November to manage the Staff and Visitor records. The PAVD and PGT handling processes have started being used in earnest this week by the ITO, although there is some remedial work remaining.
The PAVD/PGT move was the final nail in the coffin for the old terminal based menu reporting system used by the ITO called “genrep” and this and its host server will shortly be completely decommissioned.
The Staff/Visitor move was a significant shift for InfHR as it saw the primary source of data for both being shifted to upstream IS services, rather than being locally re-keyed. In the case of Staff records the central IS HR/Oracle system is now the golden copy and in the case of Visitor records the central IS VRS system is the golden copy. Due to limitations in the HR/Oracle dataset we supplement this data with a feed from the central IS IDMS service. This is also used to correct data in the VRS dataset which has some IS acknowledged issues. This transition aligns InfHR and our Staff/Visitor records with the ITO/IGS which have been using central golden copy data sources (EUCLID/EUGEX) since the introduction of Theon in late 2010.
With all the legacy data now migrated (or replaced) in Theon the process of purging and archiving the data from the old database system (continuing to comply with University retention periods and data protection regulations) and decommissioning has begun.
In order to maintain the integrity of our systems we need to reboot all DICE desktops over the next few days. We understand that reboots can be very inconvenient so you can be assured that we will only ever schedule reboots which we consider to be essential.
A delayed reboot has been scheduled for all DICE desktops. Student lab machines will be rebooted overnight. For office machines the delay will be 5 days. Although the reboots are delayed, it would be greatly appreciated if people could manually reboot their machines at their earliest convenience; the delayed reboot would then be cancelled.
If you have any queries they should be submitted via the Support Form.
As described in my November posting, we’re making some changes to our OpenVPN configuration, partly to support new devices, partly to add capacity, and partly because one of the features we currently use is being phased out.
The server-end changes are now all in place, as are the new configuration files needed on the client end to use them. Please fetch these by following the links from the computing.help site. The new files should replace any existing configuration files you have on your machine. You may then have to stop and restart OpenVPN to have these new versions picked up.
There are platform-specific versions for Windows, Mac, iOS and Android, and generic versions which should work for Linux and *BSD. Please fetch and install platform-specific versions if possible, as these may contain additional settings to give smoother operation in some cases.
The current plan is to phase out the old endpoints and configurations around the end of March (specifically March 31st), when we’ll recover the IP address space for reuse elsewhere. Old client configurations won’t work with the new endpoints without some adjustment. Look out for further announcements and reminders nearer the time.
Please provide feedback through the support form in the usual way.
When the Forum meeting rooms were set up, we configured the wired network in them to use the “office self-managed DHCP” subnet. This seemed appropriate at the time, as most users were expected to be Forum occupants. The ground-floor meeting areas, in contrast, were set up to use a separate “conferences” subnet, outside the Informatics perimeter filters for better isolation of any problems.
We now propose to reconfigure the meeting rooms to match the ground floor, as the rooms are now more used by non-Informatics people. It will also give a more consistent experience, and may make the auditors slightly less unhappy. The Appleton Tower meeting rooms were changed like this some time ago.
Wireless users will be completely unaffected by this. The only visible effect will be that wired users in the meeting rooms will have to use OpenVPN or one of the login servers to access non-public Informatics resources (so matching what wireless users already do).
It’s planned to make the change first thing on Thursday of next week (22nd), before any of the scheduled meetings start.
On the evening of 6th January, one of the disks in the IBM storage array used to provide storage for KVM guests at the Forum decided to go faulty. As with other storage arrays, this should not have been a major issue. One of the hot spares should have been brought on-line and the array rebuilt using that drive. Unfortunately, the drive failing locked up the whole array such that it stopped responding on the SAN (block requests) and LAN (management interface). Fortunately we have the ability to power-cycle most server equipment from home, so the array was duly power-cycled and brought back into service, with one of the hot spares kicking into life and the array being rebuilt. A small number of KVM guest servers didn’t automatically recover from this so were rebooted first thing on the 7th January.
IBM asked for some logs to be sent to them for analysis, but apparently this didn’t show up the reason for the array lockup.
As many of you are aware, the standard versions of various developer tools provided in Scientific Linux 6 (e.g. gcc) have now become quite old. To gain access to newer versions you can now have various software collections added to your system. These are extra packages provided by Red Hat; details are available on the computing help site.
As part of the work necessary to provide access to more of these software collections we have upgraded the devtoolset collection from version 2 to 3. If you are currently using this collection to get a newer version of gcc you must change your scripts after your system has applied updates overnight (Wednesday 7th to Thursday 8th January). The specific devtoolset name has changed so it will now be activated like:
scl enable devtoolset-3 bash
(note the change from devtoolset-2 to devtoolset-3). Apologies for forcing this incompatible change at short notice. We hope that the changes we’ve made will allow us to avoid this pain in the future. The benefit of this change is that gcc will be upgraded to version 4.9.1.
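For scripts that have to run on machines both before and after the overnight update, one option is to pick whichever devtoolset collection is installed rather than hard-coding the name. The following is only a sketch: the function name pick_devtoolset is made up for illustration, and it assumes that "scl --list" prints one installed collection name per line.

```shell
#!/bin/bash
# pick_devtoolset INSTALLED_LIST
# INSTALLED_LIST is a newline-separated list of collection names
# (assumed here to be the output of "scl --list").
# Prints the preferred devtoolset collection, or fails if none is present.
pick_devtoolset() {
    local installed="$1" name
    # Prefer the new collection; fall back to the old one on machines
    # which have not yet applied the update.
    for name in devtoolset-3 devtoolset-2; do
        if printf '%s\n' "$installed" | grep -qx "$name"; then
            printf '%s\n' "$name"
            return 0
        fi
    done
    return 1
}
```

A script could then start its shell with something like: collection=$(pick_devtoolset "$(scl --list)") && scl enable "$collection" bash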
As usual, if you have any queries about this please contact the Computing Team via the support form.
We have now resolved the problems with the new hardware for the staff NX server, so we can reschedule the planned upgrade.
On Tuesday 6th January we plan to replace the staff NX server named central, which hosts staff.nx.inf.ed.ac.uk, with a machine named northern.
All that will happen is that at about 09:00 on Tuesday we will change the DNS aliases to point to the new machine. This change can take some time to propagate so we will not switch off access to central immediately. It will be left running as normal until 12:00 Friday 9th January. This should allow sufficient time for users logged in to finish their existing sessions and move to the new server.
The IP address for the service will change from 22.214.171.124. Your NX client may warn you about this change and request verification. For reference the new RSA host key fingerprint is:
More information regarding the NX service can be found on our help pages. If you encounter any problems accessing the NX service please contact us via the Support Form.
We have recently discovered a problem with the cron daemon (cronie) supplied with SL6 and SL7, which has prevented users with home directories stored in AFS from using the service on DICE machines. We have now patched the code and are satisfied that normal service has been resumed. This bug was introduced as part of the SL6.5 upgrade which occurred during September and October. It wasn’t spotted quickly because normally when a cron job fails the user gets an error report via email, but due to the nature of the bug all user cron jobs were failing silently.
Apologies for any inconvenience caused. We have now put in place better monitoring of this service, so we should catch any future problems a lot more quickly.
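If you want your own check that your cron jobs are still running, one simple approach is a heartbeat file: a cron job touches a file on local disk at regular intervals, and something outside cron checks that the file is recent. This is only a sketch of that general idea, not a description of the monitoring we have put in place. The function name, the path /var/tmp/cron-heartbeat and the intervals are all made up for illustration, and it assumes GNU date and stat.

```shell
#!/bin/bash
# Hypothetical crontab entry which refreshes the heartbeat every 5 minutes:
#   */5 * * * * touch /var/tmp/cron-heartbeat
#
# heartbeat_is_fresh FILE MAX_AGE_SECONDS
# Succeeds only if FILE exists and was modified within MAX_AGE_SECONDS.
heartbeat_is_fresh() {
    local file="$1" max_age="$2" now mtime
    [ -e "$file" ] || return 1
    now=$(date +%s)
    # %Y is the modification time in seconds since the epoch (GNU stat).
    mtime=$(stat -c %Y "$file") || return 1
    [ $(( now - mtime )) -le "$max_age" ]
}
```

A check run from a login shell or a monitoring host might then be: heartbeat_is_fresh /var/tmp/cron-heartbeat 900 || echo "cron heartbeat is stale". Keeping the file on local disk rather than in an AFS home directory means the check does not itself depend on AFS tokens.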
For those interested, this problem was introduced when a security hole was fixed. The change in question is recorded in the Red Hat bugzilla as #697485. The change was to drop privileges (i.e. go from root to the user who owns the crontab) before reading the crontab, which is clearly a sensible thing to do. A piece of code used elsewhere in cronie was reused to drop privileges. Rather annoyingly, it has an unnecessary secondary function: it insists on being able to change into the user’s home directory. With AFS the home directory is usually inaccessible (even to the user who owns the directory) as there are no Kerberos tickets or AFS tokens available at this stage in the session. There are later checks on the ability to access the home directory, which can be worked around by setting the HOME environment variable to a directory in the local filesystem, but that doesn’t work in this case since the failure occurs before the crontab has been parsed.
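The ordering problem can be illustrated with a deliberately simplified model. This is not cronie’s code; the function load_crontab and its messages are invented purely to show why a HOME setting inside the crontab cannot help when the home directory check happens first.

```shell
#!/bin/bash
# Simplified model of the order of operations; NOT cronie's actual code.
# load_crontab HOME_DIR CRONTAB_FILE
load_crontab() {
    local home="$1" crontab="$2"

    # Step 1: the reused privilege-dropping step also tries to enter the
    # user's home directory. With no AFS tokens available this fails.
    if ! ( cd "$home" ) 2>/dev/null; then
        echo "home directory inaccessible: crontab never read"
        return 1
    fi

    # Step 2: only now is the crontab parsed, including any HOME= line,
    # so such a line is seen too late to rescue step 1.
    echo "parsed $(grep -c . "$crontab") crontab line(s)"
}
```

Calling this with an unreachable home directory fails at step 1 whatever the crontab contains, which mirrors the behaviour described above.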