DICE desktop reboots

To maintain the integrity of our systems we need to reboot all DICE desktops over the next few days. We understand that reboots can be very inconvenient, so we only ever schedule those we consider essential.

A delayed reboot has been scheduled for all DICE desktops. Student lab machines will be rebooted overnight. For office machines the delay will be 5 days. Although the reboots are delayed, it would be greatly appreciated if people could manually reboot their machines at their earliest convenience; the delayed reboot would then be cancelled.

Please submit any queries via the Support Form.

Posted in Uncategorized | Leave a comment

OpenVPN changes

As described in my November posting, we’re making some changes to our OpenVPN configuration, partly to support new devices, partly to add capacity, and partly because one of the features we currently use is being phased out.

The server-end changes are now all in place, as are the new configuration files needed on the client end to use them.  Please fetch these by following the links from the computing.help site.  The new files should replace any existing configuration files you have on your machine.  You may then have to stop and restart OpenVPN to have these new versions picked up.

There are platform-specific versions for Windows, Mac, iOS and Android, and generic versions which should work for Linux and *BSD.  Please fetch and install platform-specific versions if possible, as these may contain additional settings to give smoother operation in some cases.

The current plan is to phase out the old endpoints and configurations around the end of March, when we’ll recover the IP address space for reuse elsewhere.  Old client configurations won’t work with the new endpoints without some adjustment.  Look out for further announcements and reminders nearer the time.

Please provide feedback through the support form in the usual way.

Networking in the Forum meeting rooms

When the Forum meeting rooms were set up, we configured the wired network in them to use the “office self-managed DHCP” subnet.  This seemed appropriate at the time, as most users were expected to be Forum occupants.  The ground-floor meeting areas, in contrast, were set up to use a separate “conferences” subnet, outside the Informatics perimeter filters for better isolation of any problems.

We now propose to reconfigure the meeting rooms to match the ground floor, as the rooms are now used more by non-Informatics people.  This will also give a more consistent experience, and may make the auditors slightly less unhappy.  The Appleton Tower meeting rooms were changed like this some time ago.

Wireless users will be completely unaffected by this.  The only visible effect will be that wired users in the meeting rooms will have to use OpenVPN or one of the login servers to access non-public Informatics resources (so matching what wireless users already do).

We plan to make the change first thing on Thursday of next week (22nd), before any of the scheduled meetings start.

Disruption to some services 6th January

On the evening of 6th January, one of the disks in the IBM storage array used to provide storage for KVM guests at the Forum decided to go faulty. As with other storage arrays, this should not have been a major issue: one of the hot spares should have been brought on-line and the array rebuilt using that drive. Unfortunately, the drive failure locked up the whole array, so that it stopped responding on both the SAN (block requests) and the LAN (management interface). Fortunately we have the ability to power-cycle most server equipment from home, so the array was duly power-cycled and brought back into service, with one of the hot spares kicking into life and the array being rebuilt. A small number of KVM guest servers didn’t recover automatically from this, so they were rebooted first thing on 7th January.

IBM asked for some logs to be sent to them for analysis, but apparently these didn’t reveal the reason for the array lockup.

DICE Software Collections

As many of you are aware, the standard versions of various developer tools provided in Scientific Linux 6 (e.g. gcc) have now become quite old. To gain access to newer versions you can now have various software collections added to your system. These are extra packages provided by Red Hat; details are available on the computing help site.

As part of the work necessary to provide access to more of these software collections we have upgraded the devtoolset collection from version 2 to 3. If you are currently using this collection to get a newer version of gcc you must change your scripts after your system has applied updates overnight (Wednesday 7th to Thursday 8th January). The specific devtoolset name has changed so it will now be activated like:

        scl enable devtoolset-3 bash

(note the change from devtoolset-2 to devtoolset-3). Apologies for forcing this incompatible change at short notice; we hope that the changes we’ve made will allow us to avoid this pain in the future. The benefit of this change is that gcc will be upgraded to version 4.9.1.
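
For scripts that launch tools from the collection, one way to soften any future rename is to keep the collection name in a single place. The following Python sketch is illustrative only: the helper name `scl_command` is hypothetical, and it assumes the `scl enable <collection> '<command>'` form of the command shown above.

```python
import shlex

# The collection name lives in one place, so a future rename is a one-line change.
COLLECTION = "devtoolset-3"


def scl_command(argv, collection=COLLECTION):
    """Build the argument list that runs argv inside a software collection."""
    # scl takes the command to run as a single string, so each argument is
    # shell-quoted before the pieces are joined together.
    return ["scl", "enable", collection, shlex.join(argv)]
```

On a machine with the collection installed, passing `scl_command(["gcc", "--version"])` to `subprocess.run` should then report the newer compiler rather than the system default.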

As usual, if you have any queries about this please contact the Computing Team via the support form.

Posted in Uncategorized | 2 Comments

New staff NX server (revisited)

We have now resolved the problems with the new hardware for the staff NX server, so we can reschedule the planned upgrade.

On Tuesday 6th January we plan to replace the staff NX server named central, which hosts staff.nx.inf.ed.ac.uk, with a machine named northern.

All that will happen is that at about 09:00 on Tuesday we will change the DNS aliases to point to the new machine. This change can take some time to propagate so we will not switch off access to central immediately. It will be left running as normal until 12:00 Friday 9th January. This should allow sufficient time for users logged in to finish their existing sessions and move to the new server.

The IP address for the service will change from 129.215.33.56 to 129.215.33.85, so your NX client may warn you about this change and request verification. For reference, the new RSA host key fingerprint is: c3:46:f4:e5:13:d2:cb:6c:df:a1:d9:24:79:68:15:d6
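
If you would rather check the key than trust the prompt, the fingerprint can be recomputed from the server’s public key. The sketch below assumes the value quoted above is in the colon-separated MD5 format that OpenSSH traditionally displayed, i.e. the MD5 digest of the decoded key blob. The function names are illustrative, and the input is the long base64 field of the host’s public key line.

```python
import base64
import hashlib

# Fingerprint quoted in this announcement for the new server.
ANNOUNCED = "c3:46:f4:e5:13:d2:cb:6c:df:a1:d9:24:79:68:15:d6"


def md5_fingerprint(key_blob_b64):
    """Return the colon-separated MD5 fingerprint of a base64 key blob."""
    blob = base64.b64decode(key_blob_b64)
    digest = hashlib.md5(blob).hexdigest()
    # Split the 32 hex digits into 16 two-character groups.
    return ":".join(digest[i:i + 2] for i in range(0, len(digest), 2))


def matches_announced(key_blob_b64):
    """True if the key blob hashes to the fingerprint announced above."""
    return md5_fingerprint(key_blob_b64) == ANNOUNCED
```

A mismatch does not necessarily mean anything sinister (your client may be showing a different key type or hash format), but it is a good reason to ask before accepting the key.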

More information regarding the NX service can be found on our help pages. If you encounter any problems accessing the NX service please contact us via the Support Form.

cron problems

We have recently discovered a problem with the cron daemon (cronie) supplied with SL6 and SL7, which has prevented users with home directories stored in AFS from using the service on DICE machines. We have now patched the code and are satisfied that normal service has been resumed. The bug was introduced as part of the SL6.5 upgrade which occurred during September and October. It wasn’t spotted quickly because normally when a cron job fails the user gets an error report via email, but due to the nature of the bug all user cron jobs were failing silently.

Apologies for any inconvenience caused. We have now put in place better monitoring of this service, so we should catch any future problems a lot more quickly.

For those interested, this problem was introduced when a security hole was fixed. The change in question is recorded in the Red Hat Bugzilla as #697485. The change was to drop privileges (i.e. go from root to the user who owns the crontab) before reading the crontab, which is clearly a sensible thing to do. A piece of code used elsewhere in cronie was reused to drop privileges. Rather annoyingly, it has an unnecessary secondary function: it insists on being able to change into the user’s home directory. With AFS the home directory is usually inaccessible (even to the user who owns the directory), as there are no Kerberos tickets or AFS tokens available at this stage in the session. There are later checks on the ability to access the home directory, which can be worked around by setting the HOME environment variable to a directory in the local filesystem, but that doesn’t work in this case since the failure occurs before the crontab has been parsed.
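
The ordering problem can be shown with a toy model. This is a sketch in Python, not cronie’s actual code: every name in it is hypothetical, and the set of “accessible” directories simply stands in for what the process can reach without Kerberos tickets or AFS tokens.

```python
# Toy model only: these names are hypothetical and are not cronie's functions.

class HomeUnavailable(Exception):
    """Raised when the account's home directory cannot be entered."""


def drop_privileges(passwd_home, accessible_dirs):
    # Stand-in for the reused helper: after switching to the user it insists
    # on changing into the home directory taken from the password database.
    if passwd_home not in accessible_dirs:
        raise HomeUnavailable(passwd_home)


def parse_crontab(text):
    # Split the crontab into environment settings (NAME=value) and job lines.
    env, jobs = {}, []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        name, sep, value = line.partition("=")
        if sep and name.strip().isidentifier():
            env[name.strip()] = value.strip()
        else:
            jobs.append(line)
    return env, jobs


def load_crontab(passwd_home, text, accessible_dirs):
    # Privileges are dropped BEFORE the file is read, so a HOME= line inside
    # the crontab cannot influence the directory check made here.
    drop_privileges(passwd_home, accessible_dirs)
    return parse_crontab(text)
```

In this model, calling `load_crontab` with an inaccessible home directory raises `HomeUnavailable` even when the crontab text contains `HOME=/tmp`, because the override is only seen by `parse_crontab`, which is never reached.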

Cloud survey results

Earlier this year we conducted a survey of VM/cloud usage. The project has now concluded and a formal report on the project can be found on the Rat Unit wiki. This includes a link to the survey report itself. Copies of all the documentation relating to the project can be found in AFS in the /afs/inf.ed.ac.uk/group/rat-unit/projects/vm_cloud_survey directory.

Posted in News, Project Reports | Leave a comment

OpenVPN changes

It’s now over ten years since we first set up our OpenVPN service, and things have moved on quite a bit since then.  So far we have managed to maintain compatibility for existing users of the service, but we would now like to make a couple of enhancements which unfortunately do require incompatible changes:

  1. We would like to offer a service to users of Android and iOS mobile devices.  However, the way we set up the service access keys (back at the beginning, when that was the only way to do it) is not compatible with the way these devices now require things to be done.
  2. Due to the ever-growing popularity of the service we need to expand the IP address space used so as to avoid unexpected glitches for users.

As this is necessarily an incompatible change, we’ll arrange things as follows:

  • We’ll set up new endpoints to provide the new service for testing. (This has actually already been done.)
  • We’ll create suitable new configuration files for beta-testers.
  • Once we’re happy that things are running as expected, we’ll tidy up, document and advertise the new configurations.
  • Some time later (probably around Easter next year) we’ll close down the old-style service.

We’ll also take the chance to remove some now-deprecated options from the configurations, and we’ll add some platform-specific enhancements where these appear to be generally useful.

We do have to turn off the old service in due course, rather than just leaving it running, as this will allow us to recycle the IP address range it uses.  Globally-routed IPv4 addresses are now a scarce resource, and we simply can’t justify keeping these for what will be an ever-decreasing number of users.

Look out for announcements regarding the introduction of the new service, specific mobile devices, and in particular the schedule for the retiral of the old service.

Meantime, if anyone would like to beta-test the new service, please get in touch through the support form in the usual way.

Seven.

You may remember that DICE Linux is not getting the major upgrade this summer (DICE teaching platform upgrade – postponed) that we optimistically forecast in February (Upgrade of DICE desktops to Scientific Linux 7). This post explains what’s been going on.

DICE Linux is based on Scientific Linux, which is based on Red Hat Enterprise Linux. To make a Linux distro into DICE, we port our configuration technology LCFG to the new system so that we can configure it appropriately. We use LCFG to, for instance, add software; make the network, printers and mail behave appropriately; control who can do what and where; and defend our systems and data against (constant) attack.

So why isn’t the latest greatest DICE ready yet? There have been two main problems.

The first was the later-than-expected release of RHEL 7. We had hoped for it to appear by February at the latest. Judging by previous releases this would have given the Scientific Linux team enough time to produce the corresponding SL release by April, which would have given us just about enough time to get LCFG and DICE ported and tested in time for the next session. Unfortunately RHEL 7 wasn’t released until June, and SL 7 was released this month (October).

The second major problem has been the sheer amount of new technology in RHEL 7. In a word, systemd. This ambitious replacement for init has introduced major changes to Linux. It abandons the old approach of starting services one at a time in a predetermined order, in favour of a dependency-based system. In principle this is a great idea, and it’s the approach taken by launchd, which does the same job rather successfully on Apple Macs. Some great advantages come with this approach – better control of processes for instance, and faster booting – but the scale of the changes has meant a great deal of work for us.
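
To make the contrast with a fixed start order concrete, here is a toy model of dependency-based start-up. It is a sketch only, not how systemd is implemented: the service names and their dependencies are invented for illustration, and it relies on Python’s standard `graphlib` module (Python 3.9 or later).

```python
from graphlib import TopologicalSorter

# Hypothetical services; each maps to the set of services it must wait for.
REQUIRES = {
    "network": set(),
    "syslog": set(),
    "afs-client": {"network"},
    "sshd": {"network", "syslog"},
    "display-manager": {"afs-client", "syslog"},
}


def start_waves(requires):
    """Group services into waves; a wave's members can start in parallel."""
    sorter = TopologicalSorter(requires)
    sorter.prepare()  # raises graphlib.CycleError on circular dependencies
    waves = []
    while sorter.is_active():
        # Everything whose dependencies have all been started is ready now.
        ready = sorted(sorter.get_ready())
        waves.append(ready)
        sorter.done(*ready)
    return waves
```

With the table above, `start_waves(REQUIRES)` yields three waves rather than five sequential steps, which is where the faster booting comes from: anything whose dependencies are already satisfied need not wait its turn.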

To cope with the changes some of our core software has had to be redesigned or replaced (rather than just recompiled and tested, as we would hope on a new system), and the required effort has been substantial. Read the SL7 LCFG port diary to get some idea of what we’ve been up against (and to).

Another way to grasp the enormity of the change from init to systemd is to look at the opposition it’s stirred up. In Linux as in life, when wide-ranging revolutionary change is imposed, there will be rebellion. In the case of systemd the perceived preference of the design team for unrelenting major change over (say) consolidation and bug fixing, and the project’s absorption of more and more formerly independent Linux services, hasn’t helped. Searching the web for systemd controversy throws up some interesting responses, among them “Systemd: Harbinger of the Linux apocalypse”, “Boycott systemd”, “Debian fork” and “uselessd”.

We think we can now cope with systemd, at least to the extent of being able to configure it to produce a basically working DICE system. Along the way we have also been tackling a number of other challenges, such as the not much less controversial GNOME 3, and the replacement of the grub bootloader with a very different grub 2. Nevertheless we hope to have a DICE SL7 desktop option available for “early adopter” staff and researchers within a month or two. Thanks to careful redesign of our software infrastructure we also hope to be able in time to offer DICE variants based on related flavours of Linux such as CentOS 7, RHEL 7 itself and Oracle Linux, if the demand is there. Are we still too optimistic? Time will tell.
