Category Archives: DICE

alpine descent

Fans (and regular users) of the text-based email client “alpine” might be interested to know that as of 3rd December we will be reverting to the previous upstream release, 2.11, on DICE SL7. The noticeable effect of this change should be minimal, unless you are among the users affected by bugs in the 2.20 release, in which case it should be wholly beneficial. DICE SL6 won’t be affected, as it has remained on the comparatively stable 2.10 release for some time now.

For those who have not been following the development of pine/alpine in recent years, this history gives a good account.

The version we’ve been using for a little while (2.20-1) was taken from a different fork from the one used for previous versions and contained several flaws – some of which we’ve noticed in practice – and unfortunately the updated package (2.20-2) referenced above adds a further bug affecting normal workflow (specifically: entering the “Compose” menu causes a repeatable fault if keylabels are disabled; in due course this should be reported upstream).

Given the comparatively unstable state of upstream alpine packaging, we’ve decided to revert to 2.11 for now; in future we will almost certainly build and package our own release of alpine, as we did for most of the course of DICE SL5 and SL6, but based on Eduardo Chappa’s releases. This adds some maintenance load compared with simply tracking an upstream distribution, but should hopefully result in a better, more stable alpine on SL7.

a big hand for rfe

Here are some tools that I’ve put together that make a huge difference to a clunky rfe workflow. Bored already? Don’t worry, I’ve already written the script: tl;dr

Imagine power cycling a DICE server with redundant PSUs, using our lovely power bar control files. You don’t know where it’s installed, so you have to search for it. Continue reading

alpine, nagios and display filters

I’ve been aware of alpine’s “display filter” feature for some time, used as it is for on-the-fly GPG interpretation amongst other things. But I’d never really examined the feature before now. The manual says:

The [filter] command is executed and the message is piped into its standard input. The standard output of the command is read back by Alpine.

This says it all: display filters turn out to be an extremely powerful generic mechanism for reformatting and enhancing text; they work particularly well when applied to machine-generated messages. Maybe their power is best explained by the example which caused me to investigate them in the first place:

An example (the nagios bit):

A longstanding irritant to me has been my difficulty in shutting nagios up. For a long time I’ve been relying on a filter to parse nagios’ incoming emails and generate a URL. The display filter closes the loop, automatically injecting that magic URL at the end of the message.

Here’s a simplified version of the filter, reminiscent of the one in the previous post:


#!/usr/bin/gawk -f
# Crude detection of problem type for acknowledgement link
# Don't forget to validate these inputs...
/Notification Type: / { TYPE=$3; }
/Service:/ { SERVICE=substr($0,length($1)+1,length($0)); }
/Host:/ { HOST=$2; }
# Important: this is a filter, so don't forget to print input lines back out!
{ print; }
# Now add the acknowledgement link below:
END {
    if (HOST && TYPE == "PROBLEM") {
        # this is the script which generates the URL.
        # ideally this should be replaced with some awk to do the same thing
        cmd="~/bin/nagack "HOST" "SERVICE
        cmd | getline url
        close(cmd)
        # now add the link to the email.
        print "[Acknowledgement link: "url" ]"
    }
}

Now, to alpine’s Display Filters setting, add:


Display Filters    = _LEADING("***** Nagios")_ /path/to/nagios-filter-script

That’s it! My emails from nagios now look like:


***** Nagios *****
Notification Type: PROBLEM
Service: ssh
Host: myhost
Address: 192.168.12.34
State: CRITICAL
...
[Acknowledgement link: https://nagiosserver/nagios/cgi-bin/cmd.cgi?cmd_typ=3... ]

Important caveats:

  • If you’re not careful, by adding these filters you will have introduced a trivial local shell injection attack to your mail client. Validate your inputs — just like I didn’t above! (A minimal defensive sketch follows this list.)
  • The developers have this to note about running filters on every message:

    Testing for the trigger and invoking the filter doesn’t come for free. There is overhead associated with searching for the trigger string, testing for the filter’s existence and actually piping the text through the filter. The impact can be reduced if the Trigger Modifying Tokens […] are employed.

    I’ve certainly noticed a small delay (too small to measure precisely, but noticeable) in opening messages with triggers. It would be large enough to be annoying if I’d planned to filter every message, even with a trivial bash filter which itself completes quickly.

  • One additional caveat on DICE: if your alpine session outlives your AFS credentials, and you’ve stored your display filters in your home directory, you will find that the display filters simply disappear. As good a reminder as any to renew, and thankfully a “renc” is all that’s required to restore your filters to former glory.
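
As promised above, here is a minimal defensive sketch. It assumes the URL generation lives in a small helper script (like the hypothetical ~/bin/nagack used earlier) and simply refuses anything that doesn’t look like a plain host or service name before it goes anywhere near a command line or URL:

#!/bin/bash
# Sketch only: reject implausible host/service names before doing anything with them.
# (Deliberately strict; loosen the patterns if your service names need it.)
host=$1
service=$2
if [[ ! ${host} =~ ^[A-Za-z0-9._-]+$ ]]; then
    echo "Suspicious host name: ${host}" >&2
    exit 1
fi
if [[ -n ${service} && ! ${service} =~ ^[A-Za-z0-9._-]+$ ]]; then
    echo "Suspicious service name: ${service}" >&2
    exit 1
fi
# ...now it's rather safer to build and print the acknowledgement URL as before.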

That’s it! Surprisingly trivial, and with a handful of these triggers, the benefits are huge. I’m using five so far, mostly generating clickable links to some of our automated systems, but I’d be pleased to hear what other people are doing with these filters.

Editing component files with vim

Editing LCFG component source files using Vim is of course The Right Thing to do, but due to the way these source files are named (typically filename.ext.cin) vim doesn’t necessarily pick up on the filetype, and goodies such as syntax highlighting are lost.

This is easy to fix using vim’s ftdetect system. Some examples for simple types:

" These files are always POD in disguise
au BufRead,BufNewFile *.pod.cin : set filetype=pod
" Slightly contentious: a new filetype is needed, really, but this is a decent match.
au BufRead,BufNewFile *.def.cin : set filetype=cpp
" For other, unknown types, detect from the as-yet undefined shebang:
au BufRead,BufNewFile *.cin : if getline(1) =~ '^#!@SHELL@' | set filetype=sh | endif
au BufRead,BufNewFile *.cin : if getline(1) =~ '^#!@PERL@' | set filetype=perl | endif

(note that the latter two lines are specified separately, rather than elseif-ed, purely for readability). It’s fairly obvious that this can be extended to any file type, and there’s also scope for a generic rule which maps any file of the form file.typ.cin automatically to the filetype associated with its “.typ” sub-extension.

Anyway, the above has already improved my productivity no end so I’ll leave the latter exercise to the reader. Comments and contributions are welcome, as always — so long as they’re not suggestions to use Emacs(!)

Away with the PXEs

Occasionally, for the purposes of internal testing or continuity, it’s desirable to bring up a server with a duplicate MAC address. It’s a safe enough manoeuvre (so long as these machines operate on different wires) for the brief periods in which I require it, but when this scenario involves the installation of a new server via our installroot PXE service, things are trickier.

Our PXE server is configured automagically by spanning map and is, effectively, keyed on MAC, so it can’t reliably present the correct configuration when the new host differs from the old one in some way.

The workaround is to override the PXE configuration on the *existing* server (on the basis that you weren’t planning on reinstalling it, anyway, were you?):

!pxeclient.platforms mADD(new_plat_name) /* e.g. sl6_64 */

/* And, if you need to add or remove serial console support: */
!pxeclient.serial_port mSET(ttyS0) /* or () */

Post-PXE, the dhclient component is aware of subnet differences and will ensure your machine receives the correct profile for installation (though, to prevent future confusion, remove the override as soon as the installer has done its work!).

get on the rpm bus

This is a quickie script which streamlines my RPM building and submission to a single command. Note that this is entirely dependent on our shiny new Package Forge system, which feeds RPMs to multiple platforms for building and eventual submission into our RPM buckets.

All it does is chain up “rpmbuild -bs [spec]; pkgforge submit [srpm]” but it’s a nice timesaver nonetheless. Side-benefits include the automatic generation of a readable ID and provision of a tracking link for pkgforge so that you can anxiously refresh the page to watch the build progress (or you could just wait for the report email…).

So, here it is; my very simple and stupid RPM automation. Suggested name: ‘rpmbus’.

#!/bin/bash
if [[ -z $2 ]]; then
    echo "RPMbus: build -> submit assist"
    echo "Usage: `basename $0`   [pkgforge args]"
    exit 1
fi
bucket=$1; shift
spec=$1; shift
args=$*

output=`rpmbuild -bs ${spec} | tail -n 1`
pkg=`echo ${output} | sed -e 's_^Wrote: __'`

if [[ ! -e ${pkg} ]]; then
    echo "Package wasn't built: ${output}"
    exit 1
fi

id=`basename ${spec} | sed -e 's_\.spec__' -e 's_\.__g'`-`date +"%s"`
echo -e "Found source package:\n  ${pkg}"
echo "  Extra args: ${args:-none}"

read -p "Submit to '${bucket}'?" foo
if [[ ${foo} != 'y' ]]; then
    echo "Cancelled"
    exit 1
fi

echo "Submitting to ${bucket}..."
pkgforge submit --id ${id} -B ${bucket} ${args} ${pkg} && \
echo "  https://pkgforge.inf.ed.ac.uk/job/view?id=${id}"


Caveats: well, they’re numerous and they’re pretty apparent. But it took five minutes to write and it WFM :)
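
Usage, for what it’s worth, looks something like this (the spec name is made up; anything after the first two arguments is passed straight through to pkgforge):

$ rpmbus inf lcfg-foo.spec -p sl5,sl6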

Nag nag nag nag nagios

Nagios is an extremely useful tool, until it isn’t.  Which is to say, it’s nothing but a hindrance to have nagios continue to bombard you with IMs and emails when you’re already working on the problem.

Surely you can just acknowledge the fault and shut it up…?

Well, sometimes, but it is hardly convenient to break out a Firefox session when you’re attached to a serial console with your lovely secure-shell-enabled phone.  And even if you are on a DICE machine it’s a bit of a pain to have to navigate the slightly clunky Nagios UI to find the host and service you wish to silence.

I started with a dumb bash script. Continue reading

homing pidgin

Just another bit of shell glue which took about twenty minutes but yielded lovely results as it occurred to me that the DICE-wide inventory tools can now locate (at least in theory) any machine.

Being able to find out which office I’m in is a more useful feature than you might think. Chief among its uses is the ability to advertise my whereabouts to colleagues, for example on entering a server room, in case I can be of button-pushing assistance to others. Whenever I move around, I make an effort to update my Jabber status to point this out.

In fact, the glue was very straightforward and I learned about a particularly useful new tool: the Python DBUS libraries.  DBUS is the message bus adopted by most modern “freedesktop”-compatible environments, and the Python library provides a quick and easy way onto the bus.
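
To give a flavour (my actual glue used the Python bindings, but the same calls can be made from the shell with dbus-send, assuming the standard libpurple session-bus service names; the status message below is purely illustrative):

#!/bin/bash
# Sketch: update the current Pidgin saved status via libpurple's session-bus API.
DEST=im.pidgin.purple.PurpleService
OBJ=/im/pidgin/purple/PurpleObject
IFACE=im.pidgin.purple.PurpleInterface

# Get a handle on the currently-active saved status...
status=$(dbus-send --session --print-reply --dest=${DEST} ${OBJ} \
    ${IFACE}.PurpleSavedstatusGetCurrent | awk '/int32/ {print $2}')

# ...then rewrite its message and re-activate it.
dbus-send --session --print-reply --dest=${DEST} ${OBJ} \
    ${IFACE}.PurpleSavedstatusSetMessage int32:${status} string:"In the server room" > /dev/null
dbus-send --session --print-reply --dest=${DEST} ${OBJ} \
    ${IFACE}.PurpleSavedstatusActivate int32:${status} > /dev/null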

First, I hacked together a tiny script to establish where I am. Continue reading

tar-a-dice?

I recently encountered a .tar file which DICE tar wouldn’t extract, giving multiple warnings:

tar:  Ignoring unknown extended header keyword `SCHILY.[...]'

before bailing out due to previous “errors”. The file in question appeared to have been created on a Mac, whose archive utility adds extended metadata to its files.


This was confirmed and fixed in May 2007, but some more venerable Linux distributions continue to ship older versions.  Tar’s behaviour is a bug, since these warnings should be non-fatal.  However, the warnings can be avoided altogether by adding an argument:

 --pax-option="delete=KEYWORD.*"

for each problematic keyword.
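
For example, to extract the offending Mac-created archive while discarding all of its SCHILY.* keywords (archive name illustrative):

$ tar --pax-option="delete=SCHILY.*" -xf mac-archive.tar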

[Source: Tar mailing list]

Updating Matlab Licences

This is simpler than it used to be, but it’s still a bit of a convoluted process, so I thought it worth detailing.

When our Matlab licences are renewed or added to by request, Informatics will receive a new licence file (apparently known as a “license file” to those living west of us). This will look something like:

# LicenseNo: 123456   HostID: 000000001122
INCREMENT MATLAB MLM 20 01-jan-[...] \
        VENDOR_STRING=[....] \
        DUP_GROUP=[...]
INCREMENT GADS_Toolbox MLM 20 01-jan- [...] \
        VENDOR_STRING=[...] \
        DUP_GROUP=[...]
[...]

Note the absence of SERVER and VENDOR lines: other FlexLM vendors provide these, but I’ll talk about this later. If either line is present in a Matlab licence file, it should be removed.
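
(For reference, the lines to strip look something like the following; the hostname, host ID and port here are purely illustrative.)

SERVER ourserver 000000001122 27000
VENDOR MLM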

This file needs to be packed into our matlab-license rpm. Retrieve the latest SRPM (be careful, sometimes the dates are more instructive than the versions!) and install it to your RPM build directory. You’ll find a tarball along the lines of matlab-license.tgz — this has a directory structure resembling:

matlab-license/
matlab-license/roles/
matlab-license/roles/matlabclassroom/
matlab-license/roles/matlabclassroom/env.455.matlab
matlab-license/etc/
matlab-license/etc/teaching/
matlab-license/etc/teaching/license.dat
matlab-license/etc/research/
matlab-license/etc/research/license.dat
matlab-license/etc/research/other.dat
matlab-license/CoreFiles/
matlab-license/CoreFiles/env.450.matlab

Note that the licences are built from multiple files in each of the teaching/ and research/ directories. Let’s say we’re replacing the research licence, since it is the more complex of the two.

$ cd /tmp/
$ tar -zxf ~/RPM/SOURCES/matlab-license.tgz
$ less matlab-license/etc/research/license.dat
# MATLAB license passcode file.
#
# This portion contains the main set of research licences
# LicenseNo: 123456   HostID: 000000001122
INCREMENT MATLAB MLM 20 01-jan-[...] \
        VENDOR_STRING=[....] \
        DUP_GROUP=[...]
INCREMENT GADS_Toolbox MLM 20 01-jan- [...] \
        VENDOR_STRING=[...] \
        DUP_GROUP=[...]
[...]
$ less matlab-license/etc/research/other.dat
#
# This portion contains Some Other licences
# LicenseNo: 123456   HostID: 000000001122
INCREMENT MATLAB MLM 21 01-jan-0000 11 3D20000[...] \
[...]

It’s important to note that these files might have come from different accounts and it is likely that we are only replacing one of the files — possibly none if this is a brand-new addition. If we’re replacing just the main Research licence, we need to leave the other files alone; if we’re adding a new licence, simply add a file to the research directory with any name you wish (so long as it ends .dat).

Paste your replacement licence INCREMENT lines into this file, then tar the directory back up and place the tarball in SOURCES.
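
Assuming you unpacked under /tmp as above, that last step is something like:

$ cd /tmp
$ tar -zcf ~/RPM/SOURCES/matlab-license.tgz matlab-license/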

Once your new tarball is back in SOURCES, you need to amend the specfile. Update any matlab versions amongst the %defines at the top of the file, and increment or reset the Release as appropriate. Build and submit as normal:

$ rpmbuild -bs SPECS/matlab-license.spec
$ pkgforge submit -B inf -p sl5,sl6 SRPMS/matlab-license-x.y-z.src.rpm

(Note that this will submit new matlab-license-teaching, matlab-license-research and other RPMs. If there are no major version changes and we’re simply modifying the research licence, it’s OK just to deploy the matlab-license-research RPM, but it’s good practice to submit them all, and certainly the SRPM.)

Continuing our assumption that we’ve just made a trivial change to the research licence (a new INCREMENT line for example), edit the dice/options/matlab-research-server.h and live/matlab-research-server.h headers to include the new RPM:

$ svn di core/include/dice/options/matlab-research-server.h
+!profile.packages               mEXTRA(+matlab-license-research-7.7_7.8-7.inf/noarch)

$ svn di live/include/live/matlab-research-server.h
+/* REMINDER: Remove this next Friday! */
+!profile.packages               mEXTRA(+matlab-license-research-7.7_7.8-7.inf/noarch)
[...]

Finally, if we need to lock down one of our new licences to specific users, we can do this using the live header and the FlexLM component’s options-file support:

$ svn di live/include/live/matlab-research-server.h
[...]
+/* RTxxxx: Reserve one Statistics Toolbox licence */
+!flexlm.options_research        mADD(gr1 re1)
+flexlm.option_research_gr1      USER_GROUP gr1 userone usertwo userthree
+flexlm.option_research_re1      RESERVE 1 "Statistics_Toolbox" USER_GROUP gr1

Clearly, this reserves one Statistics Toolbox licence for the given group of users.

Having made all these changes, commit to the repository. You can push the required changes to the FlexLM server without restarting: the FlexLM component will perform a reload on picking up the options-file changes, though these may not make sense until updaterpms has run, so the recommended intervention is as follows:

[matlab-research]$ om updaterpms run
[INFO] updaterpms: 1 installs, 0 removals
[INFO] updaterpms: Flagging matlab-license-research-7.7_7.8-7.inf/noarch for upgrading to
[WAIT] updaterpms: Checking dependencies
[INFO] updaterpms: (1/1) upgrading to matlab-license-research-7.7_7.8-7.inf/noarch

[matlab-research]$ om flexlm configure
[INFO] flexlm: Reconfiguring /usr/lib/lcfg/conf/flexlm/research
[INFO] flexlm: Reloading flexlm configuration for research (could take a minute)...
[OK] flexlm: configure

You may check the result of your actions by examining the Flexlm web-status page, which for the Research matlab licence is http://matlab-research:1881/usage-research.html. This should detail your new licences in the upper section, and any reservations alongside the list of ‘in-use’ licences in the lower section.

Be aware: if you make a mistake with INCLUDE, EXCLUDE and RESERVE lines and have to change them, reservations can end up becoming locked. A restart of the FlexLM server will clear this but it’s worth knowing in case you’re wondering where a licence went!
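
If you do get bitten, restarting via the component should be enough to clear the stuck reservations (assuming lcfg-flexlm provides the usual start/stop/restart methods):

[matlab-research]$ om flexlm restart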

For more information on configuring FlexLM, see the relevant RAT Unit documentation pages for FlexLM and Matlab, and also the lcfg-flexlm manual page (the definitive reference for configuring an lcfg-flexlm server).

UPDATED 15/03/2012 to reflect the updated RPM procedure where licences are automatically constructed from their constituent parts in etc/{research,teaching}.

monitoring postgresql

This should serve as yet another account of “naïve” addition of monitoring to an LCFG component.  It proved to be surprisingly straightforward: PostgreSQL on my test server is now being monitored; built-in support is now available on any host:

#include <dice/options/postgresql-monitoring.h>

though take care to follow the instructions within.

Continue reading