lcfg-x509 and letsencrypt

Automated acquisition of X509 certificates is an important part of our infrastructure. We obtain certificates of two different types, both provided through services offered by the University:

  1. Certificates provided via JISC, which are signed by a CA trust chain whose root certificate is installed in most OSes and browsers by default. These are obtained manually, by submitting a CSR through a web form.
  2. Certificates signed by a trust chain whose root is the University’s CA certificate. This is an automated process, using the lcfg-x509 component and our locally written sixkts software. We can automate the signing process as we have an intermediate CA certificate, signed by the University’s root.

It is becoming increasingly impractical to expect users to install the University’s root CA on their devices, so most of our public-facing web sites now use certificates of type 1 above. We continue to use certificates of type 2 for other purposes, where we can manage the trust chain, e.g. for backend communications for Cosign and for TLS LDAP.

The big problem that this gives us is automation, or rather the lack of it. Pasting CSRs into web forms, waiting for an email, extracting certificates from that email, copying them to the appropriate machine, changing ownership and permissions, restarting appropriate services… it’s tedious and error-prone.

All of this brings us to Let’s Encrypt. It’s in public beta at the moment, but offers us an automated way of obtaining trusted certificates.

We already have an LCFG component – lcfg-x509 – which obtains certificates (using our sixkts technology) and then does all the administrative work necessary to put them into service. As a proof of concept I added support for letsencrypt to a development version of this component…

I decided against the official client, which seemed a little complex and heavyweight to me, with a hefty array of Python dependencies. The client I chose instead (packaged as letsencrypt-sh in the example below) seemed more lightweight and better suited to our uses.

The development version of lcfg-x509 supporting letsencrypt and its corresponding schema can be found here:

An RPM for the client can be found here:

(SL7 versions of all the above are also available, but I’ve only tested with SL6)

Here’s some example LCFG which modifies an existing lcfg-x509 certificate, for the machine “oakwood”, to use letsencrypt:

!profile.packages mEXTRA(+letsencrypt-sh-0.0.3-1.el6/noarch)
!profile.packages mEXTRA(+lcfg-x509-0.0.31_dev-15/noarch)

!profile.version_x509  mSET(4)

!x509.le_client_path mSET(/usr/sbin/
!x509.le_wellknown mSET(/var/www/html/.well-known/acme-challenge/)
!x509.catype_oakwood mSET(letsencrypt)

This requires the following chunk of httpd config in order to respond to the http-01 challenge:

<IfModule mod_headers.c>
  <LocationMatch "/.well-known/acme-challenge/*">
    Header set Content-Type "text/plain"
  </LocationMatch>
</IfModule>

Some caveats:

  • This is an early developmental release, just to see if it works.
  • It lacks useful documentation. The template for the client’s config file – /usr/lib/lcfg/conf/x509/ – is informative, as is the client’s GitHub page.
  • The lcfg-x509 component is quite old and is definitely showing its age. It should probably be rewritten.
  • … We may decide to decouple letsencrypt from lcfg-x509 – the client supports “hooks” which can be used to provide your own code to be called at various stages of the process (specifically “clean_challenge”, “deploy_challenge”, or “deploy_cert”). This is potentially very powerful in automating challenge responses.
  • lcfg-x509 currently makes no effort to tidy up old certificates obtained via letsencrypt
  • … or anything else it uses for letsencrypt
  • Responding to the ACME challenges could be the trickiest part in all of this – the above config relies on you having an apache server already listening on port 80. The dns-01 challenge looks promising, but requires an API to add and remove DNS records.
  • The rate-limiting for letsencrypt is quite strict – currently limited to 5 certificates per domain per week. This doesn’t sound too bad until you realise that domain == public suffix + domain, which means everything under “” counts as the same domain. It’s probably best to use the staging URL for testing.
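
To make the “hooks” caveat above more concrete, here is a minimal sketch of what a hook script might look like. The calling convention is an assumption (stage name first, then the domain, then stage-specific arguments) and should be checked against the client’s own documentation; the function bodies, the WELLKNOWN variable and the hook_main name are purely illustrative.

```shell
#!/bin/sh
# Hypothetical hook script. Assumed calling convention (check the client's
# documentation): $1 is the stage name, followed by stage-specific arguments.

# Directory served at /.well-known/acme-challenge/ (cf. x509.le_wellknown).
WELLKNOWN="${WELLKNOWN:-/var/www/html/.well-known/acme-challenge}"

deploy_challenge() {
    # Assumed arguments: domain, token filename, token value.
    token_file="$2"
    token_value="$3"
    printf '%s' "$token_value" > "$WELLKNOWN/$token_file"
}

clean_challenge() {
    # Remove the token file once the challenge has been dealt with.
    token_file="$2"
    rm -f "$WELLKNOWN/$token_file"
}

deploy_cert() {
    # Assumed arguments: domain, then paths to the key and certificate files.
    # Copying files into place and restarting services would go here.
    echo "new certificate for $1: key=$2 cert=$3"
}

hook_main() {
    stage="${1:-}"
    if [ "$#" -gt 0 ]; then
        shift
    fi
    case "$stage" in
        deploy_challenge|clean_challenge|deploy_cert) "$stage" "$@" ;;
        *) : ;;  # ignore stages this sketch does not handle
    esac
}

hook_main "$@"
```

A dns-01 variant would replace the file operations with calls to whatever DNS API is available, which is exactly the part we don’t yet have.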

The sssd component and sub-classing

When implementing sssd on SL7 we had to decide how to manage the configuration file (/etc/sssd/sssd.conf) used by sssd. Starting and stopping of the daemon itself is done by systemd. We considered a custom component, but sssd.conf uses an INI-file syntax and LCFG already has lcfg-inifile for just such configuration. Any custom component would be largely duplicating the work done by lcfg-inifile.

It is also quite difficult to write a sensible specific component for something like sssd, where the options are many and varied (see sssd.conf(5) and the additional man pages referenced at the end). The danger is that you implement specific resources for the features you personally require and add more and more to this as time passes (see lcfg-openldap for how this can end up looking). It was thought, in this case, that a more generic approach was better suited.

Initially we used lcfg-inifile (with a few local modifications, since implemented upstream). What this doesn’t give us, however, is namespace separation (i.e. it would be easy to break sssd when configuring inifile resources for another purpose). We decided to look into sub-classing the lcfg-inifile component in order to use its functionality, while adding anything specific to sssd on top.

This is surprisingly easy to do, provided the component is written in Perl and the component code is delivered in a Perl module.

To inherit the resources from the parent component, we added the following to sssd.def:

#include "mutate.h"
#include "inifile-2.def"

!schema mSET(@LCFG_SCHEMA@)

This means that the sssd component supports all the resources provided by lcfg-inifile.

We also add the following resources:

!files mSET(sssd)

file_sssd /etc/sssd/sssd.conf
owner_sssd root
group_sssd root
mode_sssd 0600
purge_sssd no

useservice_sssd yes
onchange_sssd sssd restart

This essentially hard-codes the configuration file, ownership, permissions and behaviour when the file changes (all of this can of course be overridden).

It would have been nice to have been able to mandate that the sssd.conf file has an [sssd] section, with something like this in sssd.def:

!sssd_sections mSET(sssd)

This doesn’t work as planned, however, as this is a default value and in LCFG these are only applied if the value is not set in any other place. This would mean that any subsequent mutation (e.g. an mADD of other sections) would lead to the default value never being used. Header files will be used for any such initial configuration.

Other than the schema, the other aspect of sub-classing is the code itself. This is implemented using standard Perl sub-classing. For sssd we have no additional code, so LCFG::Component::Sssd simply looks like this:

package LCFG::Component::Sssd; # -*- perl -*-
use strict;
use warnings;

use v5.10;


use base qw(LCFG::Component::Inifile);

1;

If we require custom code, then we could add our own methods (calling the parent class’s method(s) as required).

One thing which would be nice to be able to add for a sub-classed component is additional validation for some resources. This is quite difficult in this case, as we can’t mandate what resources are named, particularly when taglists are involved. It would probably require a custom, non sub-classed component to be written, with all that this entails. My personal opinion is that a syntax-checking tool [1] is a better approach to a complex system such as sssd.



openldap client and sssd

For the port of LCFG to SL7, we have been thinking about our LDAP client provision. Historically we have run a full slapd server on all clients, replicating hourly from the master. This was largely for reasons of stability and also the ability to support disconnected operation.

Our OpenLDAP: DICE client configuration project page contains a lot of information on our thoughts and discussions on this matter.

For our initial configurations of SL7, we’ve been investigating using sssd as a connection/caching daemon (the caching functionality in particular has made moving to sssd from nslcd an attractive option).

Our openldap LCFG component currently configures and manages both server and client side operation. The latter is much simpler as it runs no daemon and just populates the files /etc/openldap/ldap.conf, /etc/ldap.conf and optionally /etc/nslcd.conf. Separating the component into individual client and server openldap components is something we are considering.

We have decided to configure sssd separately from the openldap component (in contrast to nslcd, which was configured (and partially managed) from lcfg-openldap). This will help to reduce the complexity of the component and also make it easier to manage sssd properly.

To this end we have modified the openldap.nss_package resource so that it now accepts the value “none”, in addition to the existing “nss_ldap” and “nss-pam-ldapd” values. Setting it to “none” will result in only /etc/openldap/ldap.conf, of the three client configuration files mentioned above, being configured.

For configuration and management of sssd, we need to decide whether to write a new component, or use an existing one. We may well decide on the former in the longer term, but for our initial testing we have made use of the lcfg-inifile component, as sssd’s configuration is in inifile format. We have made a few local modifications to this component – firstly to add support for setting user and group ownership and permissions; secondly, we have added support for using the new Service() function in ngeneric to restart sssd when its configuration changes. The component we use for sssd will be under ongoing consideration.

We are using the version of sssd as provided by SL7, but patched to fix a bug when using DNS service discovery.

All of this is in testing and seems to work. The component versions are lcfg-openldap-3.1.71-1 and lcfg-inifile-1.0.2_dev-4.

It should also be noted that the shadow of systemd looms long over any attempts to configure component boot-time dependencies. We need to be very careful to get this right.

Informatics computing staff wishing to use this on their test SL7 machines can do so with:

#include <inf/options/dicehacks.h>

Changes to kdcregister

In Informatics we have, for years, used the kdcregister program to obtain keytabs for host-based kerberos principals from the KDC. An important part of this usage is to obtain keytabs when a machine is being installed.

This has historically worked through the lcfg-kerberos LCFG component asking the user to enter their admin principal and password as one of the final stages of the installation process. This creates a ‘hostclient’ principal and extracts it to a keytab, which in turn is used to create other principals and extract them to keytabs, all according to the kerberos component resources.

A couple of factors have arisen that meant we required some new functionality from kdcregister. Firstly, colleagues in IS wished to run kdcregister at a much earlier stage of the installation process. As this is one of few parts that requires interactive input, it is preferable to do this early, and leave the machine to complete the installation, rather than require user interaction both to start and complete the installation. Secondly, the introduction of SL7 means a move to systemd, which didn’t seem to play nicely with the kerberos component requiring input at boot time (this could probably be worked around, if needed).

One of the issues with running kdcregister early in the installation process is the requirement for a krb5.conf file with appropriate realm configuration – this provides details on how to communicate with the realm’s kadmin server. Having to provide such a file is obviously not ideal, so two new arguments have been added to kdcregister – these allow the realm (-r) and server (-s) to be supplied. Note that the krb5 libraries still seem to require a krb5.conf file to exist, but this can be the example one shipped with kerberos.

To assist in scripted running of kdcregister, the -a option has been added. This results in kdcregister asking for a principal to authenticate with (this overrides the -p option if it’s also provided). One issue with specifying the principal to authenticate with is that getting this part wrong will result in kdcregister failing – this could present a problem during the installation process, but it would perhaps be more appropriate to wrap kdcregister in a loop, rather than put such functionality in kdcregister itself.

With the new options, an example of running kdcregister during the installation would be something like:

$ kdcregister -f -a -t /root/etc/krb5.keytab -r INF.ED.AC.UK \
  -s hostclient/$(hostname)

The version of kdcregister with these changes is kdcregister-1.17.0-1
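
The idea of wrapping kdcregister in a loop, rather than building retries into kdcregister itself, could be sketched roughly as follows. The retry function and the attempt count are illustrative only and are not part of kdcregister or the installer.

```shell
#!/bin/sh
# retry MAX_ATTEMPTS COMMAND [ARGS...]
# Runs COMMAND until it succeeds or MAX_ATTEMPTS is reached.
retry() {
    max="$1"
    shift
    attempt=1
    while [ "$attempt" -le "$max" ]; do
        if "$@"; then
            return 0
        fi
        echo "attempt $attempt of $max failed" >&2
        attempt=$((attempt + 1))
    done
    return 1
}

# Example (not run here), using the kdcregister options shown above:
#   retry 3 kdcregister -f -a -t /root/etc/krb5.keytab ...
```

Because -a makes kdcregister prompt for the principal each time it runs, a mistyped principal or password simply results in another prompt on the next attempt.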

Informatics computing staff wishing to use this on their test SL7 machines can do so with:

#include <inf/options/dicehacks.h>

Looking at kerberos authentication in iOS7

Our recent survey into areas where we could improve mobile support for our services indicated “authentication” as such an area, specifically “allowing mobile users to use authentication mechanisms such as kerberos to access secure School services”.

I investigated how we could improve authentication for Apple iOS devices, specifically concentrating on iOS 7, as it introduced a new feature called “Enterprise Single-Sign-On (SSO)”. This essentially adds client support for SPNEGO/Kerberos web authentication, so that web sites protected by, e.g. HTTP-Negotiate, will not repeatedly prompt users for authentication.

This works on iOS by first manually creating an XML configuration profile, as described here and here. This profile should then be installed on an iOS mobile device. I created a similar profile for Informatics and installed it on an iPad. This does appear to work as intended – I was prompted only once for a username and password on visiting SPNEGO-protected sites.

In Informatics, we use SPNEGO in our Cosign infrastructure so that users who have already authenticated to kerberos will not have to authenticate again to access cosign-protected web sites. Users who have no kerberos credentials (or who haven’t configured their web-browsers) have to authenticate once by entering their username and password into a web form. This means we already have single (or reduced) sign-on for authenticated web access. The new SSO support in iOS doesn’t really help us in any meaningful way – the only benefit it would offer is that the user, on visiting, will be prompted for a username/password with an iOS7 native dialog box, rather than authenticating via a web-form. This isn’t worth the complexity of installing a configuration profile on a mobile device.

I also looked into whether any of the various iOS SSH clients supported kerberos (gssapi) authentication. None of them appeared to do so and none of the authors that I contacted had plans to add this support. It is debatable whether such support would be seen as being of much benefit to end users – the real increase in usability from gssapi support in ssh is not having to provide a password with each ssh connection. Most iOS ssh clients provide a facility to remember a user’s password anyway, so the end-result is the same, even if the underlying process is different (and inherently less secure).


openldap proxycaching project update

In December last year, I talked a bit about the issues affecting the OpenLDAP proxycaching project. This posting updates the current situation.

Testing and Stability

As a result of problems with slapd crashing, we turned on debugging in the LCFG openldap component in mid-December so core files would be generated for each crash, giving us an opportunity to debug the problem(s). We have now accumulated 18 core files. This compares with approximately 4 crashes in other labs not running proxycaching, so clearly the proxycaching clients are less stable than our standard DICE clients (which use our own slaprepl technology to sync with the master). Unfortunately the core files do not point to a single problem, or bug – they are almost all different, with no identical backtraces appearing more than twice. There is a suspicion that memory corruption may be occurring in some cases. Also, it might be beneficial to build using Heimdal kerberos rather than MIT to see if that has any effect, as some core files appear to point in this direction. However, it is highly debatable whether continuing to debug these core files has any real benefit given (a) the time needed to do so and (b) the OpenLDAP developers now focusing on version 2.4, leading on to…

2.3 vs 2.4

It was thought, at the December development meeting, that sticking with version 2.3.43 of OpenLDAP would be the best approach for the moment. However, the lack of stability, as outlined above, means that we would not be happy implementing this solution. A previous drawback of 2.4, as described in ITS#5756, has now been fixed, making 2.4 a more attractive candidate.

Following some discussion, we now believe that the best approach is to increase testing of proxycaching using OpenLDAP 2.4 (current version is 2.4.13, with 2.4.14 imminent) to see if it suffers the same stability problems as found in 2.3. This will inevitably delay the project, but increasingly seems the right course to take. The urgent need for this project, as shown a couple of years ago when typically 20 machines a day would suffer ldap replication failures, has disappeared (most likely because of increased memory in commodity hardware). It would probably be sensible to either put the project into a stalled state, or set a deadline a few months away, to properly evaluate 2.4.


Serving AFS space using Apache and mod_waklog

As part of the Informatics move to using AFS, Roger and I have been
investigating how to serve AFS files using Apache.

The primary technical consideration is threefold: apache needs to be able to authenticate against our KDC, it needs to be able to obtain AFS tokens for the principal it has authenticated as, and AFS ACLs need to be configured to provide access to the underlying files.

Firstly we created a Kerberos service principal of the form
afsweb/ ( is the host on which we’re doing our initial testing). This principal maps automatically onto the AFS id (once it has been created in the pts database). The intention is that each machine which requires web access to AFS space should have one of these service principals. They can then easily be combined into AFS system group or groups (e.g. system:afswebservers).

Our initial testing focused on using k5start with pagsh when starting apache itself. This approach meant that k5start was entirely responsible for obtaining the kerberos tickets and afs tokens and keeping them up to date. Apache, being launched from the same pagsh as k5start, then had access to the tickets/tokens and could read appropriately configured areas of AFS file space.

The ACL configuration necessary was to add rl permissions for the AFS id on the directory containing the files to be served and l on all its parents (see below for further
discussion on this in the context of user-authenticated access).

This approach worked well, but had some obvious drawbacks:

  • Starting k5start and apache from within a pagsh requires hacky changes to the apache and/or apacheconf LCFG components. The idea of developing a separate LCFG component purely to manage tokens using k5start was mooted (and may still happen) but the issue of PAGs complicates this – i.e. apache and k5start have to be started from within the same pagsh or httpd has no access to the tokens obtained.
  • We have no granularity – this approach allows us to use one AFS id for the entire website, but nothing finer.

So, we moved on to looking at mod_waklog to see if it offered a better solution, and it certainly seems to. We were very quickly able to provide a simple solution accessing files as the afsweb id using only apache directives. Having installed an RPM of mod_waklog and added the appropriate LoadModule directive, this is what we added to httpd.conf:


WaklogEnabled           On
WaklogDefaultPrincipal afsweb/ /etc/afsweb.keytab

This all worked perfectly, with apache able to serve files where ACLs
for permitted.

One thing worth noting here is that it’s necessary for the keytab to be readable by the user that apache runs as. We initially had the keytab permissions/ownership set to “600 root root”. This worked when apache started, but by the time it needed to renew tickets it had long since dropped root privileges and so couldn’t read the keytab. We changed to “600 apache apache”; it would be advisable to check the implications of this with respect to, e.g., user scripts.
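
A quick sanity check for the keytab ownership problem described above might look something like this. It is only a sketch: keytab_ok is a made-up helper name, and it relies on GNU stat’s -c option, so it would need adjusting on other platforms.

```shell
#!/bin/sh
# keytab_ok FILE USER
# Succeeds only if FILE is owned by USER and has mode 600.
# Uses GNU stat (-c); other platforms need a different invocation.
keytab_ok() {
    owner=$(stat -c '%U' "$1") || return 1
    mode=$(stat -c '%a' "$1") || return 1
    if [ "$owner" = "$2" ] && [ "$mode" = "600" ]; then
        return 0
    fi
    return 1
}

# Example (not run here):
#   keytab_ok /etc/afsweb.keytab apache || echo "check keytab ownership" >&2
```

Running a check like this before starting apache would catch the “600 root root” case immediately, rather than hours later when the tickets come up for renewal.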

Another thing that mod_waklog gives us that might be really useful is the ability to use different credentials for different locations using the WaklogLocationPrincipal directive. We need to investigate this further.

We can also use mod_waklog to authenticate on a user basis, by tying it in with cosign. Here’s an example segment of httpd.conf:

<Location /userweb>
    CosignProtected On
    AuthType Cosign
    Require valid-user

    CosignGetKerberosTickets On
    CosignKerberosSetupGss On
    WaklogUseUserTokens On
</Location>

This also seems to work smoothly – visitors to the /userweb location will be authenticated via cosign and corresponding AFS tokens obtained on their behalf. They will then be subjected to AFS ACL constraints as the user they are authenticated as – that user must have permission to access the file(s) being served, i.e. must have permission all the way from the root of the filesystem. For apache to obtain the user’s kerberos ticket from cosign, the cosign servers must be configured to allow the web server to request kerberos tickets, e.g. something like the following line in /etc/cosign.conf:

service T

It is important to note here that, due to the way in which apache
works, the principal specified as WaklogDefaultPrincipal will also
need rl ACLs on any directory from which files will be served and l ACLs on all parent directories. Other ACLs can be used however – what’s important is that the identity that apache is running as (in our example, afsweb/ can read the files being served. So, ACLs for system:authuser, for example, could also be used.


proxycaching project issues

These are the current main issues in the project for implementing a proxycaching OpenLDAP solution on our clients. The current focus is on increased testing, debugging crashes and contemplating 2.3 vs 2.4.

Testing and stability

We’re currently running proxycaching clients on 122 configured machines, split between student labs (63 hosts) and DICE SL5 machines (59 hosts) following the develop release.

The lab machines have had condor running on them for a short while now. This seems to have increased the instances of slapd crashing – this was a minor problem which I’d seen on a handful of previous occasions. So far there has been no obvious pattern or likely cause – slapd just ceases to exist with nothing interesting being logged (on a few occasions it has occurred when entries are removed from the cache, but this isn’t consistent). This needs to be investigated. Turning on debugging for the lcfg-openldap component will mean we get core files when slapd crashes, so should be considered.

2.3 vs 2.4

Most of our testing to date with proxycaching has been done with OpenLDAP 2.3, although I have done some testing with each new 2.4 release as it’s come out. The 2.4 branch is now the current recommended release. So, do we stay with 2.3 or move to 2.4? In time-honoured tradition, here’s a list of pros and cons.

2.3 (current: 2.3.43)

Pros:

  • We know it to be relatively stable and tested – both for proxycaching and general use – all our clients and servers are running 2.3

Cons:

  • No new features – bug fixes only (not sure for how long, but it’s clear the focus of the developers is almost entirely on 2.4)
  • We can’t specify template/attributes filters with “*”, as we can in 2.4 (see below).

2.4 (current: 2.4.13)

Pros:

  • All new openldap development is on 2.4
  • 2.4 supports “*” attribute lists – this is extremely useful – on 2.3 you can’t cache lookups which don’t specify any attributes (an implicit request for “*” attrs). For a recent week on the lab machines, 99% of queries which couldn’t be cached under 2.3 would be able to under 2.4 using “*” attrsets.

Cons:

  • Doubts over stability and testing of 2.4 proxycache code – are many other people using it? Up until 2.4.12 there was a bug in slapo-pcache which caused slapd to crash extremely frequently. Bugs like this have always been fixed promptly, but perhaps it would be wiser to wait until 2.4 matures further?
  • 2.4.12+ forces a change in the version of bdb used for the database backend. 2.4.12 requires bdb 4.4+ as a configure prerequisite. Testing indicates that 2.4.12+ with either 4.7.25 (+ 2 patches, one from sleepycat, one shipped with openldap) or 4.6.21 (+ 3 sleepycat patches) works OK so far. We’ve used 4.2.52 (+patches) for many years now so would have concerns over changing this without thorough testing.
  • ITS#5756 is a big stumbling block. This vastly reduces the effectiveness of proxycaching.

dns round-robin issues

We’d like to use the dir.inf dns round-robin as a simple means of load-balancing – currently this points to 4 LDAP servers, distributed across our server rooms. Unfortunately this won’t work because of the issues debated (at length) here. Essentially the problem is that getaddrinfo() sorts the addresses it gets from a DNS server according to RFC 3484. This doesn’t make sense for IPv4 (all our DICE machines were always getting the same server – the one which is furthest away).

This sorting behaviour can be overridden in /etc/gai.conf but before widely distributing a version of this file we would want to carefully consider any implications.


slapo-pcache and attribute lists

It’s worth noting this down as I don’t think it’s documented by openldap and every now and then it confuses me.

It’s easiest to illustrate with an example…

proxyattrset 0 uid gidNumber
proxyattrset 1 memberUid gidNumber
proxytemplate (&(objectClass=)(uid=)) 0 600 600
proxytemplate (&(objectClass=)(cn=)) 1 600 600

When setting up the templates and attrsets like this, you might expect all of the following queries to be cacheable:

ldapsearch -x "(&(objectClass=posixAccount)(uid=toby))" uid gidNumber
ldapsearch -x "(&(objectClass=posixAccount)(uid=toby))" uid
ldapsearch -x "(&(objectClass=posixAccount)(uid=toby))" gidNumber

ldapsearch -x "(&(objectClass=posixGroup)(cn=staff))" memberUid gidNumber
ldapsearch -x "(&(objectClass=posixGroup)(cn=staff))" memberUid
ldapsearch -x "(&(objectClass=posixGroup)(cn=staff))" gidNumber

But they’re not: all of them work except the last one. The pattern that seems to be followed here, and in other tests, is that if the list of attributes requested in the query is wholly contained in an attrset defined earlier than the one the template uses, then the query won’t be cached.

It seems there are two ways around this:

  1. If possible, make your template use an attrset which is defined earlier. This is what I did previously for a required attrset/template which only had one attr – gidNumber – it only worked if I made the template use a previous attrset which also contained gidNumber. The obvious drawback here is that you may be requesting/caching large numbers of attributes that you don’t want.
  2. Change the ordering of your attribute sets. This would seem to be the preferred solution, but won’t always be possible.
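
The pattern described above can be written down as a small model. This is only a sketch of the behaviour I observed, not of how slapo-pcache is actually implemented; the function names are made up, and attribute names are assumed to be plain words (no “*”).

```shell
#!/bin/sh
# is_subset "a b" "a b c": succeeds if every word of $1 appears in $2.
is_subset() {
    for want in $1; do
        found=no
        for have in $2; do
            if [ "$want" = "$have" ]; then
                found=yes
            fi
        done
        if [ "$found" != yes ]; then
            return 1
        fi
    done
    return 0
}

# cacheable INDEX "requested attrs" "attrset 0" "attrset 1" ...
# INDEX is the attrset number used by the matching proxytemplate.
cacheable() {
    index="$1"
    requested="$2"
    shift 2
    position=0
    for attrs in "$@"; do
        if [ "$position" -lt "$index" ]; then
            # An earlier attrset that wholly contains the request:
            # observed not to be cached.
            if is_subset "$requested" "$attrs"; then
                return 1
            fi
        elif [ "$position" -eq "$index" ]; then
            # The template's own attrset must cover the request.
            if is_subset "$requested" "$attrs"; then
                return 0
            fi
            return 1
        fi
        position=$((position + 1))
    done
    return 1
}

# Example (not run here), using the attrsets from the top of this post:
#   cacheable 1 "gidNumber" "uid gidNumber" "memberUid gidNumber"
```

With the attrsets from the example, this model reproduces the results for all six queries: only the final gidNumber-only group lookup fails. Swapping the order of the two attrsets in the model moves the failure to the gidNumber-only account lookup instead, which is why reordering won’t always be enough.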

Cosign/Apache interaction

For too long now, Neil, Roger and I have occasionally looked at the
way cosign behaves (or doesn’t) with apache, only to end up looking at
the same thing a few months later. I’ll attempt here to note down
specifically what the problems are.

Testing is being done on kant and melody. They’re configured slightly
differently, in that kant is configured solely through apacheconf
while melody uses the apache component, so provides its own

What we’re seeing is strange behaviour when attempting to use cosign
authentication when combined with host-based restrictions.

I’m doing my testing mainly using Safari on Leopard and Firefox on

I’ve set up an area on kant: /var/www/html/restricted with an htaccess

First, let’s test cosign auth:

CosignProtected On
AuthType Cosign
Require valid-user

… this works fine – I get redirected to weblogin and, following
authentication, can see the restricted page.

So, let’s test host-based access:

order deny,allow
deny from all
allow from

… works fine – connecting from – OK, connecting
from home – 403 forbidden.

Now, combining host-based restrictions and cosign auth is where it
starts to fall apart:

order deny,allow
deny from all
allow from

satisfy any

CosignProtected On
AuthType Cosign
Require valid-user

This configuration should allow you in from .inf OR if you are

Access from – works fine
cosign: no, gives an internal server error, with the following in

[Thu Oct 30 15:54:04 2008] [crit] [client] configuration error: couldn’t check user. No user file?: /restricted/

Same behaviour when cosign authenticating prior to visiting
restricted area. Note that it’s necessary to shift-reload when using
safari as it seems to cache successful visits.

Peculiarly, when testing with the exact same .htaccess file on melody,
you get a 401 error, _not_ an internal server error as on kant. TODO:
test this properly when in inf (don’t want to open holes in the
firewall to melody just now).
