proxycaching project issues

These are the main open issues in the project to implement a proxycaching OpenLDAP solution on our clients. The current focus is on wider testing, debugging crashes, and deciding between OpenLDAP 2.3 and 2.4.

Testing and stability

We’re currently running proxycaching clients on 122 configured machines, split between student labs (63 hosts) and DICE SL5 machines (59 hosts) following the develop release.

The lab machines have had Condor running on them for a short while now. This seems to have increased the number of slapd crashes, previously a minor problem which I’d seen only a handful of times. So far there is no obvious pattern or likely cause: slapd just ceases to exist with nothing interesting being logged (on a few occasions it has happened when entries were being removed from the cache, but this isn’t consistent). This needs to be investigated. Turning on debugging for the lcfg-openldap component means we get core files when slapd crashes, so it should be considered.

2.3 vs 2.4

Most of our testing to date with proxycaching has been done with OpenLDAP 2.3, although I have done some testing with each new 2.4 release as it’s come out. The 2.4 branch is now the current recommended release. So, do we stay with 2.3 or move to 2.4? In time-honoured tradition, here’s a list of pros and cons.

2.3 (current: 2.3.43)

Pros

  • We know it to be relatively stable and tested, both for proxycaching and for general use: all our clients and servers are running 2.3

Cons

  • No new features – bug fixes only (not sure for how long, but it’s clear the focus of the developers is almost entirely on 2.4)
  • We can’t specify template/attributes filters with “*”, as we can in 2.4 (see below).

2.4 (current: 2.4.13)

Pros

  • All new OpenLDAP development is on 2.4
  • 2.4 supports “*” attribute lists, which is extremely useful: on 2.3 you can’t cache lookups which don’t specify any attributes (an implicit request for “*” attrs). Over a recent week on the lab machines, 99% of the queries which couldn’t be cached under 2.3 could have been cached under 2.4 using “*” attrsets.
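
To make the "*" attrset point concrete, here is a minimal slapd.conf sketch of a 2.4 proxycache database, using the directive names from the 2.4 slapo-pcache(5) man page. It is not our actual configuration: the suffix, URI, directory, sizes, TTL and the choice of a (uid=) template are all placeholder assumptions for illustration.

```
# Hypothetical slapd.conf fragment for OpenLDAP 2.4; every value is a placeholder
database        ldap
suffix          "dc=example,dc=org"
uri             ldap://ldap.example.org/

overlay         pcache
# pcache <backend> <max_entries> <numattrsets> <entry_limit> <cc_period>
pcache          bdb 10000 1 100 300
# attribute set 0 is "*" (all user attributes), which 2.3 does not accept
pcacheAttrset   0 *
# cache searches of the form (uid=<value>) against attribute set 0 for 3600 seconds
pcacheTemplate  (uid=) 0 3600

# settings for the local database that holds the cached entries
directory       /var/lib/ldap/cache
cachesize       200
index           objectClass eq
```

With an attribute set like this, a search matching the template that names no attributes at all is answerable from the cache, which is the case described in the bullet above.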

Cons

  • Doubts over stability and testing of 2.4 proxycache code – are many other people using it? Up until 2.4.12 there was a bug in slapo-pcache which caused slapd to crash extremely frequently. Bugs like this have always been fixed promptly, but perhaps it would be wiser to wait until 2.4 matures further?
  • 2.4.12+ forces a change in the version of bdb used for the database backend: it requires bdb 4.4+ as a configure prerequisite. Testing indicates that 2.4.12+ works OK so far with either 4.7.25 (+ 2 patches, one from Sleepycat, one shipped with OpenLDAP) or 4.6.21 (+ 3 Sleepycat patches). We’ve used 4.2.52 (+ patches) for many years now, so we would have concerns over changing this without thorough testing.
  • ITS#5756 is a big stumbling block. This vastly reduces the effectiveness of proxycaching.

DNS round-robin issues

We’d like to use the dir.inf DNS round-robin as a simple means of load-balancing; it currently points to 4 LDAP servers distributed across our server rooms. Unfortunately this won’t work because of the issues debated (at length) here. Essentially, the problem is that getaddrinfo() sorts the addresses it gets from a DNS server according to RFC 3484, which discards the rotation that round-robin relies on. This sorting doesn’t make sense for IPv4: all our DICE machines were always getting the same server, the one which is furthest away.

This sorting behaviour can be overridden in /etc/gai.conf but before widely distributing a version of this file we would want to carefully consider any implications.
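
The effect of the RFC 3484 sorting described above can be illustrated with a small Python sketch. It models only Rule 9 of the RFC (prefer the destination sharing the longest address prefix with the source) and ignores every other rule, so it is a simplification rather than a copy of what getaddrinfo() does. The client and server addresses are made-up values from the documentation ranges, not our real hosts.

```python
import ipaddress


def common_prefix_len(a, b):
    """Return how many leading bits two IPv4 addresses have in common."""
    diff = int(ipaddress.IPv4Address(a)) ^ int(ipaddress.IPv4Address(b))
    return 32 - diff.bit_length()


def sort_by_longest_prefix(source, candidates):
    """Order candidates by shared prefix length with source, longest first.

    sorted() is stable, so candidates with equal prefix lengths keep the
    order in which the DNS server returned them.
    """
    return sorted(candidates, key=lambda c: common_prefix_len(source, c), reverse=True)


def rotations(items):
    """Every rotation of a list, mimicking successive round-robin answers."""
    return [items[i:] + items[:i] for i in range(len(items))]


# Hypothetical client and server addresses, for illustration only
client = "192.0.2.10"
servers = ["198.51.100.5", "192.0.2.200", "203.0.113.7", "192.0.2.20"]

# Collect the server that ends up first for each rotated DNS answer
first_choices = {sort_by_longest_prefix(client, r)[0] for r in rotations(servers)}
print(first_choices)  # {'192.0.2.20'}
```

Whichever way the DNS answer is rotated, the same server comes out on top, because the sort depends only on the bit patterns of the addresses and not on the order received or on how far away the server is.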
