Using OpenLDAP with slapo-pcache

For a while now I’ve been investigating the possibility of replacing our current OpenLDAP client setup with one based on OpenLDAP and slapo-pcache. In short, all of our managed Linux clients (DICE) run a full copy of our slapd database, replicated from the master using our own technology. This started to become quite unreliable a couple of years ago, largely, it seems, due to clients running out of memory and splatting slapd in the process. It’s worth noting that this problem has largely gone away, due, I think, to increased memory in our clients. Even so, the current situation is clearly sub-optimal – we would prefer not to run a full slapd on all machines, and not to have to maintain our own technology in order to do this.

What I’ve been investigating is replacing this with a system where we have a small number of slaves, replicating from the master using syncrepl with clients talking to these slaves, but using back-ldap and slapo-pcache locally so that they cache as much as possible of the data from the slaves. Connections to the remote server are authenticated via gssapi.
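As a concrete illustration, a client-side slapd.conf for this setup might look something like the sketch below. The suffix, hostnames, cache sizes and query template are illustrative assumptions rather than our actual values, and the directive names are those from the 2.4 slapo-pcache man page as I understand them:

```
# Proxy database: anything the cache can't answer is forwarded to a slave.
database        ldap
suffix          "dc=example,dc=org"
uri             "ldap://slave1.example.org ldap://slave2.example.org"
# (GSSAPI bind configuration for the proxied connections omitted here.)

# Cache search results locally in a bdb database.
# pcache args: <backend> <max_entries> <numattrsets> <entry_limit> <cc_period>
overlay         pcache
pcache          bdb 10000 1 1000 100
pcacheAttrset   0 *
pcacheTemplate  (uid=) 0 3600

# Configuration for the underlying cache database follows the pcache
# directives: where it lives on disk, plus an equality index on the
# attribute pcache uses to tag cached queries.
directory       /var/lib/ldap/pcache
index           pcacheQueryID eq
```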

I’m going to gather here the various strands of this work…

(slight delay…)

And here they are:

Testing rollout

The intention is to gradually introduce more and more clients into the testing pool.
Currently rolled out to:

  • student labs – currently implemented in 2 student labs, 63 machines in total
  • servers – 2 mirror servers
  • user machines – currently just used on 4 computing officer machines

In particular I want to see condor being used in the student labs, as its jobs can generate a hefty amount of ldap traffic – this isn’t quite ready for use locally yet. I’m also keen to test this on beowulf nodes, which aren’t quite ready either.

pcache stats

  • On lab machines the stats for queries answered from the cache are generally around 70–80%
  • CO machines seem to be around 50–60%
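One crude way to get a hit-rate figure like those above is to count the answerable/unanswerable query lines that slapo-pcache writes to the debug log. The exact log strings depend on version and loglevel, so treat them (and the made-up sample lines below, used to keep the example self-contained) as assumptions to check against your own slapd’s output; in real use you’d point the awk at the log file rather than a here-doc:

```shell
# Count pcache's "QUERY ANSWERABLE" (served from cache) versus
# "QUERY NOT ANSWERABLE" (forwarded to a slave) log lines.
awk '
  /QUERY NOT ANSWERABLE/ { miss++; next }
  /QUERY ANSWERABLE/     { hit++ }
  END { if (hit + miss) printf "cache hit rate: %.0f%%\n", 100 * hit / (hit + miss) }
' <<'EOF'
conn=1001 op=2 pcache: QUERY ANSWERABLE
conn=1002 op=4 pcache: QUERY NOT ANSWERABLE
conn=1003 op=1 pcache: QUERY ANSWERABLE
conn=1004 op=3 pcache: QUERY ANSWERABLE
EOF
# prints "cache hit rate: 75%"
```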
OpenLDAP version

There are two current versions of OpenLDAP – 2.3.43 and 2.4.12. All current development happens on the 2.4.x branch; 2.3.x is frozen, receiving only bug fixes, no new features. These are the current issues with respect to using pcache:

  • 2.4 can have ‘*’ as the attr list, which wasn’t available under 2.3. This means that a search like ldapsearch “uid=blah” can be cached on 2.4, but not under 2.3 (an empty attr list in ldapsearch means all attrs, hence ‘*’).
  • ITS #5756 will be a major issue if not fixed, as it means we can’t set pcache filters to effectively catch much of the traffic we see.
  • Although not strictly related to pcache, an issue with 2.4 is the bdb version – 2.4.12 won’t compile with bdb 4.2.52 and requires 4.4+. It seems to work OK for what I’ve tried with 4.6.21, but not at all with 4.7.25 (it fails at the initial slapadd, I think – not investigated in any detail).
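The attr-list difference in the first point above is easiest to see in the cache configuration itself. Under 2.4 a single catch-all attribute set does the job, whereas under 2.3 (which used the older proxy* directive names, as I recall them from its man page) every attribute you want cached has to be listed explicitly. The attribute names below are just an illustrative subset:

```
# 2.4: '*' caches "all attributes" searches, matching the empty attr list
# that a bare ldapsearch sends.
pcacheAttrset 0 *

# 2.3: no '*' support, so each attribute to be cached is listed explicitly
# (illustrative subset).
proxyattrset 0 uid cn gecos homeDirectory loginShell
```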
Impact on servers

  • Little impact on the servers so far (using 2 of our slaves), but this is only a smallish testing pool – no condor, no beowulf.
  • We would probably want to turn off logging on the servers.
Failover

Clients seem to fail over pretty well to the next server in the uri list, although I haven’t managed to create a situation where the remote server is up and running (i.e. answering queries) but broken in some more interesting way.
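For reference, that failover behaviour comes from back-ldap’s uri directive, whose servers are tried left to right (hostnames hypothetical, as elsewhere):

```
# If slave1.example.org is unreachable, back-ldap moves on to slave2.
uri "ldap://slave1.example.org ldap://slave2.example.org"
```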
