LCFG Client Refactor: New om interface

May 13, 2013

The LCFG client uses the om command line tool to call the configure method for an LCFG component when the resources change. Up until now this has been done using backticks which is not the best approach, particularly given that this involves building a command string and launching a full shell. I’ve now added a new Perl module to help with running om commands from perl. It’s in version 0.7.1 of lcfg-om, you can use it like this:

use LCFG::Om::Command;

my ( $status, $stdout, $stderr ) =
    LCFG::Om::Command::Run( "updaterpms", "run", "-Dv", "-t" );

The parameters are: component, method, ngeneric args, component args. You only need to specify the component and method names, the other two are optional. The argument options can either be simple strings or references to lists.

The status will be true/false to show the success of the command. You also get any output to stdout and stderr separately.

If you’re concerned that some method might not complete in a reasonable amount of time you can specify a timeout for the command:

my ( $status, $stdout, $stderr ) =
    LCFG::Om::Command::Run( "openafs", "restart", "-Dv", "", $timeout );

If the timeout is reached then the Run command dies, you need to use eval or a module like Try::Tiny to catch that exception.

Nicely this will also close file-descriptors 11 and 12 which are used internally by the LCFG ngeneric framework for logging. This will avoid daemons becoming asociated with those files when they are restarted (and consequently
tieing up the rdxprof process).

This is one of those nice situations where fixing a problem for one project has additional benefits for others. The trick here was in realising that the code should be added to the lcfg-om project rather than it just being in the LCFG client code base.


LCFG Client Refactor: File and Directory paths

May 9, 2013

The way in which the LCFG client handles paths to files and directories has never been pleasant. The old code contains a lot of hardwired paths inserted at package build-time by CMake using autoconf-style macros (e.g. @FOO@). This makes the code very inflexible, in particular, there is no way for a user to run the rdxprof script as a non-root user unless they are given write access to all directories and files in the /var/lcfg/conf/profile tree. There is no good reason to prevent running of rdxprof as a normal user, if they are authorized to access the XML profile then they should be allowed to run the script which parses the file and generates the DBM and RPM configuration files. They may not be able to run in full daemon mode and control the various components but one-shot mode certainly should be functional.

There are a couple of other things added into the mix which complicate matters further. Especially, there is some support for altering the root prefix for the file-system. This is used during install time where we are running from a file-system based in / as normal but installing to a file-system based in /root. I say some support since it seems that only certain essential code paths were modified.

I needed to come up with a universal solution for these two problems which could provide a fairly straightforward interface for locating files and directories. It had to neatly encapsulate the handling of any root prefix and allow non-root users to be able to store files. To this end I’ve introduced a new module, named LCFG::Client::FileLocator, which provides a class from which a locator object can be instantiated. There are instance attributes for the root prefix and the configuration data directory path (confdir) which can be set using rdxprof command line options. This object can be used to look up the correct path for any file which the LCFG client requires. There are basic methods for finding various standard LCFG paths and also useful higher-level methods for finding files for specific hosts or particular components. It’s got comprehensive documentation too so hopefully it will be a lot easier to understand in 10 years time than the previous code.

I’ve now completed stage 8 but will have to go back and finish stage 7 "Improve option handling", I would still like to try to add in configuration file handling. It’s a lot easier now that I’ve worked out the best way to deal with the various file paths. Having a single option for altering the configuration data directory was particularly useful.

So far I reckon I’ve spent just under 13 days of effort on the project. The allocation up to this point was 11 days (I have done the bulk of stage 7 though which takes it up to 12 days allocated). So, it’s still drifting away from the target a bit but not substantially.


LCFG Client Refactor: The joy of tests

May 9, 2013

I’m currently working on stage 8 of my project plan – "Eradicate hard wired paths". I’ll blog about the gory details later but for now I just wanted to show how the small number of tests I already have in place have proved to be very useful. As part of this work I have introduced a new module – LCFG::Client::FileLocator – which is nearly all new code. Having created this module I started porting over the rest of the client code to using it for file and directory path lookups. As I already had some tests I was able to gauge my progress by regularly running the test suite. As well as showing up the chunks of old client code which still needed to be ported it revealed bugs in 8 separate lines of code in the new FileLocator code. Finding these bugs didn’t require me to write a whole new set of tests for the new code (although that is on my todo list to ensure better coverage). For me that really shows the true value of writing some tests at the beginning of a refactoring process. It definitely produced higher quality code and the porting took much less time than it would have otherwise done.


LCFG Client Refactor: Logging

May 8, 2013

The next stage of untangling the LCFG client code was to improve the logging system. Up till now it has just been done using a set of subroutines which are all declared in the LCFG::Client module. Using the logging code in any other module then requires the loading of that module, this accounts for the bulk of all the inter-dependencies between the main LCFG::Client module and all the others. With a single purpose the logging code is an obvious target for separation into a distinct sub-system.

With the logging code I felt that the best approach was to convert it into an object-oriented style. The typical way that logging is done in various Perl logging modules (e.g. something like Log::Log4perl) is to have a singleton logging object which can be accessed anywhere in the code base. The advantage of this is that it is not necessary to pass around the logging object to every subroutine where it might be needed but we can still avoid creating a new object every time it is required. If the code base was fully object-oriented we might be better served having it as an instance attribute (this is what MooseX::Log::Log4perl provides) but we don’t have that option here. The logging object can be configured once and then used wherever necessary. For simplicity of porting, for now, I have made it a global variable in each Perl module, that’s not ideal but it’s a pragmatic decision to help with the speed of porting from the old procedural approach.

The new LCFG::Client::Log module does not have a new method. To make it clear that we are not creating a new object every time it instead has a GetLogger method. If no object has previously been instantiated then one is created, otherwise the previous object is returned. Again this can be done easily using the new state feature in Perl 5.10, like this:


sub GetLogger {
    my ($class) = @_;

    use feature 'state';

    state $self = bless {
        daemon_mode => 0,
        verbose     => 0,
        abort       => 0,
        debug_flags => {%debug_defaults},
        warn_flags  => {%warn_defaults},
    }, $class;

    return $self;
}

This new OO-style API neatly encapsulates all the logging behaviour we require. Previously a few variables in the LCFG::Client module had to be made universally accessible so that they could be queried. The new module provides accessor methods instead to completely hide the internals. This all helps to make it possible to simply extend or switch to a more standard framework at some point in the future if we so desire.


LCFG Client Refactor: context handling

May 3, 2013

Now that the basic tidying is complete the code is in a much better condition for safely making larger scale changes. One of the particular issues that need to be tackled is the coupling between modules. The current code is a tangled web with modules calling subroutines from each other. This makes the code harder to understand and much more fragile than we would like. There has been some attempt to separate functionality (e.g. Build, Daemon, Fetch) but it hasn’t entirely worked. For instance, in some cases subroutines are declared in one module but only used in another. The two main areas I wanted to concentrate on improving are context handling and logging.

Various client Perl modules have an interest in handling LCFG contexts and there is also a standalone script (setctx) used for setting context values. (For a good description of what contexts are and how they can be used see section 5.2.5 of the LCFG Guide). The context code was spread across rdxprof and LCFG::Client but in a previous round of tidying it was all merged into LCFG::Client. Ideally it should be kept in a separate module, this allows the creation of a standard API which improves code isolation by hiding implementation details. This in turn provides much greater flexibility for the code maintainer who can more easily make changes when desired.

The easiest part of this work was the shuffling of the context code into a new module (named LCFG::Client::Contexts). Once this was done I then took the chance to split down the code into lots of smaller chunks and remove duplication of functionality wherever possible (the code has gone from 4 big subroutines to 24 much smaller ones). This has resulted in a rather big increase in the amount of code (884 insertions versus 514 deletions) which is normally seen as a bad thing when refactoring but I felt in this case it was genuinely justified. Each chunk is now easier to understand, test and document – we now have a low-level API as well as the previous high-level functions. Also most of the subroutines are now short enough to view in a single screen of your favourite editor, that hugely helps the maintainer.

An immediate benefit of this refactoring work was seen when I came to look at the setctx script. There had been a substantial amount of duplication of code between this and the rdxprof script. As the context code was previously embedded in another script it was effectively totally inaccessible – the mere act of moving it into a separate module made it reusable. Breaking down the high-level subroutines into smaller chunks also made it much easier to call the code from setctx and remove further duplication. Overall setctx has dropped from 213 lines of code to 140 (including whitespace in both cases). Functionality which is implemented in scripts is very hard to unit test compared with that stored in separate modules. So it’s now much easier to work with the context code and know that setctx won’t suddenly break.


LCFG Client Refactor: Initial tidying done

April 25, 2013

A quick dash through the code in the LCFG::Client::Fetch module (which is relatively small and fairly straightforward) means that all the LCFG client code has been checked using perlcritic and improved where necessary/possible. This completes the work for stages 4 and 5 of the project plan.

Some of the work involved in this stage has been rather more complex than anticipated. Mostly that was not related directly to resolving issues highlighted by perlcritic. In the main it was because whilst investigating the issues raised I spotted other, big problems with sections of code that I felt needed to be resolved. Those could have been kept separate and done as an additional stage in the project plan but I thought it was better to just do them. In particular, I have made large improvements to the sending and receiving of UDP messages for notifications and acknowledgements. I’ve also improved the logic involved with handling the “secure mode” and split out lots of sections of code into smaller, more easily testable, chunks.

This takes the effort expended up to about 7 days. That’s about 1 day over what I had expected but that was accounted for in stages 1 and 2, the gap between predicted and actual effort requirements has not worsened.


LCFG Client Refactor: Storing state

April 24, 2013

Having spent a while looking at the LCFG client code now it is clear that much of it would benefit from being totally restructured as a set of Object-Oriented classes (probably Moose-based). Making such a big change is beyond the scope of this project but there is still a need to store state in a sensible fashion. Currently the code has a heavy dependence on global variables which are scoped at the module level. In many ways the modules are being used like singleton objects and most of the globals are not accessible from outside of the parent module so it’s not as bad as it could be. The biggest issue with these globals is initialisation, where multiple subroutines need to use a global they all dependent on one of them having initialised the variable first. We are thus in a situation where the order in which the subroutines are called is important. This is bad news for anyone wanting to be able to fully understand the code, it also makes it impossible to test each subroutine in an isolated fashion (i.e. given this input, do I get the right output).

With the move to SL6 we got an upgrade to perl to 5.10, this is still utterly ancient but it does provide a few new handy features. The one I’ve begun using a fair bit is the state function which is used similarly to my. The difference is that these variables will never be reinitialized when a scope is re-entered (whereas my would reinitialize the value every time). This makes it possible to write subroutines which act a bit like Object-Oriented accessors with the values being set to a sensible default value where necessary. I’ve used this to nicely handle the acknowledgement and notification port global variables. Here’s an example:

use feature 'state';

sub AckPort {
    my ($value) = @_;

    # Default: Either from /etc/services or hardwired backup value
    state $ack_port = getservbyname( 'lcfgack', 'udp' ) 
                         // $DEFAULT_PORT_ACK;

    # Allow user to override
    if ( defined $value ) {
        $ack_port = $value;
    }

    return $ack_port;
}

Note that the state feature needs to be specifically enabled to use this approach. On the first call to the AckPort function the $ack_port variable will be initialised. If the getservbyname function returns an undefined value (i.e. the named service was not found) then the default value will be used. If the caller specifies a value then that will override the port number. On subsequent calls the initialisation is not done and the current value will be returned. This provides a public API for getting and setting the port number with simple handling of the default value. There is no issue of needing to know in what sequence of subroutines this method will be called, all functionality is neatly encapsulated. The method is also easily testable. Overall an Object-Oriented approach would be better but this is a good halfway house.


LCFG Client Refactor: Sending acks

April 24, 2013

Part of the functionality in the LCFG::Client::Daemon code is to send acknowledgement messages to the LCFG servers whenever a new profile has been applied. The ack is sent via UDP using the SendAck method. The original code to do this took the traditional C-style approach:

  return ::TransientError("can't open UDP socket",$!)
    unless (socket(ACKSOCK,PF_INET,SOCK_DGRAM,$proto));

  return ::TransientError("can't bind UDP socket",$!)
    unless (bind(ACKSOCK,sockaddr_in(0,INADDR_ANY)));
  
  my $addr = inet_aton($name);
  return ::DataError("can't determine host address: $name") unless ($addr);
  
  my $res = send(ACKSOCK,$msg,0,sockaddr_in($aport,$addr));
  return ::TransientError("can't send notification: $name",$!)
    unless ($res == length($msg));

with a smattering of weirdness and unless thrown in for good measure. Things have moved on a bit since the days when this was the recommended approach. There is now a very handy suite of modules in the IO::Socket namespace which can handle the dirty work for us. The replacement code looks like this:

   my $port = AckPort();

    my $socket = IO::Socket::INET->new (
        PeerAddr   => $server,
        PeerPort   => $port,
        Proto      => 'udp',
    ) or return LCFG::Client::TransientError(
             "can't connect to $server:$port", $! );

    my $res = $socket->send($message);

    $socket->close();

    if ( $res != length $message ) {
        return LCFG::Client::TransientError(
                  "can't send notification: $server", $! );
    }

That is, without a doubt, much easier to read and maintain. We are now relying on someone else to do the socket handling but that’s fine as this is a core Perl module which should be totally reliable.


LCFG Client Refactor: Daemon state tables

April 17, 2013

Having finished the tidying of the LCFG::Client::Build module I have now moved onto LCFG::Client::Daemon. The first thing which caught my eye was the handling of the state tables. These state tables are used to control how the daemon handles notifications from the server, timeouts and signals from the LCFG client component. I pretty much totally rewrote the MakeTable function so that it processed the input text and built the data structures for the tables in a much cleaner and more comprehensible manner. As with previous changes, my first step was to write some tests which checked the old function then ran them again with the new code to ensure I had not altered the API. I also introduced a new NextStateInTable function which contained code which was previously duplicated inside NextState. Finally I introduced an InitStateTables function which is called from within ServerLoop which hides the initialisation of the global variables used to hold the state tables. This means we now have a much cleaner API for handling all the state transitions based around smaller, testable functions.


LCFG Client Refactor: tidying LCFG::Client::Build

April 17, 2013

The LCFG::Client::Build module is the largest part of the LCFG client code. It weighs in at 1800 lines which is nearly 50% of all the code in the project. It contains a lot of functionality related to processing the data from the XML profile into the format stored in the local DB file and triggering components to reconfigure as necessary. Improving this code was always going to be a big task but at least once this module is done the remainder will seem easy.

The main changes which stand out are, like with LCFG::Client, related to noticing repeated coding of the same functionality. The first larger change came from noticing that in many places the value of an attribute (for example, the LCFG resource value) are decoded using the HTML::Entities module but only for LCFG profile version 1.1 and newer. Now we probably haven’t supported anything older than this for a very long time but it occurred to me that rather than just drop the version check it would be better to completely enhance the attribute decoding. So, rather than have calls to HTML::Entities::decode all over the place we now pass the value through a new DecodeAttrValue function which in turn calls a new NeedsDecode function to check if decoding is required. These are both small easily testable functions so I added a few tests along the way. The big benefit here is that if we now ever need to change the encoding/decoding of values and increment the profile version we are already prepared for the necessary code modifications.

The second big change was to improve the code of the InstallDBM function. This had two copies of a complex regular expression used to parse a fully-qualified resource name (e.g. host.component.resource_name) so I moved this code into a new function named ParseResourceName. Again this is now easily reusable and testable whereas before it was buried in the midst of other complex code. This led to some other improvements in how the debugging was done, I noticed there were many calls to KeyType which was just returning a prettified name for the underlying attribute type indicators which are all single characters (in the set [#%=^]). Each debug statement was very similar but handled a slightly different case, these were all merged into a ResourceChangesDebug function. This new function massively improves code readability and also improves efficiency since it only actually does something when the "changes" debug option is enabled. By reworking the debugging it is now possible to use the KeyType function in a totally generic manner. Anything which needs to know about the type of the attribute can work with the named versions rather than the almost-meaningless single character indicators.

There is still a lot more to do on this module to really improve the code standards but much of that might well be beyond the scope of this initial code cleanup project. The XML profile parsing and the DB handling are particularly in need of attention.