ngeneric changes

Over the last month a lot of work has been done on the ngeneric (and LCFG::Component) framework, which is used by LCFG components, as part of porting LCFG to RHEL7. The changes are outlined here; more details will appear on the wiki in the near future. Some of them will not appear until version 1.11.2 of lcfg-ngeneric, which is in the stable release next Wednesday (2nd July).

Service Command

We now provide a convenient method, named Service, for calling actions on managed daemons (e.g. apache restart). This automatically uses the correct tool for the init daemon on the machine (e.g. upstart, systemd or SysV).

We recommend that all component authors convert their components to using this new method so that when they upgrade to EL7 it “just works”. Full details at the LCFG wiki.
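To see why a single wrapper is useful, here is a rough sketch of how the same action maps onto different commands depending on the init system. This is not the LCFG Service implementation: the function name service_command_for and its arguments are invented for this illustration, and it only prints the conventional command line for each init system rather than running anything.

```shell
#!/bin/bash
# Hypothetical illustration only: NOT the LCFG Service implementation.
# Print the command conventionally used by each init system
# to perform an action on a daemon.
service_command_for() {
    local init="$1" daemon="$2" action="$3"
    case "$init" in
        systemd) echo "systemctl $action $daemon.service" ;;
        upstart) echo "initctl $action $daemon" ;;
        sysv)    echo "service $daemon $action" ;;
        *)       echo "unsupported init system: $init" >&2; return 1 ;;
    esac
}

service_command_for systemd apache restart
service_command_for upstart apache restart
service_command_for sysv apache restart
```

A component that calls one wrapper instead of hardwiring any of these commands does not need to change when the machine moves from one init system to another.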

IsStarted Component Method

All components have gained a new method named IsStarted which can be called like “om foo isstarted”. This will return 0 (zero) when the component is running and 1 (one) when it is stopped.
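Because the result is reported through the exit status, it is easy to use from scripts. A minimal sketch, based on the “om foo isstarted” call above: the helper name component_state is made up for this example, “foo” is a placeholder component, and the om call is left as a comment since it needs a machine with LCFG installed.

```shell
#!/bin/bash
# Translate an isstarted exit status into a word.
# component_state is a hypothetical helper, not part of ngeneric.
component_state() {
    case "$1" in
        0) echo "running" ;;
        1) echo "stopped" ;;
        *) echo "unknown" ;;
    esac
}

# Typical use on an LCFG-managed machine ("foo" is a placeholder):
#   om foo isstarted
#   component_state $?
```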

Methods can specify exit code

We now allow authors to return an integer value from the method function (e.g. Method_Configure, Method_Install). If the end of the method is reached without errors, that value will be set as the exit status.

Previously there was no control over the exit code of a component method: if you reached the end it was zero, and if you failed it was non-zero.

Along with this new behaviour we now have a consistent exit status of 255 if a component fails or records an error (via Fail() or Error()). Note that the Fail method aborts immediately and is rarely a good idea since it can leave a component in a bad state.
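As a sketch of what a method with an explicit exit status might look like in ngeneric shell code: the method name Method_Check, the argument it takes and the value 2 are all invented for this example, and it assumes the framework uses the function's return value as described above.

```shell
#!/bin/bash
# Sketch of a component method that chooses its own exit status.
# Method_Check and its file argument are hypothetical.
Method_Check() {
    local conffile="$1"

    if [ ! -r "$conffile" ]; then
        # Report a specific non-zero status and return normally,
        # rather than aborting with Fail().
        return 2
    fi

    # Reaching the end without errors: this value becomes the exit status.
    return 0
}
```

Every path through the function ends in an explicit return, which is the kind of thing worth checking when auditing existing methods.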

We have seen a few side-effects from this change, particularly during the install process. We believe we have solved all the issues with the “core” components. We recommend that component authors who have added additional methods audit their code and check that they always return sensible values.

Change in location of run, status and lock files

When a component is started there exists a “run file” which is used to indicate that it is “running” and a “status file” which records the state of the resources after the last successful call of the configure method. When a component method is actively doing work there is also usually a lock file to avoid concurrency issues.

These files have always been stored in directories within /var/lcfg and are all deleted by the lcfginit script before the boot process begins. This is rather non-standard behaviour and is awkward to handle when using systemd. To improve the situation on EL7 we have moved the files to directories under /run. This is a tmpfs filesystem; it is now the standard location for these types of files and it is guaranteed to be empty at the start of each boot process.

This change has led to a big change in how we handle the LCFGSTATUS, LCFGRUN and LCFGLOCK paths. These were previously hardwired into the code when the software package was built. We now look up the paths dynamically at runtime using LCFG SysInfo.

Component authors rarely need to know the locations of these paths but, in case it is necessary, there are now convenience functions named RunFile and StatusFile (see also HasRunFile and HasStatusFile). Locking is now handled entirely by LCFG::Lock (or lcfglock); see that code for details if necessary.

To make life easier a new option was added to the qxprof command line tool. It is no longer necessary to specify the full path to the status file for a component with ‘-r’. You can now just specify the component name with ‘-s’ and qxprof will work out the correct location.

If you need to look up standard LCFG paths and SysInfo is not available (e.g. early in the install process) you may be able to use the LCFG::Client::FileLocator module instead. The values of SysInfo resources will always reflect the values returned by methods in that module. This allows component authors to avoid boot-strapping problems.

Minor Bug Fixes

A number of minor bugs were discovered in the locking code which have been resolved. It’s unlikely that these were causing anyone major issues.

A number of problems were found which were due to the reuse of the same variable name in different methods in the ngeneric shell code. This has been resolved with the liberal application of the ‘local’ builtin, which keeps each such variable scoped to its own function.
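The underlying problem is easy to reproduce in any bash script. The function and variable names below are made up for the demonstration and are not taken from the ngeneric code:

```shell
#!/bin/bash
# Without 'local', a variable assigned inside a function is global.
leaky()  { tmp="set by leaky"; }
scoped() { local tmp="set by scoped"; }

tmp="original"

scoped
after_scoped="$tmp"   # still "original": the assignment stayed inside scoped

leaky
after_leaky="$tmp"    # now "set by leaky": the caller's value was overwritten

echo "after scoped: $after_scoped"
echo "after leaky:  $after_leaky"
```

When two methods both use a name like tmp and one calls the other, the unscoped version silently changes the caller's value.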
