Consistent network names – virtual hardware

As described in an earlier post, we have recently enabled consistent network interface names under SL7 – for physical machines.

We have concluded that this is not practical for virtual hardware. The device names produced by the consistent network interface naming scheme depend on the underlying configuration of the virtual guests. As this configuration is not under our control, the device names would be unpredictable.

SL7 – multipath and LVM

We have been working on checking support for DM multipath and LVM under SL7.2. Our first attempts at doing this under SL7.1 failed miserably as a result of unpredictable FibreChannel problems – we’ve found, in the past, that support for FibreChannel in early dot releases of new RHEL/SL major releases is flaky.

First off, we discovered that our standard LCFG SL7 platform had dmraid (software RAID) enabled as standard. This was creating a lot of “noise” from the kernel at boot time, as the dmraid module attempts to scan every attached block device; this was particularly noticeable on a SAN-attached host with multiple routes to SAN volumes. We have disabled dmraid by default, and created a header file to pull it back in where required.

We next looked at DM multipath. Confusingly, although the version of DM-multipath has not changed between SL6 and SL7, various parameters have. A template for SL7 (actually, EL7) was created, and a couple of resources were added to the multipath component to support the new parameters that we need to tweak. With SL6 and earlier, we added manual configuration to support our IBM disk array. There is built-in support for this array in SL7, so our manual configuration can be removed. Note, however, that we have never added configuration for the DotHill arrays; it may be that we are using inappropriate values for these (e.g. no_path_retry=fail rather than queue) under SL6.
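
To illustrate the kind of per-array override mentioned above, here is a minimal sketch of a multipath.conf device section that sets no_path_retry to queue. The vendor and product strings are placeholders, not values taken from our arrays; they would need to be checked against what the array actually reports and against the built-in hardware table (which multipath -t should print). Whether queue is really the right value for the DotHill arrays is still an open question.

```
# /etc/multipath.conf (fragment) - illustrative only
devices {
    device {
        # Placeholder match strings: confirm against the real array
        vendor          "DotHill"
        product         ".*"
        # Queue I/O while all paths are down instead of failing it
        no_path_retry   queue
    }
}
```

With no_path_retry set to fail, I/O errors are passed up as soon as the last path is lost; with queue, I/O is held until a path returns, which can also leave processes hanging if the paths never come back.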

Then onto LVM. Some confusing behaviour turned out to be caused by the lvmetad daemon, which is now enabled by default on SL7. For some reason, on some system boots, pvscan was returning an unknown physical volume device for a volume group. Because the LVM component could not see the configured physical volume in the list, it re-added it to the volume group. This in turn created a new PV UUID, and you ended up with a volume group with an increasing number of missing physical volumes. The volume group would still work, however. It’s possible that running LVM in the initramfs would fix this, but disabling lvmetad also fixed it. It seems that the purpose of lvmetad is to reduce the time taken at system boot to scan block devices for LVM physical volumes. We don’t have so many physical devices that an LVM scan at boot time takes ages, so running lvmetad seems unnecessary. However, we may need to revisit this in the future.
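
For reference, a minimal sketch of how lvmetad is usually switched off on EL7 follows. This is the stock lvm.conf setting rather than a record of exactly what our LVM component writes, so treat it as an assumption about the mechanism.

```
# /etc/lvm/lvm.conf (fragment) - illustrative only
global {
    # 0 = LVM commands scan block devices directly
    #     instead of asking the lvmetad cache
    use_lvmetad = 0
}
```

On EL7 the daemon is normally started by the lvm2-lvmetad service and socket units, so these would typically need to be stopped and disabled as well for the change to take full effect.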

In SL6 and earlier, multipath and LVM configuration had to be loaded into the initrd. Both the multipath and lvm components would trigger a rebuild of the initrd (via the kernel component) whenever their configuration changed. It looks like this is not necessary for SL7, even where multipath-provisioned filesystems are mounted via /etc/fstab. We have done lots of testing, but it’s still possible that we’ve just been lucky with the timing. If we do need to load multipath and/or LVM configuration into the initrd, we will need to consider how best to do this. Under SL7, dracut will no longer automatically include LVM and multipath configuration in the initrd: modifications to the kernel, lvm and multipath components may all be required.
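
If it does turn out to be necessary, one possible approach is a dracut configuration drop-in along the following lines. This is only a sketch: the file name is hypothetical, we have not tested it, and it says nothing about how the LCFG components would manage the file or trigger the rebuild.

```
# /etc/dracut.conf.d/san-storage.conf (hypothetical file name)

# Make sure the multipath and lvm dracut modules are included
add_dracutmodules+=" multipath lvm "

# Copy the host's configuration files into the initramfs image
install_items+=" /etc/multipath.conf /etc/lvm/lvm.conf "

# The initramfs then needs regenerating, e.g. with: dracut -f
```

The initramfs would have to be rebuilt every time either configuration file changed, which is what the kernel component handled for us on SL6.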