LVM, SL7 and physical volume device names

The use of traditional /dev/sd[xx] disk device names has become increasingly unsafe in recent Linux installs. This is particularly so in an environment with SAN devices. For a while now, we have been mounting disk partitions by UUID, but until recently the LVM component has continued to rely on /dev/sd[xx] names.

The LVM component, on configure(), checks whether any additional physical volumes have been configured for a volume group. It does this by using the ‘pvs’ command to enumerate the physical volumes that are associated with each volume group. The command vgextend (or vgcreate) is then used to add a physical volume to a volume group. Unfortunately, these commands store away the resulting /dev/sd[xx] names of the physical volumes – and not the /dev/disk/by-uuid names. This means that on subsequent reboots, there’s a high chance that the /dev/sd[xx] name recorded for a physical volume will be wrong.
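
As an illustration of the commands involved (a sketch only, with a hypothetical volume group and device name, not the component’s actual code):

# List which physical volumes currently belong to which volume group.
pvs --noheadings -o pv_name,vg_name

# Add another physical volume to an existing volume group.
# vgextend records the /dev/sd[xx] name it was given, which may change on reboot.
vgextend vg_scratch /dev/sdc1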

The solution is to generate a UUID for each physical volume (based on a hash of its physical path name), label the physical volume with that UUID (using pvcreate), and look for that UUID using ‘pvs -o pv_uuid,vg_name’ instead of looking for the /dev/sd[xx] device.
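
A rough sketch of that approach in shell (the by-path device, the hash-to-UUID helper and the volume group name are all illustrative; note that pvcreate expects a UUID in LVM’s own format and needs --norestorefile when a UUID is supplied by hand):

PV_PATH=/dev/disk/by-path/pci-0000:0b:00.0-scsi-0:0:1:0   # hypothetical stable path
PV_UUID=$(uuid_from_path "$PV_PATH")                      # hypothetical hash-to-UUID helper

# Label the physical volume with the generated UUID.
pvcreate --norestorefile --uuid "$PV_UUID" "$PV_PATH"

# Later, look for the UUID rather than a /dev/sd[xx] name before extending.
pvs --noheadings -o pv_uuid,vg_name | grep -q "$PV_UUID" || vgextend vg_scratch "$PV_PATH"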

Consistent network names – virtual hardware

As described in an earlier post, we have recently enabled consistent network interface names under SL7 – for physical machines.

We have concluded that this is not practical to do for virtual hardware. The device names presented by the consistent network interface naming scheme depend on the underlying configuration of the virtual guests. As this configuration is not under our control, the device names will be unpredictable.

SL7 – multipath and LVM

We have been working on checking support for DM multipath and LVM under SL7.2. Our first attempts at doing this under SL7.1 failed miserably as a result of unpredictable FibreChannel problems – we’ve found, in the past, that support for FibreChannel in early dot releases of new RHEL/SL major releases is flaky.

First off, we discovered that our standard LCFG SL7 platform had dmraid (software RAID) enabled as standard. This was creating a lot of “noise” from the kernel at boot time as the dmraid module attempted to scan every attached block device – this was particularly noticeable on a SAN-attached host with multiple routes to SAN volumes. We have disabled dmraid by default, and created a header file to pull it back in where required.
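
For reference, outside of LCFG the scanning can be checked for and suppressed roughly as follows (a sketch only, and not how our header does it):

# See whether dmraid finds any BIOS/fake RAID sets on the attached block devices.
dmraid -r

# rd.dm=0 is dracut’s switch for disabling DM RAID detection at boot; adding it
# to the kernel command line quietens the scan in the initramfs.
grubby --update-kernel=ALL --args="rd.dm=0"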

We next looked at DM multipath. Confusingly, although the version of DM multipath has not changed between SL6 and SL7, various parameters have. A template for SL7 (actually, EL7) was created, and a couple of multipath component resources added, to support the new parameters that we need to tweak. With SL6 and earlier, we added manual configuration to support our IBM disk array. There is built-in support for this array in SL7, so our manual configuration can be removed. Note, however, that we have never added configuration for the DotHill arrays – it may be that we are using inappropriate values for these (eg no_path_retry=fail rather than queue) under SL6.
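
For example, the built-in device table can be inspected, and a stanza added for an array which lacks an entry (the vendor/product strings and the setting below are placeholders, not tested values for the DotHill arrays):

# Dump the configuration multipathd would use, including built-in device entries.
multipath -t | less

# Hypothetical device stanza for an array with no built-in entry.
cat >> /etc/multipath.conf <<'EOF'
devices {
    device {
        vendor        "DotHill"
        product       ".*"
        no_path_retry queue    # queue I/O while all paths are down, rather than failing
    }
}
EOF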

Then onto LVM. Some confusing behaviour turned out to be caused by the daemon lvmetad, which is now enabled by default on SL7. For some reason, on some system boots, pvscan was returning an unknown physical volume device for a volume group – this made the LVM component re-add the configured physical volume to the volume group (because it couldn’t see it in the list). This in turn created a new PV UUID, and you ended up with a volume group with an increasing number of missing physical volumes. The volume group would still work, however. It’s possible that running LVM in the initramfs would fix this, but disabling lvmetad also fixed it. The purpose of lvmetad is to reduce the time taken at system boot to scan block devices for LVM physical volumes. We don’t have so many physical devices that an LVM scan at boot time takes ages, so running lvmetad seems unnecessary. However, we may need to revisit this in the future.
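
For reference, disabling lvmetad on an EL7 host comes down to something like the following (shown as manual steps; the LCFG lvm component may do it differently):

# Tell the LVM tools not to use the metadata daemon.
sed -i 's/use_lvmetad = 1/use_lvmetad = 0/' /etc/lvm/lvm.conf

# Stop the daemon and its socket, and prevent them from starting at boot.
systemctl stop lvm2-lvmetad.service lvm2-lvmetad.socket
systemctl disable lvm2-lvmetad.service lvm2-lvmetad.socket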

In SL6 and earlier, multipath and LVM configuration had to be loaded into the initrd. Both the multipath and lvm components would trigger a rebuild of the initrd (via the kernel component) whenever their configuration changed. It looks like this is not necessary for SL7, even where multipath-provisioned filesystems are mounted in /etc/fstab. We have done lots of testing, but it’s still possible that we’ve just been lucky with the timing. If we do need to load multipath and/or LVM configuration into the initrd, we will need to consider how best to do this. Under SL7, dracut no longer automatically includes LVM and multipath configuration in the initrd: modifications to the kernel, lvm and multipath components may all be required.
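
If it does turn out to be necessary, one way of forcing the relevant dracut modules (and their configuration) into the initramfs would be along these lines (the drop-in file name is arbitrary):

# Ask dracut to always include the multipath and lvm modules.
cat > /etc/dracut.conf.d/99-multipath-lvm.conf <<'EOF'
add_dracutmodules+=" multipath lvm "
EOF

# Rebuild the initramfs for the running kernel.
dracut --force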


Consistent network interface names and LCFG

As explained in an earlier post, we are moving to the more “modern” consistent network interface naming scheme because the old-style method of hard-wiring interfaces to interface names of the form eth0 no longer works with RHEL7. This is a problem for machines with multiple interfaces – eg servers.

(Note that you can stick with the legacy naming scheme by defining LCFG_NETWORK_LEGACY_NAMING at the head of a machine’s profile).

As a recap, under the consistent naming scheme, interfaces are known as :-

Device                          Name
On-board (embedded) interface   em[1234…]
PCI card interface              p<slot>p<port>
Virtual                         p<slot>p<port>_<virtualif>

For example, a minimally configured Dell R730 has four on-board interfaces : these would be called em1, em2, em3 and em4.

We could modify all our LCFG configuration to use the new names directly. However, many LCFG macros assume that network interfaces have names of the form eth[n], and changing these would be somewhat disruptive. We have decided to stick with the old form eth[n] in LCFG configuration, and to provide a means of associating these names with a real physical device. The macro call LCFG_NETWORK_SET_DEVICE(eth0,em1) will associate the LCFG name eth0 with the physical device em1. The hardware headers for each machine model will include calls to LCFG_NETWORK_SET_DEVICE for all the onboard network interfaces. This means that the process should be largely transparent on most machines.
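
For example, the hardware header for the Dell R730 mentioned above might contain something like the following (the exact eth[n]-to-em[n] mapping shown here is illustrative):

LCFG_NETWORK_SET_DEVICE(eth0,em1)
LCFG_NETWORK_SET_DEVICE(eth1,em2)
LCFG_NETWORK_SET_DEVICE(eth2,em3)
LCFG_NETWORK_SET_DEVICE(eth3,em4)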

Some machines have both embedded and PCI-E interfaces. For example, in Informatics, HP DL180s will commonly have two onboard interfaces and one PCI-E interface. We usually configure these machines such that a network bond is formed using one of the onboard interfaces (usually the second, as the first is used for IPMI) and the PCI-E interface. On DICE, the following will be configured by default for these machines :-

LCFG_NETWORK_SET_DEVICE(eth0,p1p1)
LCFG_NETWORK_SET_DEVICE(eth1,em2)

Note that the old form eth[n] is only used in LCFG configuration – all operating system tools (eg netstat) and configuration will expect names of the form em[n] or p<slot>p<port>.

We shall probably convert LCFG configuration to use the native network interface names throughout – possibly at the next major platform upgrade.