Boot stalls on mountall "disk drive not ready" question in multipath-tools >= 0.4.9-3ubuntu7.5

Bug #1547206 reported by Tore Anderson on 2016-02-18
This bug affects 4 people
Affects: multipath-tools (Ubuntu)
Importance: High
Assigned to: Mathieu Trudel-Lapierre

Bug Description

System information: Cisco UCS B200M2 blade, fnic.ko HBA. The system boots from local storage, but mounts the following file system on an EMC VNX during bootup:

opt_vnx (3600601603a71320022967e0a1f38e411) dm-0 DGC,VRAID
size=50G features='1 queue_if_no_path' hwhandler='1 emc' wp=rw
|-+- policy='round-robin 0' prio=1 status=active
| |- 1:0:1:0 sdd 8:48 active ready running
| `- 0:0:0:0 sda 8:0 active ready running
`-+- policy='round-robin 0' prio=0 status=enabled
  |- 0:0:1:0 sdb 8:16 active ready running
  `- 1:0:0:0 sdc 8:32 active ready running

/etc/fstab contains:

/dev/mapper/opt_vnx /opt/vnx ext4 noatime 0 2

After upgrading the "multipath-tools" package to version 0.4.9-3ubuntu7.5 or higher, the system can no longer boot without manual intervention. Instead, the following question is asked (by mountall(8)) on the console:

The disk drive for /opt/vnx is not ready yet or not present.
keys:Continue to wait, or Press S to skip mounting or M for manual recovery

Waiting does nothing useful. Pressing "S" allows the boot to run to completion, and the "opt_vnx" device *is* present when logging in to the fully booted system. However, it seems that the device is only discovered *after* the mountall(8) question has appeared and been skipped, as this log line appears later in the boot process:

 * Discovering and coalescing multipaths... [ OK ]

Downgrading multipath-tools to version 0.4.9-3ubuntu7.4 resolves the problem and allows the system to boot normally without manual intervention. Note that the dependent "kpartx" package does *not* need to be downgraded as well; if it is left at version 0.4.9-3ubuntu7.9, the system still boots fine.

We experience this problem on several other systems apart from the one described in this bug report.

Tore Anderson (toreanderson) wrote :

After reviewing the changes between -3ubuntu7.4 and -3ubuntu7.5, I have found that the problem is caused by the removal of the file /lib/udev/rules.d/95-multipath.rules. In -3ubuntu7.4, it contained the following:

#
# udev rules for multipathing.
# The persistent symlinks are created with the kpartx rules
#

# socket for uevents
RUN+="socket:/org/kernel/dm/multipath_event"

# Coalesce multipath devices before multipathd is running (initramfs, early
# boot)
ACTION=="add|change", SUBSYSTEM=="block", RUN+="/sbin/multipath -v0 /dev/$name"

If this file is manually reinstated (either to /lib/udev/rules.d or /etc/udev/rules.d), boot is again successful (even with the latest -3ubuntu7.9 multipath-tools package).
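The manual reinstatement described above could be scripted roughly as follows. This is only a sketch: the function name install_multipath_rules and its optional target-directory argument are made up for illustration, and the rule contents are copied from the -3ubuntu7.4 file quoted above (minus the header comment).

```shell
#!/bin/sh
# Hypothetical helper: write the 95-multipath.rules file shipped in
# -3ubuntu7.4 into a udev rules directory (default: /etc/udev/rules.d).
install_multipath_rules() {
    rules_dir="${1:-/etc/udev/rules.d}"
    mkdir -p "$rules_dir" || return 1
    # The quoted heredoc delimiter keeps $name literal, so that udev
    # (not the shell) expands it when the rule fires.
    cat > "$rules_dir/95-multipath.rules" <<'EOF'
# socket for uevents
RUN+="socket:/org/kernel/dm/multipath_event"

# Coalesce multipath devices before multipathd is running (initramfs, early
# boot)
ACTION=="add|change", SUBSYSTEM=="block", RUN+="/sbin/multipath -v0 /dev/$name"
EOF
}
```

Run as root, "install_multipath_rules" with no argument writes to /etc/udev/rules.d. A reboot is then the simplest way to confirm that the boot completes unattended.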

I see from the changelog that the removal was intentional:

  * debian/rules: don't ship 95-multipath.rules udev rules anymore; they are
    not necessary with multipath-tools listening for udev events directly.
  * debian/multipath.udev: removed.

However, in order for this to actually work with multipath-backed "auto" filesystems in /etc/fstab, it seems necessary to ensure that multipathd is started before mountall runs. I am not entirely certain how to accomplish this, though, as mountall is started from Upstart while multipath-tools is stuck using a legacy SysV init script, and I do not know if it is possible to enforce service ordering between the two init systems. For what it's worth, renaming /etc/rcS.d/S21-multipath-tools-boot to /etc/rcS.d/S00-multipath-tools-boot does not work around the issue.

Launchpad Janitor (janitor) wrote :

Status changed to 'Confirmed' because the bug affects multiple users.

Changed in multipath-tools (Ubuntu):
status: New → Confirmed
Steve Baker (sbaker-gre) wrote :

I am experiencing the same issue. I tried to work around the boot problem by adding the nobootwait option in fstab, and the system is able to boot that way, but I have been seeing a plethora of intermittent failures during boot in other services that rely on the mounted filesystem.
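For anyone wanting to try the nobootwait option mentioned above, here is a rough sketch of adding it to one fstab entry. The helper name add_nobootwait is hypothetical; it only prints the modified table to standard output and never edits /etc/fstab in place. As noted above, this only stops the boot from blocking and does not make the device appear any earlier.

```shell
#!/bin/sh
# Hypothetical helper: read an fstab on stdin and print it with "nobootwait"
# appended to the options of the entry whose mount point is $1.
add_nobootwait() {
    awk -v mp="$1" '
        # Leave comments and blank lines untouched.
        /^[ \t]*(#|$)/ { print; next }
        # Append the option once; the matching line is re-joined with
        # single spaces between fields.
        $2 == mp && $4 !~ /(^|,)nobootwait(,|$)/ { $4 = $4 ",nobootwait" }
        { print }
    '
}
```

For example, "add_nobootwait /opt/vnx < /etc/fstab" prints the table with the /opt/vnx options changed from "noatime" to "noatime,nobootwait", which can be reviewed before replacing the real file.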

Please let me know if I can provide any additional details to help.

Robie Basak (racb) wrote :

15:54 <rbasak> cyphermox: bug 1547206 claims a regression in a multipath-tools SRU I think. Please could
               you take a look?

15:55 <cyphermox> rbasak: yeah, I've already been in contact with tore on that

tags: added: regression-update
Changed in multipath-tools (Ubuntu):
status: Confirmed → Triaged
importance: Undecided → High
Steve Baker (sbaker-gre) wrote :

To work around this for now, I installed the older version on a clean build of the server.

apt-get install multipath-tools=0.4.9-3ubuntu7
apt-mark hold multipath-tools

This seemed to behave better at first (it boots without intervention, for instance), but some services still do not seem to see the multipath device when they start up, even with a dependency on remote-filesystems in Upstart.
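One way a dependent service might guard against starting before the multipath device exists is to poll for it first. The sketch below is an assumption-laden illustration, not a fix for the ordering problem itself: the helper name wait_for_device and the 30-second default timeout are invented here.

```shell
#!/bin/sh
# Hypothetical helper: poll until a path exists, or give up after a timeout.
# Usage: wait_for_device PATH [TIMEOUT_SECONDS]
wait_for_device() {
    dev="$1"
    timeout="${2:-30}"
    waited=0
    while [ ! -e "$dev" ]; do
        if [ "$waited" -ge "$timeout" ]; then
            echo "timed out waiting for $dev" >&2
            return 1
        fi
        sleep 1
        waited=$((waited + 1))
    done
    return 0
}
```

A service start script could then run something like "wait_for_device /dev/mapper/opt_vnx 60 || exit 1" before touching the filesystem. Note that this only checks that the device node exists, not that /opt/vnx has actually been mounted.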

I'm also seeing a lot of these messages as the server is booting with the older package version:

# dmesg | grep "failed to execute"
[ 3.042817] systemd-udevd[1155]: failed to execute '/lib/udev/socket:/org/kernel/dm/multipath_event' 'socket:/org/kernel/dm/multipath_event': No such file or directory
[ 3.056408] systemd-udevd[1176]: failed to execute '/lib/udev/socket:/org/kernel/dm/multipath_event' 'socket:/org/kernel/dm/multipath_event': No such file or directory
[ 3.058264] systemd-udevd[1219]: failed to execute '/lib/udev/socket:/org/kernel/dm/multipath_event' 'socket:/org/kernel/dm/multipath_event': No such file or directory
[ 3.060754] systemd-udevd[1230]: failed to execute '/lib/udev/socket:/org/kernel/dm/multipath_event' 'socket:/org/kernel/dm/multipath_event': No such file or directory
[ 3.062581] systemd-udevd[1238]: failed to execute '/lib/udev/socket:/org/kernel/dm/multipath_event' 'socket:/org/kernel/dm/multipath_event': No such file or directory
[ 3.064116] systemd-udevd[1243]: failed to execute '/lib/udev/socket:/org/kernel/dm/multipath_event' 'socket:/org/kernel/dm/multipath_event': No such file or directory
[ 3.065754] systemd-udevd[1249]: failed to execute '/lib/udev/socket:/org/kernel/dm/multipath_event' 'socket:/org/kernel/dm/multipath_event': No such file or directory
[ 3.068872] systemd-udevd[1265]: failed to execute '/lib/udev/socket:/org/kernel/dm/multipath_event' 'socket:/org/kernel/dm/multipath_event': No such file or directory
[ 3.069552] systemd-udevd[1268]: failed to execute '/lib/udev/socket:/org/kernel/dm/multipath_event' 'socket:/org/kernel/dm/multipath_event': No such file or directory
[ 3.071924] systemd-udevd[1270]: failed to execute '/lib/udev/socket:/org/kernel/dm/multipath_event' 'socket:/org/kernel/dm/multipath_event': No such file or directory
[ 3.077997] systemd-udevd[1291]: failed to execute '/lib/udev/socket:/org/kernel/dm/multipath_event' 'socket:/org/kernel/dm/multipath_event': No such file or directory
[ 3.081841] systemd-udevd[1306]: failed to execute '/lib/udev/socket:/org/kernel/dm/multipath_event' 'socket:/org/kernel/dm/multipath_event': No such file or directory
[ 3.084469] systemd-udevd[1319]: failed to execute '/lib/udev/socket:/org/kernel/dm/multipath_event' 'socket:/org/kernel/dm/multipath_event': No such file or directory
[ 3.088084] systemd-udevd[1337]: failed to execute '/lib/udev/socket:/org/kernel/dm/multipath_event' 'socket:/org/kernel/dm/multipath_event': No such file or directory
[ 3.092014] systemd-udevd[1355]: failed to execute '/lib/udev/socket:/org/kernel/dm/multipath_event' 'socket:/org/kernel/dm/multipath_event': No such file or directory
[ 3.095213] systemd-udevd[1365]: failed to execute '/lib/udev/socket:/org/kernel/dm/multipat...
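Since these messages repeat many times, it may be easier to collapse them into one count per failing command. A small sketch follows; the function name summarize_udev_exec_failures is hypothetical, and it assumes the message format shown in the log above.

```shell
#!/bin/sh
# Hypothetical helper: read kernel log lines on stdin and print how many
# times each command appears in a "failed to execute" message.
summarize_udev_exec_failures() {
    grep "failed to execute" |
        # Keep only the first single-quoted string after the message text.
        sed -n "s/.*failed to execute '\([^']*\)'.*/\1/p" |
        sort | uniq -c
}
```

It can be fed from the kernel log, e.g. "dmesg | summarize_udev_exec_failures".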

Changed in multipath-tools (Ubuntu):
assignee: nobody → Mathieu Trudel-Lapierre (mathieu-tl)
Steve Baker (sbaker-gre) wrote :

The workaround of pinning the previous version of multipath-tools no longer works when building new machines. Is there any safe way to work around this issue?

Tore Anderson (toreanderson) wrote :

Steve, some suggestions for you to try:

1) Reinstate 95-multipath.rules to /{etc,lib}/udev/rules.d as described in comment #1
2) Install the package «multipath-tools-boot»

I'm also running into this with a ZFS array built on multipath devices: the mountall script runs before multipath can create its devices, so the array does not get mounted.
