Boot issues running multipath with a large number of paths

Bug #1467989 reported by Mauricio Faria de Oliveira on 2015-06-23
This bug affects 2 people
Affects                    Importance   Assigned to
multipath-tools (Ubuntu)   Medium       Unassigned
Trusty                     Undecided    Unassigned

Bug Description

Applicable to: Ubuntu Server 15.04, 14.10, and 14.04(.x) LTS

On systems running multipath with a very large number of paths,
Ubuntu Server might experience long delays and issues during boot
(for example, failure to mount the root filesystem or other filesystems).

The problem symptoms include:
- Boot messages about udev worker timeout events and killed processes.
- Hitting the initramfs shell (usually applicable to the root filesystem).
- Hitting the emergency mode shell (usually applicable to other filesystems).

The problem cause is:
- Path discovery time for a large number of individual paths exceeds the
  time limits defined by udev and/or systemd targets (if applicable).

The solution is expected to be available in Ubuntu Server 15.10 and later.

For Ubuntu Server 15.04 and earlier, no automatic fix is currently available
through package updates. To work around the problem manually, add the
following parameters to the boot/kernel command line:
  udev.children-max=<expected number of paths + overhead>
  udev.event-timeout=<path discovery time in seconds>

The parameter values vary according to the system, path discovery rate, and
other factors. It is suggested to repeatedly increase the parameter values
to determine the minimum values required to eliminate the problem symptoms.

For example, on a system with 816 paths:
  udev.children-max=900
  udev.event-timeout=300
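
As a rough sketch of the arithmetic above, the two parameters can be built
from a path count, an overhead, and a timeout. The function name and the
overhead of 84 are assumptions chosen only to reproduce the 816-path example;
suitable values still have to be found by trial on the actual system.

```shell
# Hypothetical helper (not part of multipath-tools or udev):
# print the two kernel parameters for a given system.
#   $1 = expected number of paths
#   $2 = overhead (extra udev workers on top of the path count)
#   $3 = path discovery time in seconds
udev_boot_params() {
    paths=$1
    overhead=$2
    timeout=$3
    printf 'udev.children-max=%d udev.event-timeout=%d\n' \
        "$((paths + overhead))" "$timeout"
}

# The 816-path example, assuming an overhead of 84 workers.
udev_boot_params 816 84 300
```

The printed line can be appended as-is to the kernel command line.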

The procedure can be accomplished in 2 phases:

1) Discover the appropriate values to boot the system:

   On the bootloader screen (GNU GRUB version <version>):
   a) Select the appropriate Ubuntu entry with the arrow keys.
   b) Press 'e' (edit).
   c) Scroll down with the arrow keys to the following line:
      " linux /boot/vmlinux-... root=UUID=... ro ..."
   d) Scroll to the end of the line with the arrow keys, and append
      the parameters with the respective values (without quotes):
      " udev.children-max=value1 udev.event-timeout=value2 "
   e) Press Ctrl-x or F10 to boot the entry.
   f) Repeat this process until the system boots correctly,
      increasing the parameter values.
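
Once an attempt boots, it can help to confirm that the kernel actually
received the values typed at the GRUB prompt. The sketch below reads them
back from /proc/cmdline; the function name is hypothetical, and it assumes a
POSIX shell and parameters written as name=value without embedded spaces.

```shell
# Hypothetical helper: print the value of kernel parameter $1
# found in the command line string $2 (nothing if it is absent).
cmdline_value() {
    # $2 is deliberately unquoted so that it splits into words.
    for word in $2; do
        case $word in
            "$1"=*) printf '%s\n' "${word#*=}" ;;
        esac
    done
}

# Show the values the running kernel was booted with, if available.
if [ -r /proc/cmdline ]; then
    cmdline_value udev.children-max "$(cat /proc/cmdline)"
    cmdline_value udev.event-timeout "$(cat /proc/cmdline)"
fi
```

An empty result means the parameter was not on the command line for the
current boot.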

2) Permanently store the parameters/values in the bootloader configuration:

   Once the system boots correctly:
   a) Log in.
   b) Edit the /etc/default/grub file; for example:
      $ sudo nano /etc/default/grub
      or
      $ sudo vi /etc/default/grub
   c) Insert the parameters in the GRUB_CMDLINE_LINUX line; for example:
      GRUB_CMDLINE_LINUX="udev.children-max=value1 udev.event-timeout=value2"
   d) Save the file, and exit.
   e) Update the bootloader configuration:
      $ sudo update-grub
   f) You can (optionally) reboot in order to test the new settings:
      $ sudo reboot
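
For scripted setups, the manual edit in step c) could be approximated with
sed. This is only a sketch: the function name is hypothetical, it assumes a
single double-quoted GRUB_CMDLINE_LINUX="..." line, and it assumes the
parameters contain no '|', '&', or backslash characters. It is demonstrated
on a temporary copy rather than on /etc/default/grub itself.

```shell
# Hypothetical helper: append the parameters in $2 to the
# GRUB_CMDLINE_LINUX="..." line of file $1, keeping a .bak backup.
# GRUB_CMDLINE_LINUX_DEFAULT is not touched, because the pattern
# requires '=' directly after GRUB_CMDLINE_LINUX.
append_grub_cmdline() {
    sed -i.bak \
        -e "s|^GRUB_CMDLINE_LINUX=\"\(.*\)\"|GRUB_CMDLINE_LINUX=\"\1 $2\"|" \
        -e 's|^GRUB_CMDLINE_LINUX=" |GRUB_CMDLINE_LINUX="|' \
        "$1"
}

# Try it on a scratch file that mimics /etc/default/grub.
tmp=$(mktemp)
printf 'GRUB_CMDLINE_LINUX_DEFAULT="quiet"\nGRUB_CMDLINE_LINUX=""\n' > "$tmp"
append_grub_cmdline "$tmp" "udev.children-max=900 udev.event-timeout=300"
cat "$tmp"
rm -f "$tmp" "$tmp.bak"
```

On a real system the same call would target /etc/default/grub with sudo,
followed by update-grub as in step e).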

Changed in multipath-tools (Ubuntu):
status: New → Confirmed
Changed in multipath-tools (Ubuntu):
importance: Undecided → Medium
Nish Aravamudan (nacc) wrote :

@mauricfo: is this fix upstream? Where, if so?

Changed in multipath-tools (Ubuntu):
status: Confirmed → Incomplete
Nish Aravamudan (nacc) wrote :

Also, I'm not sure what this bug is driving towards -- do you want these values to be derived automatically?

Mauricio Faria de Oliveira (mauricfo) wrote :

Hi Nish @nacc,

> @mauricfo: is this fix upstream? Where, if so?

iirc yes, the multipath-tools code upstream could handle path discovery in parallel, which would take less time, or more efficiently in some other way -- but it's been a long time since I looked at that, so I'm not quite sure about it.

> Also, I'm not sure what this bug is driving towards -- do you want these values to be derived automatically?

iirc this bug was reported for the sake of external documentation.
we can revisit this if you'd like to.

Thanks Mauricio for the answer.
Thanks for clarifying that it was meant as external documentation!
I don't think one wants to backport the newer concurrent code to Trusty; the change would be too unstable for an SRU, I'd assume. So for now I'll set the bug to "Opinion". That way it can serve as a searchable doc of this issue and its workaround, while making it clearer that it isn't really actionable to "solve" it.

As a suggestion, you might consider also creating an askubuntu.com entry for it as I have seen people heading there first for common "X doesn't work what could I do" cases.

Changed in multipath-tools (Ubuntu Trusty):
status: New → Opinion