multipathd fails to create mappings when multipath.conf is present

Bug #1178721 reported by Jernej Jakob
Affects: multipath-tools (Ubuntu)
Status: Invalid
Importance: Undecided
Assigned to: Unassigned

Bug Description

I have a SAN connected via a dual-port HBA that I'm trying to get to work with multipath.

I'm running 12.04.2, installed via netboot (clean install).
Setup options: partitioning = guided with LVM (onto the internal DRAC 6/i RAID); packages = Basic server, OpenSSH, Virtual machine host.

Out of the box things work okay:

root@PE2950:~# multipath -ll
3600c0ff000d5ae56aabb855101000000 dm-3 HP,MSA2012fc
size=1.4T features='1 queue_if_no_path' hwhandler='0' wp=rw
`-+- policy='round-robin 0' prio=1 status=active
  |- 3:0:0:1 sdb 8:16 active ready running
  `- 4:0:1:1 sdc 8:32 active ready running

mappings are created:

root@PE2950:~# ll /dev/mapper/
total 0
drwxr-xr-x 2 root root 140 maj 10 16:57 ./
drwxr-xr-x 16 root root 4460 maj 10 16:57 ../
brw-rw---- 1 root disk 252, 2 maj 10 16:57 360022190ba84f800113787c003d70474
lrwxrwxrwx 1 root root 7 maj 10 16:57 3600c0ff000d5ae56aabb855101000000 -> ../dm-3
crw------- 1 root root 10, 236 maj 9 23:13 control
lrwxrwxrwx 1 root root 7 maj 9 23:14 PE2950-root -> ../dm-1
lrwxrwxrwx 1 root root 7 maj 9 23:14 PE2950-swap -> ../dm-0

Then I created /etc/multipath.conf with the following contents (mostly copied from the running multipathd config):

multipath.conf:
------------------
defaults {
        verbosity 2
}

blacklist {
        devnode "^(ram|raw|loop|fd|md|dm-|sr|scd|st)[0-9]*"
        devnode "^hd[a-z]"
        devnode "^dcssblk[0-9]*"
        devnode "^cciss!c[0-9]d[0-9]*"
        device {
                vendor "DGC"
                product "LUNZ"
        }
        device {
                vendor "EMC"
                product "LUNZ"
        }
        device {
                vendor "IBM"
                product "S/390.*"
        }
        device {
                vendor "IBM"
                product "S/390.*"
        }
        device {
                vendor "STK"
                product "Universal Xport"
        }
        device {
                vendor "DELL"
                product "PERC.*"
        }
}

blacklist_exceptions {
}

devices {
        device {
                vendor "HP"
                product "MSA2[02]12fc|MSA2012i"
                path_grouping_policy multibus
                getuid_callout "/lib/udev/scsi_id --whitelisted --device=/dev/%n"
                path_selector round-robin 0
                path_checker tur
                checker tur
                features "0"
                hardware_handler "0"
                prio const
                failback immediate
                no_path_retry 18
                rr_min_io 100
        }
}

multipaths {
}

(The point is to turn off the 'queue_if_no_path' feature.)
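
(For reference, the effect of the config can usually be checked without rebooting; a rough sketch, assuming the map name from the listing above:)

# dry run: parse /etc/multipath.conf and print the maps that would be
# created, including their feature string, without changing anything
multipath -d -v2

# live device-mapper table of the existing map; "1 queue_if_no_path"
# in the feature field means queuing is still enabled
dmsetup table 3600c0ff000d5ae56aabb855101000000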

root@PE2950:~# ll /etc/multipath.conf
-rw-r--r-- 1 root root 0 maj 10 17:14 /etc/multipath.conf

Have to reboot:

root@PE2950:~$ reboot

Things get ugly - no more mappings:

root@PE2950:~# multipath -ll
root@PE2950:~#
root@PE2950:~# ls -la /dev/mapper/
total 0
drwxr-xr-x 2 root root 100 maj 10 17:23 .
drwxr-xr-x 16 root root 4420 maj 10 17:23 ..
crw------- 1 root root 10, 236 maj 10 17:22 control
lrwxrwxrwx 1 root root 7 maj 10 17:22 PE2950-root -> ../dm-1
lrwxrwxrwx 1 root root 7 maj 10 17:22 PE2950-swap -> ../dm-0

syslog:
--snip--
May 10 17:23:33 PE2950 udevd[380]: '/lib/udev/dmsetup_env 252 2'(err) 'Device does not exist.'
May 10 17:23:33 PE2950 udevd[380]: '/lib/udev/dmsetup_env 252 2'(err) 'Command failed'
May 10 17:23:33 PE2950 scsi_id[1156]: custom logging function 0x7f93cbe8d010 registered
May 10 17:23:33 PE2950 udevd[380]: '/lib/udev/dmsetup_env 252 2'(err) 'Device does not exist.'
May 10 17:23:33 PE2950 udevd[380]: '/lib/udev/dmsetup_env 252 2'(err) 'Command failed'
May 10 17:23:33 PE2950 udevd[380]: '/lib/udev/dmsetup_env 252 2'(out) 'DM_NAME='
May 10 17:23:33 PE2950 udevd[380]: '/lib/udev/dmsetup_env 252 2' [1154] exit with return code 0
--snip--
May 10 17:23:33 PE2950 kernel: [ 90.706237] device-mapper: multipath round-robin: version 1.0.0 loaded
May 10 17:23:33 PE2950 kernel: [ 90.706403] device-mapper: table: 252:2: multipath: not enough path parameters
May 10 17:23:33 PE2950 kernel: [ 90.706405] device-mapper: ioctl: error adding target to table
--snip--
May 10 17:23:33 PE2950 udevd[379]: no db file to read /run/udev/data/+bdi:252:2: No such file or directory
May 10 17:23:33 PE2950 udevd[378]: '/sbin/blkid -o udev -p /dev/.tmp-block-252:2'(err) 'error: /dev/.tmp-block-252:2: No such device or address'
--snip--
May 10 17:23:33 PE2950 udevd[378]: '/lib/udev/dmsetup_env 252 2'(err) 'Device does not exist.'
May 10 17:23:33 PE2950 udevd[378]: '/lib/udev/dmsetup_env 252 2'(err) 'Command failed'
May 10 17:23:33 PE2950 udevd[378]: '/lib/udev/dmsetup_env 252 2'(err) 'Device does not exist.'
May 10 17:23:33 PE2950 udevd[378]: '/lib/udev/dmsetup_env 252 2'(err) 'Command failed'
May 10 17:23:33 PE2950 udevd[378]: '/lib/udev/dmsetup_env 252 2'(out) 'DM_NAME='
May 10 17:23:33 PE2950 udevd[378]: '/lib/udev/dmsetup_env 252 2' [1173] exit with return code 0
--snip--
May 10 17:23:33 PE2950 udevd[1183]: starting '/sbin/dmsetup info -j 252 -m 2 -c --nameprefixes --noheadings --rows -o suspended'
May 10 17:23:33 PE2950 udevd[378]: '/sbin/dmsetup info -j 252 -m 2 -c --nameprefixes --noheadings --rows -o suspended'(err) 'Device does not exist.'
May 10 17:23:33 PE2950 udevd[378]: '/sbin/dmsetup info -j 252 -m 2 -c --nameprefixes --noheadings --rows -o suspended'(err) 'Command failed'
May 10 17:23:33 PE2950 udevd[378]: '/sbin/dmsetup info -j 252 -m 2 -c --nameprefixes --noheadings --rows -o suspended' [1183] exit with return code 1
--snip--
May 10 17:23:33 PE2950 kernel: [ 90.712666] device-mapper: table: 252:2: multipath: not enough path parameters
May 10 17:23:33 PE2950 kernel: [ 90.712668] device-mapper: ioctl: error adding target to table
May 10 17:23:33 PE2950 kernel: [ 90.713856] device-mapper: table: 252:2: multipath: not enough path parameters
May 10 17:23:33 PE2950 kernel: [ 90.713858] device-mapper: ioctl: error adding target to table
May 10 17:23:33 PE2950 udevd[379]: passed -1 bytes to socket monitor 0x7f133e032330
May 10 17:23:33 PE2950 kernel: [ 90.714250] device-mapper: table: 252:2: multipath: not enough path parameters
May 10 17:23:33 PE2950 kernel: [ 90.714252] device-mapper: ioctl: error adding target to table
May 10 17:23:33 PE2950 kernel: [ 90.717242] device-mapper: table: 252:2: multipath: not enough path parameters
May 10 17:23:33 PE2950 kernel: [ 90.717245] device-mapper: ioctl: error adding target to table
May 10 17:23:33 PE2950 kernel: [ 90.717840] device-mapper: table: 252:2: multipath: not enough path parameters
May 10 17:23:33 PE2950 kernel: [ 90.717842] device-mapper: ioctl: error adding target to table
May 10 17:23:33 PE2950 kernel: [ 90.747257] device-mapper: table: 252:2: multipath: not enough path parameters
May 10 17:23:33 PE2950 kernel: [ 90.747259] device-mapper: ioctl: error adding target to table

(the whole syslog is attached)

When things were working, I also tried creating a multipath storage pool in virt-manager pointing to /dev/mapper, and now libvirtd is crashing. (Should I file a separate bug report for this, or wait to see if this turns out to be a related problem?)

May 10 17:04:26 PE2950 kernel: [64283.865145] init: libvirt-bin main process (24372) killed by SEGV signal
May 10 17:04:26 PE2950 kernel: [64283.865186] init: libvirt-bin main process ended, respawning
May 10 17:04:26 PE2950 kernel: [64284.030613] init: libvirt-bin main process (24469) killed by SEGV signal
May 10 17:04:26 PE2950 kernel: [64284.030645] init: libvirt-bin main process ended, respawning
May 10 17:04:26 PE2950 kernel: [64284.189976] init: libvirt-bin main process (24515) killed by SEGV signal
May 10 17:04:26 PE2950 kernel: [64284.190002] init: libvirt-bin main process ended, respawning
May 10 17:04:26 PE2950 kernel: [64284.353151] init: libvirt-bin main process (24559) killed by SEGV signal
May 10 17:04:26 PE2950 kernel: [64284.353182] init: libvirt-bin main process ended, respawning
May 10 17:04:27 PE2950 kernel: [64284.520619] init: libvirt-bin main process (24613) killed by SEGV signal
May 10 17:04:27 PE2950 kernel: [64284.520645] init: libvirt-bin main process ended, respawning
May 10 17:04:27 PE2950 kernel: [64284.682288] init: libvirt-bin main process (24657) killed by SEGV signal
May 10 17:04:27 PE2950 kernel: [64284.682321] init: libvirt-bin main process ended, respawning
May 10 17:04:27 PE2950 kernel: [64284.844660] init: libvirt-bin main process (24701) killed by SEGV signal
May 10 17:04:27 PE2950 kernel: [64284.844690] init: libvirt-bin main process ended, respawning
May 10 17:04:27 PE2950 kernel: [64285.006189] init: libvirt-bin main process (24745) killed by SEGV signal
May 10 17:04:27 PE2950 kernel: [64285.006214] init: libvirt-bin main process ended, respawning
May 10 17:04:27 PE2950 kernel: [64285.195574] init: libvirt-bin main process (24789) killed by SEGV signal
May 10 17:04:27 PE2950 kernel: [64285.195605] init: libvirt-bin main process ended, respawning
May 10 17:04:27 PE2950 kernel: [64285.366351] init: libvirt-bin main process (24833) killed by SEGV signal
May 10 17:04:27 PE2950 kernel: [64285.366379] init: libvirt-bin main process ended, respawning
May 10 17:04:28 PE2950 kernel: [64285.528396] init: libvirt-bin main process (24892) killed by SEGV signal
May 10 17:04:28 PE2950 kernel: [64285.528426] init: libvirt-bin respawning too fast, stopped

I reproduced the bug with a virtual machine install without the SAN, so this is definitively a problem with multipath-tools or udev.

Steps to reproduce:
1. Create a VM with three disks: virtio for root, and two SCSI disks pointing to the same file for multipath (the same file is not really necessary, as they are detected as two separate disks anyway).
2. Install 12.04.2 amd64 (from PXE netboot); select guided partitioning with LVM; install the packages Basic Ubuntu server, OpenSSH, Virtual machine host.
3. After booting, create multipath.conf, then reboot.
4. The bug appears.

Peter Petrakis (peter-petrakis) wrote :

https://help.ubuntu.com/12.10/serverguide/multipath-devices.html

You cannot have a default LVM config co-exist with multipath. They will
race to grab the SD device for map creation, and whoever gets there first wins;
they both use device mapper. LVM is configured via udev, so it gets the
first pass, while multipath is run by a legacy rc script.

You must configure LVM to blacklist all SD devices.
Then, you must configure LVM to allow the disks necessary for the
root volume using the udev names, e.g. /dev/disk/by-id.
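
For example, something like this in the devices section of /etc/lvm/lvm.conf (UNTESTED; the by-id string below is just a placeholder for your actual root disk's ID):

# accept only the root disk via its persistent by-id name,
# reject plain /dev/sd* paths and everything else
filter = [ "a|^/dev/disk/by-id/scsi-YOURROOTDISKID.*|", "r|^/dev/sd.*|", "r/.*/" ]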

SD names are never deterministic; sorry, that's what udev is for.

When the multipath volume is not root, you can easily test the config
per the documentation.

Any time a change is made to either multipath.conf or lvm.conf, the initrd
must be updated; this is also covered in the documentation. Multipath usually runs last, so if
you didn't push the LVM blacklist into the ramdisk, by the time multipath runs there's
nothing left for it to attach to.
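
On Ubuntu, regenerating the initrd after such a change is typically just:

# rebuild the initramfs for the running kernel so the updated
# multipath.conf / lvm.conf are included at early boot
update-initramfs -u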

Your "virtual multipath" is bogus, there's nothing that will arbitrate write
ordering, the block names aren't deterministic, and they never can be because
they don't have make/model/serial# which udev uses to create unique names.
You could easily "multipath" your root disk + a MP leg and be left with
swiss cheese based on scan ordering or hotplug events.

If you remove LVM from the stack and follow the directions for properly
propagating multipath to the ramdisk, it will work. Start small, then add an
additional layer of complexity.

There is no tool anywhere that integrates the deployment of LVM + multipath;
you have to understand both and configure them by hand. It is challenging
to deploy an LVM config that co-exists with multipath. Good luck.

Changed in multipath-tools (Ubuntu):
status: New → Confirmed
Changed in multipath-tools (Ubuntu):
status: Confirmed → Triaged
Jernej Jakob (jjakob) wrote :

Thank you for your help. I wasn't aware of the limitation (incompatibility) caused by co-deploying LVM and multipath; my assumption was that multipath would be configured to take precedence over LVM, so that PVs could live on multipath block devices, since the other way around (multiple paths over LVM logical volumes) would not make sense. I would assume this is a pretty common scenario: for example, you may have one volume mapped from a SAN over FC on which the tools that LVM offers would be valuable.

It would seem that a simple check at boot time could handle this: check whether LVM and multipath are co-deployed, start multipath first, let it create its paths, and then add those to LVM's filter.

I understand, however, that this is by no means common usage and requires a sufficiently skilled administrator who is aware of the scenarios it can cause.

Regarding the multipath config itself, this is the default config, functionally the same as the one HP proposes for the MSA2012fc.
The virtual setup is of course bogus... not meant to be used, just to demonstrate the failure mode. The actual servers will have a dual-port FC HBA with each port going to a separate zone on a switch, which in turn is connected with 2 cables to each of the SAN's 2 controllers (4 in total), so there will be 2 paths for each volume. This is required for this SAN in order to have redundancy against controller failure.

I just tried this on the physical server, and it still doesn't work as it should. In lvm.conf I blacklisted all the sd* devices and allowed only /dev/disk/by-id. Or should even the IDs of the MP volumes be blacklisted, allowing only the single PV's ID?

Jernej Jakob (jjakob) wrote :

http://h20000.www2.hp.com/bizsupport/TechSupport/Document.jsp?lang=en&cc=us&taskId=115&prodSeriesId=3559651&prodTypeId=18964&objectID=c01476873
This is the config recommended by HP for our SAN (the 2000 series is further down that page); mine is functionally the same. There is no way to sense the path priorities for this SAN. Wouldn't that mean that if one path were lost, some of the I/O would get lost as well, or would the protocol sense that and retry? Just wondering.

Changed in multipath-tools (Ubuntu):
status: Triaged → Invalid
Peter Petrakis (peter-petrakis) wrote : Re: [Bug 1178721] Re: multipathd fails to create mappings when multipath.conf is present

At this point I would seriously consider obtaining commercial support. This
is clearly a support issue and not a bug, and it has been closed as invalid. LVM could
have easier-to-grok filtering, but it does work; you just have to get creative by
changing the scan directory.

For example: UNTESTED

scan = [ "/dev/mapper"]
filter = [ "a|^/dev/mapper/mpath.*|", "r/.*/" ]
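
And a quick (equally untested) way to confirm the filter took effect once lvm.conf is updated:

# list the device path LVM actually chose for each PV; with the filter
# above only /dev/mapper/* paths should show up
pvs -o pv_name,vg_name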

pvscan -vvv is your friend. Concerning your other SAN question, please
seek professional support or forums (Google) where SAN admins gather.
We can't know how to configure *every single SAN on the market*.

Again, LVM & multipath are distro-agnostic; *any other distro's doc
will do*.

e.g. https://access.redhat.com/site/documentation/en-US/Red_Hat_Enterprise_Linux/5/html/Logical_Volume_Manager_Administration/lvm_filters.html
