find_multipath_device doesn't raise an exception when executing "multipath -l" fails

Bug #1519363 reported by Felix Ma
This bug affects 2 people
Affects   Status        Importance  Assigned to  Milestone
os-brick  Fix Released  Undecided   zhangsong

Bug Description

I'm using the multipath feature.
During volume disconnection, remove_multipath_device calls self.find_multipath_device(device), which fails to execute "multipath -l". At the end of the operation, all devices end up in the "failed faulty offline" state:

----
[root@compute-4 ~]# ls -l /dev/disk/by-path/ | grep 3b5a9cc3-28a8-479d-b28b-759097a05f30
lrwxrwxrwx. 1 root root 9 Nov 22 22:12 ip-10.254.4.22:3260-iscsi-iqn.2010-10.org.openstack:volume-3b5a9cc3-28a8-479d-b28b-759097a05f30-lun-1 -> ../../sdc
lrwxrwxrwx. 1 root root 9 Nov 22 22:12 ip-10.254.4.25:3260-iscsi-iqn.2010-10.org.openstack:volume-3b5a9cc3-28a8-479d-b28b-759097a05f30-lun-1 -> ../../sdh
[root@compute-4 ~]# multipath -l /dev/mapper/mpathd
mpathd (3603b5a9cc328a8479db28b759097a05f) dm-4 IET ,VIRTUAL-DISK
size=1.0G features='0' hwhandler='0' wp=rw
|-+- policy='service-time 0' prio=0 status=enabled
| `- 33:0:0:1 sdc 8:32 failed faulty offline
`-+- policy='service-time 0' prio=0 status=enabled
  `- 34:0:0:1 sdh 8:112 failed faulty offline
----

I think an exception should be raised instead of only logging a warning:
---
    def find_multipath_device(self, device):
        """Discover multipath devices for a mpath device.

           This uses the slow multipath -l command to find a
           multipath device description, then screen scrapes
           the output to discover the multipath device name
           and it's devices.

        """

        mdev = None
        devices = []
        out = None
        try:
            (out, _err) = self._execute('multipath', '-l', device,
                                        run_as_root=True,
                                        root_helper=self._root_helper)
        except putils.ProcessExecutionError as exc:
            LOG.warning(_LW("multipath call failed exit %(code)s"),  # <=== raise an exception here, not just log a warning
                        {'code': exc.exit_code})
            return None
---
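
A minimal sketch of the behavior requested above, not the actual os-brick change. It assumes Python's standard subprocess module in place of self._execute and putils, and the names CommandExecutionFailed and find_multipath_output are hypothetical, chosen only for illustration:

```python
import subprocess


class CommandExecutionFailed(Exception):
    """Hypothetical exception type, used here only for illustration."""


def find_multipath_output(device, execute=subprocess.run):
    """Return the raw `multipath -l <device>` output, or raise on failure.

    `execute` is injectable (an assumption of this sketch) so the error
    path can be exercised without multipath being installed.
    """
    cmd = ["multipath", "-l", device]
    try:
        result = execute(cmd, capture_output=True, text=True, check=True)
    except (subprocess.CalledProcessError, OSError) as exc:
        # Surface the failure to the caller instead of logging a warning
        # and returning None, so the disconnect path cannot silently
        # continue as if no multipath device existed.
        raise CommandExecutionFailed(
            "command failed: %s (%s)" % (" ".join(cmd), exc)) from exc
    return result.stdout
```

With this shape, a caller such as remove_multipath_device would see the failure as an exception rather than as an indistinguishable None.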

Tags: multipath
zhangsong (zhangsong)
Changed in os-brick:
assignee: nobody → zhangsong (zhangsong)
Walt Boring (walter-boring) wrote :

The only time I can see calling multipath -l <device> fail is if multipath doesn't exist or isn't installed on the system, or there was a putils problem.

You can run multipath -l <device that doesn't exist> and you won't get an error from multipath. It simply returns.

Can you provide an example where you see this failure? Your example above worked just fine.

root@host:/mnt# multipath -l /this/does/not/exist
root@host:/mnt#

Changed in os-brick:
status: New → Incomplete
Felix Ma (felix23ma) wrote :

Hi Walt,

Here is an example: run 'multipath -l' as a non-root user who isn't in sudoers:
---
[root@compute-2 ~]# multipath -l /dev/abcdefg
[root@compute-2 ~]# su - felix
Last login: Wed Nov 25 08:44:46 CST 2015 on pts/1
[felix@compute-2 ~]$ multipath -l /dev/abcdefg
need to be root
[felix@compute-2 ~]$ echo $?
1
[felix@compute-2 ~]$ sudo multipath -l /dev/abcdefg
[sudo] password for felix:
Sorry, try again.
[sudo] password for felix:
Sorry, try again.
[sudo] password for felix:
Sorry, try again.
sudo: 3 incorrect password attempts
[felix@compute-2 ~]$ echo $?
1
---
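
The two cases discussed here (exit status 0 with no output for an unknown device, versus a non-zero exit status when the command itself cannot run properly) can be told apart programmatically. The following is a rough sketch using only Python's standard subprocess module; the function name probe and its return labels are hypothetical and not part of os-brick:

```python
import subprocess


def probe(cmd):
    """Run cmd and classify the outcome by exit status and output."""
    try:
        result = subprocess.run(cmd, capture_output=True, text=True)
    except OSError:
        # The binary could not be started at all, e.g. it is not installed.
        return "not-runnable"
    if result.returncode != 0:
        # The command ran but reported an error, e.g. "need to be root".
        return "failed"
    if not result.stdout.strip():
        # The command succeeded and simply printed nothing.
        return "no-device"
    return "found"
```

For example, probe(["multipath", "-l", "/dev/abcdefg"]) would be expected to give "no-device" as root and "failed" for the unprivileged user in the session above, assuming multipath behaves as shown there.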

I'm working with zhangsong. We hit the same problem, which is described in https://bugs.launchpad.net/os-brick/+bug/1519405
You may take a look at his fix: https://review.openstack.org/#/c/249308/

Thanks for your time.

Changed in os-brick:
status: Incomplete → In Progress
Lisa Li (lisali) wrote :

In fact, this command is run as root. So the case you gave is not valid.

https://github.com/openstack/os-brick/blob/master/os_brick/initiator/linuxscsi.py#L263

Felix Ma (felix23ma) wrote :

If cinder isn't in sudoers, or sudoers is configured incorrectly, you'll see the error.

OpenStack Infra (hudson-openstack) wrote : Fix merged to os-brick (master)

Reviewed: https://review.openstack.org/249231
Committed: https://git.openstack.org/cgit/openstack/os-brick/commit/?id=2b051f7fa2af6fc5c62440b762c26664fcfed687
Submitter: Jenkins
Branch: master

commit 2b051f7fa2af6fc5c62440b762c26664fcfed687
Author: felix23ma <email address hidden>
Date: Tue Nov 24 22:08:11 2015 +0800

    Raise exception in find_multipath_device

    When failed to execute "multipath -l" in find_multipath_device,
    raise an exception instead of printing warning only.

    Change-Id: I593b49d1637c7077e51a2db343e5b1eec3053536
    Closes-Bug: #1519363

Changed in os-brick:
status: In Progress → Fix Released
Matt Riedemann (mriedem)
tags: added: multipath
Doug Hellmann (doug-hellmann) wrote : Fix included in openstack/os-brick 1.0.0

This issue was fixed in the openstack/os-brick 1.0.0 release.
