multipathd drops paths of a temporarily lost device
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
Ubuntu on IBM z Systems | Fix Released | Undecided | Unassigned |
multipath-tools (Ubuntu) | Fix Released | High | Canonical Server |
Bug Description
== Comment: #0 - Thorsten Diehl <email address hidden> - 2016-02-01 08:57:28 ==
# uname -a
Linux s83lp31 4.4.0-1-generic #15-Ubuntu SMP Thu Jan 21 22:19:04 UTC 2016 s390x s390x s390x GNU/Linux
# dpkg -s multipath-
Version: 0.5.0-7ubuntu9
# cat /etc/multipath.conf
defaults {
default_
user_
path_
dev_loss_tmo 2147483647
fast_
}
blacklist {
devnode '*'
}
blacklist_
devnode "^sd[a-z]+"
}
-------
On a z Systems LPAR with a single LUN, 2 zfcp devices, 2 storage ports, and the following multipath topology:
mpatha (36005076304ffc
size=1.0G features='1 queue_if_no_path' hwhandler='0' wp=rw
`-+- policy='round-robin 0' prio=1 status=active
|- 0:0:0:1079001136 sda 8:0 active ready running
|- 0:0:1:1079001136 sdb 8:16 active ready running
|- 1:0:0:1079001136 sdc 8:32 active ready running
`- 1:0:1:1079001136 sdd 8:48 active ready running
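For anyone who wants to check path counts from a script while reproducing this, the listing above can be parsed roughly as follows. This is only a sketch: the regular expression is written against the four path lines shown in this report, the column names (dm, checker, online) are my own labels for the three state words, and the names parse_paths and paths_per_host are made up for illustration.

```python
import re
from collections import Counter

# Matches path lines such as "|- 0:0:0:1079001136 sda 8:0 active ready running".
# The group names are labels chosen for this sketch, not multipath-tools terms.
PATH_LINE = re.compile(
    r"^\s*[|`]-\s+(?P<hctl>\d+:\d+:\d+:\d+)\s+(?P<dev>\S+)\s+"
    r"(?P<majmin>\d+:\d+)\s+(?P<dm>\S+)\s+(?P<checker>\S+)\s+(?P<online>\S+)\s*$"
)

def parse_paths(topology):
    """Return one dict per path line found in the topology text."""
    paths = []
    for line in topology.splitlines():
        match = PATH_LINE.match(line)
        if match:
            paths.append(match.groupdict())
    return paths

def paths_per_host(paths):
    """Count paths per SCSI host number (first field of host:channel:target:lun)."""
    return Counter(p["hctl"].split(":")[0] for p in paths)

# Path group and path lines copied from the report; the map header is omitted.
SAMPLE = """\
`-+- policy='round-robin 0' prio=1 status=active
|- 0:0:0:1079001136 sda 8:0 active ready running
|- 0:0:1:1079001136 sdb 8:16 active ready running
|- 1:0:0:1079001136 sdc 8:32 active ready running
`- 1:0:1:1079001136 sdd 8:48 active ready running
"""

if __name__ == "__main__":
    for host, count in sorted(paths_per_host(parse_paths(SAMPLE)).items()):
        print("host", host, "paths", count)
```

With the healthy topology this reports two paths per SCSI host. After the failure described below, one host would be expected to disappear from the count entirely instead of showing its paths as failed.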
I observed the following:
When I deconfigure one of the two zfcp devices (e.g. via chchp -c 0, or directly on the HMC), multipathd removes the two paths that use this device from the path group after 10 seconds. When the zfcp device comes back, it runs through zfcp error recovery and is set up properly, and the mid-layer objects also look fine. However, multipathd does not add the paths to the path group again.
Expected behaviour: multipathd does not remove the paths from the topology list, but keeps them as "failed faulty offline" until the dev_loss_tmo timeout is reached (which is infinite here).
I have already discussed this with zfcp development, and this most likely looks like a problem with multipathd rather than with zfcp or the mid-layer.
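To make the expectation concrete, here is a toy model of it. It is not multipathd's actual logic, only a restatement of the sentence above in code. The function name expected_path_status and the "removed" label are made up for illustration, and treating 2147483647 as "never expire" is an assumption taken from the "infinite here" remark.

```python
# dev_loss_tmo value from the multipath.conf in this report.
DEV_LOSS_TMO = 2147483647
# Assumption for this sketch: this value means "never give up on the device".
INFINITE = 2147483647

def expected_path_status(seconds_offline, dev_loss_tmo=DEV_LOSS_TMO):
    """Status the reporter expects for a path after its device has been gone
    for seconds_offline seconds (0 means the device is present)."""
    if seconds_offline <= 0:
        return "active ready running"
    if dev_loss_tmo >= INFINITE or seconds_offline < dev_loss_tmo:
        # Path stays in the topology list and can be reinstated later.
        return "failed faulty offline"
    return "removed"

if __name__ == "__main__":
    # Observed: paths are dropped after 10 seconds.
    # Expected with the configuration above:
    print(expected_path_status(10))
```

Under this model the paths would still be listed 10 seconds after the device disappears, which is exactly where the observed behaviour differs.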
Easy to reproduce: you need two zfcp devices, one LUN, and preferably two ports on the storage server (WWPNs). Configure LUN via 2 zfcp devices * 2 WWPNs = 4 paths.
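The path arithmetic can be spelled out as below. The device and port names are placeholders invented for this sketch, not the real bus IDs or WWPNs of the test system.

```python
from itertools import product

# Placeholder names; substitute the real zfcp device bus IDs and storage WWPNs.
ZFCP_DEVICES = ["zfcp-device-A", "zfcp-device-B"]
STORAGE_WWPNS = ["wwpn-1", "wwpn-2"]

def enumerate_paths(devices, wwpns):
    """Every (zfcp device, storage port) pair is one path to the single LUN."""
    return list(product(devices, wwpns))

def surviving_paths(paths, lost_device):
    """Paths that remain usable while one zfcp device is deconfigured."""
    return [p for p in paths if p[0] != lost_device]

if __name__ == "__main__":
    all_paths = enumerate_paths(ZFCP_DEVICES, STORAGE_WWPNS)
    print(len(all_paths), "paths in total")
    print(len(surviving_paths(all_paths, "zfcp-device-A")), "paths left with one device offline")
```

Losing one zfcp device therefore takes away two of the four paths, which matches the two paths that multipathd removes in the description above.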
This can also be reproduced on a z/VM guest. Instead of configuring the CHPID off, just detach one zfcp device and re-attach it after 30 to 60 seconds. The same problem occurs.
tags: | added: architecture-s39064 bugnameltc-136376 severity-high targetmilestone-inin1604 |
Changed in ubuntu: | |
assignee: | nobody → Skipper Bug Screeners (skipper-screen-team) |
affects: | ubuntu → multipath-tools (Ubuntu) |
Changed in multipath-tools (Ubuntu): | |
assignee: | Skipper Bug Screeners (skipper-screen-team) → Dimitri John Ledkov (xnox) |
Changed in multipath-tools (Ubuntu): | |
assignee: | Dimitri John Ledkov (xnox) → Canonical Server Team (canonical-server) |
Changed in ubuntu-z-systems: | |
status: | New → Fix Released |
tags: | added: s390x |
Thank you for taking the time to report this bug and helping to make Ubuntu better. It seems that your bug report is not filed about a specific source package though, rather it is just filed against Ubuntu in general. It is important that bug reports be filed about source packages so that people interested in the package can find the bugs about it. You can find some hints about determining what package your bug might be about at https://wiki.ubuntu.com/Bugs/FindRightPackage. You might also ask for help in the #ubuntu-bugs irc channel on Freenode.
To change the source package that this bug is filed about visit https://bugs.launchpad.net/ubuntu/+bug/1540407/+editstatus and add the package name in the text box next to the word Package.
[This is an automated message. I apologize if it reached you inappropriately; please just reply to this message indicating so.]