------- Comment From <email address hidden> 2016-02-26 13:27 EDT-------
This appears to be the rport reference problem in the lpfc driver,
which prevents the rport from being rediscovered when it comes back up;
it is resolved by this commit [1].
(Although the host numbers differ from those in the multipath -l output of the bug report, the timing of the devloss events and the path removal events matches precisely.)
[root@iltuc4-bf var_logs]# grep sdz syslog.1
<...>
Dec 2 03:16:28 ilp1fc85apA4 multipathd: uevent 'remove' from '/devices/pci0003:00/0003:00:0e.5/host6/rport-6:0-6/target6:0:4/6:0:4:0/block/sdz'
Dec 2 03:16:28 ilp1fc85apA4 multipathd: DEVNAME=/dev/sdz
Dec 2 03:16:28 ilp1fc85apA4 multipathd: DEVPATH=/devices/pci0003:00/0003:00:0e.5/host6/rport-6:0-6/target6:0:4/6:0:4:0/block/sdz
Dec 2 03:16:28 ilp1fc85apA4 multipathd: sdz: remove path (uevent)
Dec 2 03:16:28 ilp1fc85apA4 multipathd: sdz: path removed from map mpath9
[root@iltuc4-bf var_logs]# grep sdak syslog.1
<...>
Dec 2 03:16:28 ilp1fc85apA4 multipathd: uevent 'remove' from '/devices/pci0001:00/0001:00:07.1/host2/rport-2:0-7/target2:0:5/2:0:5:0/block/sdak'
Dec 2 03:16:28 ilp1fc85apA4 multipathd: DEVNAME=/dev/sdak
Dec 2 03:16:28 ilp1fc85apA4 multipathd: DEVPATH=/devices/pci0001:00/0001:00:07.1/host2/rport-2:0-7/target2:0:5/2:0:5:0/block/sdak
Dec 2 03:16:28 ilp1fc85apA4 multipathd: sdak: remove path (uevent)
Dec 2 03:16:29 ilp1fc85apA4 multipathd: sdak: path removed from map mpath4
[root@iltuc4-bf var_logs]# grep lpfc syslog.1
<...>
Dec 2 03:16:28 ilp1fc85apA4 kernel: [15294.574079] lpfc 0003:00:0e.4: 4:(0):0203 Devloss timeout on WWPN 50:05:07:68:02:20:ef:26 NPort x5e00a0 Data: x0 x8 x3
Dec 2 03:16:28 ilp1fc85apA4 kernel: [15294.580629] lpfc 0003:00:0e.5: 5:(0):0203 Devloss timeout on WWPN 50:05:07:68:02:40:ef:26 NPort x020040 Data: x0 x8 x3
Dec 2 03:16:28 ilp1fc85apA4 kernel: [15294.606688] lpfc 0001:00:07.1: 1:(0):0203 Devloss timeout on WWPN 50:05:07:68:02:40:ef:26 NPort x020040 Data: x0 x8 x3
Dec 2 03:16:29 ilp1fc85apA4 kernel: [15294.974597] lpfc 0001:00:07.0: 0:(0):0203 Devloss timeout on WWPN 50:05:07:68:02:30:ef:26 NPort x0b0000 Data: x0 x8 xa
[1] https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/drivers/scsi/lpfc?id=0290217ad830f2813bb9ed5f51af686c0c591f28
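For anyone who wants to repeat the timing check on their own syslog, here is a rough sketch of the correlation described above: it pairs each multipathd 'remove' uevent with an lpfc devloss timeout on the same PCI function within a short window. The names (correlate, DEVLOSS, REMOVE, SAMPLE) and the 1-second window are my own assumptions, and the regular expressions were written only against the lines quoted in this comment, so other log formats may need adjusting.

```python
import re

# Patterns written against the syslog lines quoted in this comment only.
DEVLOSS = re.compile(
    r"^(\w{3}\s+\d+) (\d\d:\d\d:\d\d) \S+ kernel: \[\s*[\d.]+\] "
    r"lpfc (\S+): .*Devloss timeout on WWPN (\S+)"
)
REMOVE = re.compile(
    r"^(\w{3}\s+\d+) (\d\d:\d\d:\d\d) \S+ multipathd: uevent 'remove' from "
    r"'/devices/pci[^/]+/([^/]+)/\S*/block/(\w+)'"
)


def seconds_of_day(hms):
    """Convert HH:MM:SS to seconds since midnight."""
    h, m, s = (int(x) for x in hms.split(":"))
    return h * 3600 + m * 60 + s


def correlate(lines, window=1):
    """Return (block device, PCI function, WWPN) for each path removal that
    has a devloss timeout on the same PCI function within `window` seconds."""
    devloss, removes = [], []
    for line in lines:
        m = DEVLOSS.match(line)
        if m:
            devloss.append((m[1], seconds_of_day(m[2]), m[3], m[4]))
            continue
        m = REMOVE.match(line)
        if m:
            removes.append((m[1], seconds_of_day(m[2]), m[3], m[4]))
    matches = []
    for day, t, pci, dev in removes:
        for d_day, d_t, d_pci, wwpn in devloss:
            if day == d_day and pci == d_pci and abs(t - d_t) <= window:
                matches.append((dev, pci, wwpn))
    return matches


# Lines copied from the excerpts above.
SAMPLE = [
    "Dec 2 03:16:28 ilp1fc85apA4 multipathd: uevent 'remove' from '/devices/pci0003:00/0003:00:0e.5/host6/rport-6:0-6/target6:0:4/6:0:4:0/block/sdz'",
    "Dec 2 03:16:28 ilp1fc85apA4 multipathd: uevent 'remove' from '/devices/pci0001:00/0001:00:07.1/host2/rport-2:0-7/target2:0:5/2:0:5:0/block/sdak'",
    "Dec 2 03:16:28 ilp1fc85apA4 kernel: [15294.574079] lpfc 0003:00:0e.4: 4:(0):0203 Devloss timeout on WWPN 50:05:07:68:02:20:ef:26 NPort x5e00a0 Data: x0 x8 x3",
    "Dec 2 03:16:28 ilp1fc85apA4 kernel: [15294.580629] lpfc 0003:00:0e.5: 5:(0):0203 Devloss timeout on WWPN 50:05:07:68:02:40:ef:26 NPort x020040 Data: x0 x8 x3",
    "Dec 2 03:16:28 ilp1fc85apA4 kernel: [15294.606688] lpfc 0001:00:07.1: 1:(0):0203 Devloss timeout on WWPN 50:05:07:68:02:40:ef:26 NPort x020040 Data: x0 x8 x3",
]

if __name__ == "__main__":
    for dev, pci, wwpn in correlate(SAMPLE):
        print(dev, pci, wwpn)
```

To run it over a whole file, pass the open file instead of SAMPLE, e.g. correlate(open("syslog.1")). Matching on the PCI function rather than the host number sidesteps the host-number mismatch mentioned above.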
------- Comment From <email address hidden> 2016-03-03 09:57 EDT-------
Hi Bill Gao,
(In reply to comment #10)
> (In reply to comment #9)
>
> > Is it possible to do a non-scheduled/manual test before that?
>
> Yes, it is.
Great.
I've uploaded a test kernel with 2 patches (comment #4 plus a dependency) to
http://ausgsa.ibm.com/~mauricfo/public/bugs/bz133798/v1/
Can you please test whether they resolve the problem?
If they don't, please attach /var/log/syslog and dmesg output.
Thanks!
------- Comment From <email address hidden> 2016-04-25 12:46 EDT-------
Please test with this kernel:
http://ausgsa.ibm.com/~mauricfo/public/bugs/bz133798/v1/
Thanks!
------- Comment From <email address hidden> 2016-05-09 05:21 EDT-------
Kernel updated; the SVC CCL case is in progress with 2 loops.
------- Comment From <email address hidden> 2016-05-10 21:14 EDT-------
Completed SVC CCL EI with 2 loops; did not hit the missing-path problem.
------- Comment From <email address hidden> 2016-05-11 07:37 EDT-------
Hi Canonical,
The 2 upstream commits that resolve this problem are:
0290217ad830f2813bb9ed5f51af686c0c591f28 lpfc: Correct loss of target discovery after cable swap.
be6bb94100dc6803a530e20aad05360e6267f56b lpfc: Fix premature release of rpi bit in bitmask
Please pull them into 14.04.x.
Thanks!