Comment 11 for bug 1874270

Dan Streetman (ddstreet) wrote :

It looks like the service is failing because your controller is in the middle of a reset, which appears to take several minutes. I'm not sure how the nvme-cli tools are designed to handle such a long reset, but my first guess would be to increase the kernel rport timeout, which, judging from the log output, appears to be around 30 seconds. For your hardware, it seems that timeout should be more than 180 seconds; in the log, the controller only comes back online more than four minutes after connectivity is lost.
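To see what timeout is currently in effect, one option is to read the dev_loss_tmo attribute of each FC remote port. The sketch below is only illustrative: it assumes the standard FC transport sysfs layout under /sys/class/fc_remote_ports, and the helper name read_dev_loss_tmo is made up for this example. Whether raising that value is the right fix for nvme-fc on this hardware is still a guess.

```python
from pathlib import Path


def read_dev_loss_tmo(sysfs_root="/sys/class/fc_remote_ports"):
    """Return {rport name: dev_loss_tmo in seconds} for each FC remote port.

    Hypothetical helper: the sysfs path is an assumption about the host,
    and an empty dict is returned if the directory does not exist.
    """
    timeouts = {}
    root = Path(sysfs_root)
    if not root.is_dir():
        return timeouts
    for rport in sorted(root.glob("rport-*")):
        try:
            timeouts[rport.name] = int((rport / "dev_loss_tmo").read_text().strip())
        except (OSError, ValueError):
            # Attribute missing or unreadable: skip this rport.
            continue
    return timeouts


if __name__ == "__main__":
    for name, seconds in read_dev_loss_tmo().items():
        print(f"{name}: dev_loss_tmo = {seconds}s")
```

On a host without FC remote ports this prints nothing. The function only reads values; changing the timeout is left out on purpose.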

Apr 07 11:45:10 ICTM1608S01H1 root[2894793]: JD: Resetting controller A
Apr 07 11:45:28 ICTM1608S01H1 kernel: lpfc 0000:af:00.1: 5:(0):6172 NVME rescanned DID x3d0a00 port_state x2
Apr 07 11:45:28 ICTM1608S01H1 kernel: lpfc 0000:18:00.1: 1:(0):6172 NVME rescanned DID x3d0a00 port_state x2
Apr 07 11:45:28 ICTM1608S01H1 kernel: nvme nvme5: NVME-FC{4}: controller connectivity lost. Awaiting Reconnect
Apr 07 11:45:28 ICTM1608S01H1 kernel: nvme nvme1: NVME-FC{0}: controller connectivity lost. Awaiting Reconnect
Apr 07 11:45:28 ICTM1608S01H1 systemd-udevd[2895178]: fc_udev_device: Process 'systemctl --no-block start nvmf-connect@--device=none\t--transp>
Apr 07 11:45:28 ICTM1608S01H1 systemd-udevd[2895178]: fc_udev_device: Process 'systemctl --no-block start nvmf-connect@--device=none\t--transp>
Apr 07 11:45:28 ICTM1608S01H1 kernel: nvme nvme5: NVME-FC{4}: io failed due to lldd error 6
Apr 07 11:45:28 ICTM1608S01H1 kernel: nvme nvme1: NVME-FC{0}: io failed due to lldd error 6
Apr 07 11:45:29 ICTM1608S01H1 kernel: lpfc 0000:af:00.0: 4:(0):6172 NVME rescanned DID x011400 port_state x2
Apr 07 11:45:29 ICTM1608S01H1 kernel: lpfc 0000:18:00.0: 0:(0):6172 NVME rescanned DID x011400 port_state x2
Apr 07 11:45:29 ICTM1608S01H1 kernel: nvme nvme4: NVME-FC{1}: controller connectivity lost. Awaiting Reconnect
Apr 07 11:45:29 ICTM1608S01H1 kernel: nvme nvme8: NVME-FC{5}: controller connectivity lost. Awaiting Reconnect
Apr 07 11:45:29 ICTM1608S01H1 systemd-udevd[2895178]: fc_udev_device: Process 'systemctl --no-block start nvmf-connect@--device=none\t--transp>
Apr 07 11:45:29 ICTM1608S01H1 systemd-udevd[2895178]: fc_udev_device: Process 'systemctl --no-block start nvmf-connect@--device=none\t--transp>
Apr 07 11:45:29 ICTM1608S01H1 kernel: nvme nvme4: NVME-FC{1}: io failed due to lldd error 6
Apr 07 11:45:29 ICTM1608S01H1 kernel: nvme nvme8: NVME-FC{5}: io failed due to lldd error 6
Apr 07 11:45:59 ICTM1608S01H1 kernel: rport-10:0-9: blocked FC remote port time out: removing rport
Apr 07 11:45:59 ICTM1608S01H1 kernel: rport-16:0-9: blocked FC remote port time out: removing rport
Apr 07 11:45:59 ICTM1608S01H1 kernel: rport-15:0-9: blocked FC remote port time out: removing rport
Apr 07 11:45:59 ICTM1608S01H1 kernel: rport-12:0-9: blocked FC remote port time out: removing rport
Apr 07 11:46:28 ICTM1608S01H1 kernel: nvme nvme5: NVME-FC{4}: dev_loss_tmo (60) expired while waiting for remoteport connectivity.
Apr 07 11:46:28 ICTM1608S01H1 kernel: nvme nvme5: Removing ctrl: NQN "nqn.1992-08.com.netapp:5700.600a098000d8580e000000005c0136a2"
Apr 07 11:46:28 ICTM1608S01H1 kernel: nvme nvme1: NVME-FC{0}: dev_loss_tmo (60) expired while waiting for remoteport connectivity.
Apr 07 11:46:28 ICTM1608S01H1 kernel: nvme nvme1: Removing ctrl: NQN "nqn.1992-08.com.netapp:5700.600a098000d8580e000000005c0136a2"
Apr 07 11:46:29 ICTM1608S01H1 kernel: nvme nvme4: NVME-FC{1}: dev_loss_tmo (60) expired while waiting for remoteport connectivity.
Apr 07 11:46:29 ICTM1608S01H1 kernel: nvme nvme4: Removing ctrl: NQN "nqn.1992-08.com.netapp:5700.600a098000d8580e000000005c0136a2"
Apr 07 11:46:29 ICTM1608S01H1 kernel: nvme nvme8: NVME-FC{5}: dev_loss_tmo (60) expired while waiting for remoteport connectivity.
Apr 07 11:46:29 ICTM1608S01H1 kernel: nvme nvme8: Removing ctrl: NQN "nqn.1992-08.com.netapp:5700.600a098000d8580e000000005c0136a2"
Apr 07 11:47:07 ICTM1608S01H1 systemd-udevd[2896874]: fc_udev_device: Process 'systemctl --no-block start nvmf-connect@--device=none\t--transp>
Apr 07 11:47:07 ICTM1608S01H1 systemd-udevd[2896874]: fc_udev_device: Process 'systemctl --no-block start nvmf-connect@--device=none\t--transp>
Apr 07 11:47:08 ICTM1608S01H1 systemd-udevd[2896872]: fc_udev_device: Process 'systemctl --no-block start nvmf-connect@--device=none\t--transp>
Apr 07 11:47:08 ICTM1608S01H1 systemd-udevd[2896874]: fc_udev_device: Process 'systemctl --no-block start nvmf-connect@--device=none\t--transp>
Apr 07 11:49:56 ICTM1608S01H1 root[2899783]: JD: Controller A online
Apr 07 11:50:04 ICTM1608S01H1 root[2899924]: nvme-subsys0 - NQN=nqn.1992-08.com.netapp:5700.600a098000d8580e000000005c0136a2
Apr 07 11:50:04 ICTM1608S01H1 root[2899924]: \
Apr 07 11:50:04 ICTM1608S01H1 root[2899924]: +- nvme2 fc traddr=nn-0x200200a098d8580e:pn-0x202300a098d8580e host_traddr=nn-0x20000090fadcc5ce>
Apr 07 11:50:04 ICTM1608S01H1 root[2899924]: +- nvme3 fc traddr=nn-0x200200a098d8580e:pn-0x201300a098d8580e host_traddr=nn-0x200000109b8f2b8d>
Apr 07 11:50:04 ICTM1608S01H1 root[2899924]: +- nvme6 fc traddr=nn-0x200200a098d8580e:pn-0x202300a098d8580e host_traddr=nn-0x200000109b8f2b8e>
Apr 07 11:50:04 ICTM1608S01H1 root[2899924]: +- nvme7 fc traddr=nn-0x200200a098d8580e:pn-0x201300a098d8580e host_traddr=nn-0x20000090fadcc5cd>
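The timing argument can be checked directly against the timestamps in the log. The following is a rough sketch that copies five representative lines from the excerpt above and computes the gaps between them. The helper name first_time and the placeholder year are assumptions made for this example (the log lines carry no year, and only the differences matter).

```python
from datetime import datetime

# Five lines copied from the log excerpt above.
LOG_EXCERPT = """\
Apr 07 11:45:10 ICTM1608S01H1 root[2894793]: JD: Resetting controller A
Apr 07 11:45:28 ICTM1608S01H1 kernel: nvme nvme5: NVME-FC{4}: controller connectivity lost. Awaiting Reconnect
Apr 07 11:45:59 ICTM1608S01H1 kernel: rport-10:0-9: blocked FC remote port time out: removing rport
Apr 07 11:46:28 ICTM1608S01H1 kernel: nvme nvme5: NVME-FC{4}: dev_loss_tmo (60) expired while waiting for remoteport connectivity.
Apr 07 11:49:56 ICTM1608S01H1 root[2899783]: JD: Controller A online
"""

PLACEHOLDER_YEAR = "2000"  # assumption: the log has no year; any fixed year works


def first_time(log, needle):
    """Return the timestamp of the first log line containing `needle`."""
    for line in log.splitlines():
        if needle in line:
            stamp = f"{PLACEHOLDER_YEAR} {line[:15]}"  # e.g. "2000 Apr 07 11:45:10"
            return datetime.strptime(stamp, "%Y %b %d %H:%M:%S")
    raise ValueError(f"no log line contains {needle!r}")


reset_start = first_time(LOG_EXCERPT, "Resetting controller A")
conn_lost = first_time(LOG_EXCERPT, "controller connectivity lost")
rport_removed = first_time(LOG_EXCERPT, "removing rport")
dev_loss_expired = first_time(LOG_EXCERPT, "dev_loss_tmo (60) expired")
ctrl_online = first_time(LOG_EXCERPT, "Controller A online")

rport_timeout_s = int((rport_removed - conn_lost).total_seconds())
dev_loss_s = int((dev_loss_expired - conn_lost).total_seconds())
outage_s = int((ctrl_online - conn_lost).total_seconds())
reset_s = int((ctrl_online - reset_start).total_seconds())

print(f"rport removed after connectivity loss: {rport_timeout_s}s")
print(f"dev_loss_tmo expired after connectivity loss: {dev_loss_s}s")
print(f"controller back online after connectivity loss: {outage_s}s")
print(f"total reset time: {reset_s}s")
```

By this reading, the rport is removed about 31 seconds after connectivity is lost and the nvme dev_loss_tmo fires at 60 seconds, while the controller only returns 268 seconds after the loss (286 seconds after the reset starts), so any timeout meant to ride out this reset would need to exceed that.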