@Gustavo,
Liang's comment on patch set 3 [1] explains your problem. He said:
The dev can disappear momentarily right after 'multipath -r "dev"'. So it doesn't happen for every single path. If it did, it would cause a lot more issues. The multipath dev removal path reloads the dev near the beginning of the operation (_rescan_multipath). Thus the "stat" here can fail if it is executed before the dev node being re-created.
1. 'multipath -r' in _rescan_multipath() [10] can make the multipath dev disappear momentarily due to the multipath-tools bug [9].
2. _get_multipath_device_name() [2] uses the 'multipath -ll' command to find the multipath device name [3], so we saw:
Jan 25 09:24:40 Lock "connect_volume" acquired by "os_brick.initiator.connector.disconnect_volume" :: waited 0.000s
...
Jan 25 09:24:40 multipath ['-ll', u'/dev/sdr']: stdout=360080e5000297ea40000050658885f45 dm-6 NETAPP,INF-01-00#012
3. The connector then calls _linuxscsi.remove_multipath_device() [4], so we saw:
Jan 25 09:24:40 remove multipath device /dev/sdr'
4. find_multipath_device() is then invoked [5], which in turn runs 'multipath -l' [6].
5. The "stat" here [7], which runs shortly after 'multipath -r', can fail if it executes before the dev node is re-created, so we saw:
Jan 25 09:24:40 Couldn't find multipath device /dev/mapper/360080e5000297ea40000050658885f45
The fix [1] was an attempt to solve this problem, but it was later abandoned because fix [8] already addresses it. That is also why I am trying to backport [8].
FYI, the root cause of your problem is a bug in multipath-tools [9], so you can also fix the problem by upgrading multipath-tools.
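To illustrate the race in step 5, here is a minimal sketch of retrying the "stat" until the dev node is back. This is not the actual change in [8] and not os-brick code: wait_for_device_node() and its parameters (attempts, interval, stat, sleep) are hypothetical names made up for illustration, and the retry counts are arbitrary.

```python
import os
import time


def wait_for_device_node(path, attempts=5, interval=0.5,
                         stat=os.stat, sleep=time.sleep):
    """Return True once `path` can be stat'ed, False if it never appears.

    Hypothetical helper: `stat` and `sleep` are injectable only so the
    behaviour can be exercised without a real multipath device.
    """
    for attempt in range(attempts):
        try:
            stat(path)
            return True
        except OSError:
            # The node is missing, possibly because 'multipath -r' is
            # still re-creating it. Wait before the next try, but do
            # not sleep after the final attempt.
            if attempt < attempts - 1:
                sleep(interval)
    return False
```

With something like this, a caller could check wait_for_device_node('/dev/mapper/360080e5000297ea40000050658885f45') before treating the multipath device as missing, instead of failing on the first "stat".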
[1] https://review.openstack.org/#/c/366065
[2] https://github.com/openstack/os-brick/blob/stable/mitaka/os_brick/initiator/connector.py#L925
[3] https://github.com/openstack/os-brick/blob/stable/mitaka/os_brick/initiator/connector.py#L1200
[4] https://github.com/openstack/os-brick/blob/stable/mitaka/os_brick/initiator/connector.py#L935
[5] https://github.com/openstack/os-brick/blob/stable/mitaka/os_brick/initiator/linuxscsi.py#L124
[6] https://github.com/openstack/os-brick/blob/stable/mitaka/os_brick/initiator/linuxscsi.py#L263
[7] https://github.com/openstack/os-brick/blob/stable/mitaka/os_brick/initiator/linuxscsi.py#L288
[8] https://review.openstack.org/#/c/374421/
[9] https://bugs.launchpad.net/ubuntu/+source/multipath-tools/+bug/1621340
[10] https://github.com/openstack/os-brick/blob/stable/mitaka/os_brick/initiator/connector.py#L918