When a Cinder driver shares one subsystem across multiple namespaces, os-brick is unable to connect to it, at least with newer nvme client versions (e.g. 1.16).
The error we see in the logs is:
Feb 15 16:59:42 localhost.localdomain nova-compute[178960]: Command: nvme connect -t tcp -n nqn.nvme-subsystem-1-e31b8c9c-b943-430e-afa4-55a110341dcb -a 192.168.121.58 -s 4420
Feb 15 16:59:42 localhost.localdomain nova-compute[178960]: Exit code: 114
Feb 15 16:59:42 localhost.localdomain nova-compute[178960]: Stdout: ''
Feb 15 16:59:42 localhost.localdomain nova-compute[178960]: Stderr: '' {{(pid=194453) _process_cmd /usr/local/lib/python3.6/site-packages/oslo_privsep/daemon.py:482}}
Feb 15 16:59:42 localhost.localdomain nova-compute[178960]: Traceback (most recent call last):
Feb 15 16:59:42 localhost.localdomain nova-compute[178960]: File "/usr/local/lib/python3.6/site-packages/oslo_privsep/daemon.py", line 477, in _process_cmd
Feb 15 16:59:42 localhost.localdomain nova-compute[178960]: ret = func(*f_args, **f_kwargs)
Feb 15 16:59:42 localhost.localdomain nova-compute[178960]: File "/usr/local/lib/python3.6/site-packages/oslo_privsep/priv_context.py", line 274, in _wrap
Feb 15 16:59:42 localhost.localdomain nova-compute[178960]: return func(*args, **kwargs)
Feb 15 16:59:42 localhost.localdomain nova-compute[178960]: File "/opt/stack/os-brick/os_brick/privileged/rootwrap.py", line 197, in execute_root
Feb 15 16:59:42 localhost.localdomain nova-compute[178960]: return custom_execute(*cmd, shell=False, run_as_root=False, **kwargs)
Feb 15 16:59:42 localhost.localdomain nova-compute[178960]: File "/opt/stack/os-brick/os_brick/privileged/rootwrap.py", line 146, in custom_execute
Feb 15 16:59:42 localhost.localdomain nova-compute[178960]: on_completion=on_completion, *cmd, **kwargs)
Feb 15 16:59:42 localhost.localdomain nova-compute[178960]: File "/usr/local/lib/python3.6/site-packages/oslo_concurrency/processutils.py", line 441, in execute
Feb 15 16:59:42 localhost.localdomain nova-compute[178960]: cmd=sanitized_cmd)
Feb 15 16:59:42 localhost.localdomain nova-compute[178960]: oslo_concurrency.processutils.ProcessExecutionError: Unexpected error while running command.
Feb 15 16:59:42 localhost.localdomain nova-compute[178960]: Command: nvme connect -t tcp -n nqn.nvme-subsystem-1-e31b8c9c-b943-430e-afa4-55a110341dcb -a 192.168.121.58 -s 4420
Feb 15 16:59:42 localhost.localdomain nova-compute[178960]: Exit code: 114
Feb 15 16:59:42 localhost.localdomain nova-compute[178960]: Stdout: ''
Feb 15 16:59:42 localhost.localdomain nova-compute[178960]: Stderr: ''
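Exit code 114 matches EALREADY ("Operation already in progress") on Linux, which suggests the nvme client is reporting that the subsystem is already connected rather than a real failure. The following is a minimal sketch of treating that exit code as "already connected"; it is not the proposed fix, and the names connect_subsystem, build_connect_cmd and the injectable run parameter are hypothetical, not os-brick APIs.

```python
import subprocess

# 114 is errno.EALREADY on Linux. Assumption: the nvme client returns this
# value when a controller for the subsystem already exists on the host.
NVME_ALREADY_CONNECTED = 114


def build_connect_cmd(transport, nqn, address, port):
    """Build the same 'nvme connect' command line that appears in the log."""
    return ['nvme', 'connect', '-t', transport, '-n', nqn,
            '-a', address, '-s', str(port)]


def connect_subsystem(transport, nqn, address, port, run=subprocess.run):
    """Connect to an NVMe-oF subsystem.

    Returns True if a new connection was made and False if the subsystem
    was already connected. Any other non-zero exit code raises RuntimeError.
    'run' is injectable so the logic can be exercised without the nvme binary.
    """
    cmd = build_connect_cmd(transport, nqn, address, port)
    result = run(cmd, capture_output=True, text=True)
    if result.returncode == 0:
        return True
    if result.returncode == NVME_ALREADY_CONNECTED:
        # Shared subsystem: an earlier volume attached it, so reuse it.
        return False
    raise RuntimeError('nvme connect failed with exit code %d: %s'
                       % (result.returncode, result.stderr))
```

With this approach a second volume exported through the same subsystem would reuse the existing connection, and the caller would still need to wait for the new namespace to appear on the host.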
The same failure now also occurs when the same LVM volume is reconnected to the same host, because the nvmeof connector does not disconnect (https://bugs.launchpad.net/os-brick/+bug/1961102).
Fix proposed to branch: master
Review: https://review.opendev.org/c/openstack/os-brick/+/836060