A VNX LUN will still be recognized as LUNZ after provisioning

Bug #1671397 reported by KC Bi
This bug affects 1 person
Affects                   Status      Importance  Assigned to  Milestone
linux (Ubuntu)            Incomplete  Undecided   Unassigned
multipath-tools (Ubuntu)  New         Undecided   Unassigned

Bug Description

1. Attach an Ubuntu 16.04 server to a VNX array through either FC or iSCSI;
2. LUNZ will be automatically created as below:
# sudo ./inq.LinuxAMD64
Inquiry utility, Version V8.1.1.0 (Edit Level: 2102) built with SYMAPI Version V8.1.1.0 (Edit Level 2102)
Copyright (c) [1997-2015] EMC Corporation. All Rights Reserved.
For help type inq -h.

.......

-------------------------------------------------------------------
DEVICE :VEND :PROD :REV :SER NUM :CAP(kb)
-------------------------------------------------------------------
/dev/sda :Lenovo :720i :4.23 : : 292421632
/dev/sdb :Lenovo :720i :4.23 : : 292421632
/dev/sdc :Lenovo :720i :4.23 : : 292421632
/dev/sdd :Lenovo :720i :4.23 : : 292421632
/dev/sde :Single :Flash Reader :1.00 : : 31166976
/dev/sdf :DGC :LUNZ :0533 :00000000 : FAILED
/dev/sdg :DGC :LUNZ :0533 :00000000 : FAILED

3. Provision 2 x LUNs from VNX to the Ubuntu 16.04 server, then rescan SCSI bus to reflect the changes. The 2 x LUNs will be recognized as below:
# sudo ./inq.LinuxAMD64
Inquiry utility, Version V8.1.1.0 (Edit Level: 2102) built with SYMAPI Version V8.1.1.0 (Edit Level 2102)
Copyright (c) [1997-2015] EMC Corporation. All Rights Reserved.
For help type inq -h.

..........

--------------------------------------------------------------------
DEVICE :VEND :PROD :REV :SER NUM :CAP(kb)
--------------------------------------------------------------------
/dev/sda :Lenovo :720i :4.23 : : 292421632
/dev/sdb :Lenovo :720i :4.23 : : 292421632
/dev/sdc :Lenovo :720i :4.23 : : 292421632
/dev/sdd :Lenovo :720i :4.23 : : 292421632
/dev/sde :Single :Flash Reader :1.00 : : 31166976
/dev/sdf :DGC :VRAID :0533 :BB589458 : 18874368
/dev/sdg :DGC :VRAID :0533 :BB589458 : 18874368
/dev/sdh :DGC :VRAID :0533 :CA589477 : 29360128
/dev/sdi :DGC :VRAID :0533 :CA589477 : 29360128
/dev/dm-0 :DGC :VRAID :0533 :CA589477 : 29360128

4. The output shows that only one multipath device (dm-0) was created. Since we provisioned 2 x LUNs, 2 x multipath devices should have been created. The LUN not managed by multipath-tools is the one with SN BB589458, which re-uses the same native device names (sdf and sdg) as the LUNZ devices in step 2.

5. The output of "multipath -v 3 -ll" points to the cause: the native devices (sdf and sdg) are still recognized as LUNZ devices and are therefore blacklisted:

# sudo multipath -v 3 -ll
......
Mar 09 04:48:42 | sdf: udev property SCSI_IDENT_LUN_VENDOR whitelisted
Mar 09 04:48:42 | sdf: not found in pathvec
Mar 09 04:48:42 | sdf: mask = 0x25
Mar 09 04:48:42 | sdf: dev_t = 8:80
Mar 09 04:48:42 | sdf: size = 0
Mar 09 04:48:42 | sdf: vendor = DGC
Mar 09 04:48:42 | sdf: product = LUNZ
Mar 09 04:48:42 | sdf: rev = 0533
Mar 09 04:48:42 | sdf: h:b:t:l = 33:0:0:0
Mar 09 04:48:42 | sdf: tgt_node_name = iqn.1992-04.com.emc:cx.apm00153919964.a5
Mar 09 04:48:42 | (null): (DGC:LUNZ) vendor/product blacklisted
......
Mar 09 04:48:42 | sdg: udev property SCSI_IDENT_LUN_VENDOR whitelisted
Mar 09 04:48:42 | sdg: not found in pathvec
Mar 09 04:48:42 | sdg: mask = 0x25
Mar 09 04:48:42 | sdg: dev_t = 8:96
Mar 09 04:48:42 | sdg: size = 0
Mar 09 04:48:42 | sdg: vendor = DGC
Mar 09 04:48:42 | sdg: product = LUNZ
Mar 09 04:48:42 | sdg: rev = 0533
Mar 09 04:48:42 | sdg: h:b:t:l = 35:0:0:0
Mar 09 04:48:42 | sdg: tgt_node_name = iqn.1992-04.com.emc:cx.apm00153919964.b5
Mar 09 04:48:42 | (null): (DGC:LUNZ) vendor/product blacklisted
......

6. Expected behavior: after LUNs are provisioned, the LUNZ devices should be removed, and multipath-tools should reflect the changes.
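
The check from step 5 can be reduced to a small filter. The following is only a sketch, assuming the "multipath -v 3 -ll" log lines keep the format shown above; the function name lunz_paths is made up for illustration and is not part of multipath-tools.

```shell
# List the SCSI paths that multipath still reports with product "LUNZ".
# Reads "multipath -v 3 -ll" output on stdin and prints one device name
# per line. Assumes lines of the form:
#   Mar 09 04:48:42 | sdf: product = LUNZ
lunz_paths() {
    sed -n 's/^.*| \([a-z0-9]*\): product = LUNZ[[:space:]]*$/\1/p' | sort -u
}

# Typical use (needs root and multipath-tools installed):
#   sudo multipath -v 3 -ll | lunz_paths
```

An empty result after provisioning and rescanning would mean multipath no longer sees any path as LUNZ; in the failing case described here it would be expected to print sdf and sdg.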

Christian Ehrhardt  (paelzer) wrote :

Hi,
thank you for your report.
Lacking a VNX, I can't try to reproduce this right away, but I quickly checked the IBM DS8K I have.
One thing I wondered about: I thought the product/vendor data displayed by multipath is only what it got back from a SCSI inquiry.
AFAIK multipath-tools makes no decision there about "what" a device should be.
It certainly has rules that filter and handle different types differently, but as I said, I thought it is just the device's reply to the SCSI inquiry that populates the data.

Your report has inq.LinuxAMD64 listing them all as VRAID.
But AFAIK the inq.LinuxAMD64 is an EMC specific tool not packaged up in Ubuntu.

I'd like to understand what the linux tooling would get on those.
Could you report the output of:
for dev in sdf sdg sdh sdi; do sudo sg_inq /dev/${dev}; done

If that is showing VRAID we have to think about stale type data in multipath.
But if that shows LUNZ as well it might be different inquiry handling.

Also, I just wanted to check: did the two logins both work the same way? Since LUNZ devices are removed when the real login works, there might have been a difference there.

KC Bi (bikecheng) wrote :

I really think it is a multipath-tools issue. Here are the comparisons:

ubuntu@osp225219:~$ for dev in sdf sdg sdh sdi; do sudo sg_inq /dev/${dev}; done | grep 'identification'
 Vendor identification: DGC
 Product identification: VRAID
 Vendor identification: DGC
 Product identification: VRAID
 Vendor identification: DGC
 Product identification: VRAID
 Vendor identification: DGC
 Product identification: VRAID
ubuntu@osp225219:~$ sudo ./inq.LinuxAMD64 | grep 'sd[fhgi]'
/dev/sdf :DGC :VRAID :0533 :BB589458 : 18874368
/dev/sdg :DGC :VRAID :0533 :BB589458 : 18874368
/dev/sdh :DGC :VRAID :0533 :CA589477 : 29360128
/dev/sdi :DGC :VRAID :0533 :CA589477 : 29360128
ubuntu@osp225219:~$ sudo multipath -v 3 -ll | grep '\(sdf\|sdg\|sdh\|sdi\)' | grep '\(vendor\|product\)'
Mar 10 08:24:19 | sdf: vendor = DGC
Mar 10 08:24:19 | sdf: product = LUNZ
Mar 10 08:24:19 | sdh: vendor = DGC
Mar 10 08:24:19 | sdh: product = VRAID
Mar 10 08:24:19 | sdg: vendor = DGC
Mar 10 08:24:19 | sdg: product = LUNZ
Mar 10 08:24:19 | sdi: vendor = DGC
Mar 10 08:24:19 | sdi: product = VRAID

BTW, what do you mean by saying "two logins"? I cannot quite understand this.
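
A further data point for the comparison above could be the identity the kernel has cached in sysfs for each path. sg_inq sends a fresh INQUIRY to the device, while the sysfs vendor and model attributes hold what the kernel recorded for that device node. If multipath takes its values from sysfs rather than from a fresh INQUIRY (an assumption, not confirmed in this thread), a leftover "LUNZ" there would explain the mismatch. A minimal sketch; the function name sysfs_ident and the SYSFS_ROOT override are hypothetical helpers for illustration:

```shell
# Print the vendor and model the kernel has cached for a SCSI disk.
# SYSFS_ROOT is a hypothetical override for testing; it defaults to /sys.
sysfs_ident() {
    dev_dir="${SYSFS_ROOT:-/sys}/block/$1/device"
    [ -r "$dev_dir/vendor" ] && [ -r "$dev_dir/model" ] || return 1
    # sysfs pads these attributes with trailing spaces; strip them.
    vendor=$(sed 's/[[:space:]]*$//' "$dev_dir/vendor")
    model=$(sed 's/[[:space:]]*$//' "$dev_dir/model")
    printf '%s: vendor=%s model=%s\n' "$1" "$vendor" "$model"
}

# Typical use:
#   for dev in sdf sdg sdh sdi; do sysfs_ident "$dev"; done
```

If this prints model=LUNZ for sdf and sdg while sg_inq reports VRAID, the stale data would sit in the kernel's view of the device rather than in multipathd itself.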

Christian Ehrhardt  (paelzer) wrote :

Just to clarify: by "two logins" I meant what you call "Provision 2x LUNs from VNX".
AFAIK the LUNZ is a representation of "what could be", and once a LUN is provisioned it is replaced by "the real thing". Since in my environment that corresponds to an FCP SAN login, I may have named it badly for you to realize what I meant, sorry.

Thank you for providing the sg_inq data.
So it really seems multipathd has some stale data from the former /dev/sd* devices.

You said "then rescan SCSI bus to reflect the changes". Since there seem to be multiple ways to do that, may I ask which one you used?

To narrow down the lifetime of that stale data: when you are in the error state, does one of the following get you back to a good state, with all devices detected as VRAID and multipath set up?
$ service multipathd reload
or
$ service multipathd restart

KC Bi (bikecheng) wrote :

1. rescan-scsi-bus.sh was used to rescan the SCSI buses;
2. Restarting the services doesn't solve the problem, but a reboot does.

Christian Ehrhardt  (paelzer) wrote : Re: [Bug 1671397] Re: A VNX LUN will still be recognized as LUNZ after provisioning

Thanks KC,
that makes it interesting.

A restart stops the service and starts it again, so there is nothing
the running program could carry over other than what is in the config files.
A reboot does just the same; the difference is that the kernel
might have lost some context on the reboot.

While I still can't reproduce in my SAN env which doesn't have a concept
like the LUNZ->VRAID transition I wonder if we could somehow force your
environment to pick up the new data other than rebooting.

rescan-scsi-bus.sh has various options which might help here, please try
them one by one (ordered by increased potential impact to your system)
--forcerescan (rescan existing devs)
--issue-lip (login reset, not sure if that works on your devs or might
reset your provisioning)
--forceremove (drop and re-add each device)

rescan-scsi-bus.sh should already issue the per-device rescan shown below
(and one of the devices did change for you), but to make sure it is not
taking a path through the script that ends without issuing it, you could
also run as root:
$ echo 1 > /sys/block/device_name/device/rescan

Please let me know if any of the four options above gets your devices
properly detected.
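
The per-device sysfs rescan above could be wrapped so that it covers several paths in one go. This is only a sketch: the function name rescan_devs and the SYSFS_ROOT override are hypothetical and exist just for illustration and testing; on a real system SYSFS_ROOT stays unset and the function has to run as root.

```shell
# Ask the kernel to rescan each named SCSI disk by writing 1 to its
# sysfs "rescan" attribute. SYSFS_ROOT is a hypothetical override for
# testing; it defaults to /sys.
rescan_devs() {
    rc=0
    for dev in "$@"; do
        attr="${SYSFS_ROOT:-/sys}/block/$dev/device/rescan"
        if [ -w "$attr" ]; then
            echo 1 > "$attr"
        else
            # Report devices that do not exist or cannot be written to.
            echo "skip $dev: $attr not writable" >&2
            rc=1
        fi
    done
    return $rc
}

# Typical use for the two affected paths:
#   rescan_devs sdf sdg
```

Whether this rescan is enough to refresh the vendor/product data that multipath reports is exactly what remains to be tested here.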

Christian Ehrhardt  (paelzer) wrote :

While I think restarting the service goes further than this, another thing
at least worth trying is reloading the device maps:
$ multipath -r

KC Bi (bikecheng) wrote :

My setup was being used for some other tasks for quite some time. Now, it can be used again. But unfortunately, multipath -r still won't fix the problem.

Christian Ehrhardt  (paelzer) wrote :

Nobody else with the same HW showed up, so I'm afraid you have to debug that on your own :-/
If you have a support contract for your storage server, the vendor might be able to help. Could you try to loop them in?

Adding a kernel task in case this is in any way known to the kernel team.

@KC
Did you have a chance to test the extra things I asked in comment #5?

Brad Figg (brad-figg) wrote : Missing required logs.

This bug is missing log files that will aid in diagnosing the problem. From a terminal window please run:

apport-collect 1671397

and then change the status of the bug to 'Confirmed'.

If, due to the nature of the issue you have encountered, you are unable to run this command, please add a comment stating that fact and change the bug status to 'Confirmed'.

This change has been made by an automated script, maintained by the Ubuntu Kernel Team.

Changed in linux (Ubuntu):
status: New → Incomplete