Removing the failed attempt to rescan specific targets

Bug #1664653 reported by Guy Rozendorn
This bug affects 6 people
Affects:     os-brick
Status:      Fix Released
Importance:  Undecided
Assigned to: Guy Rozendorn

Bug Description

Since git commit 28a4d55, LinuxFibreChannel.rescan_hosts attempts
to find specific targets to rescan by looking for
the HBA wwnn in the target node_name.

This commit has a number of problems:
* the grep for target* fails many times with the following error:
  [Errno 2] No such file or directory
  This can be seen in many third-party Cinder CIs.
  Since the log files are rotated, there is no point in giving a link.
* the code greps for the initiator wwnn in the target node_name.
  That simply cannot match, so even if you didn't get errno 2,
  the function _get_hba_channel_scsi_target would return nothing.
  Here's an example.
  From c-vol.log:
  2017-02-08 21:25:38.706 6714 ERROR os_brick.initiator.linuxfc [-] Could not get HBA channel and SCSI target ID, path: /sys/class/fc_transport/target10:, reason: [Errno 2] No such file or directory
  And looking at sysfs:
    $ ls /sys/class/fc_transport
    target10:0:1
    $ ls /sys/class/fc_transport/target10:0:1
    device node_name port_id port_name power subsystem uevent
    $ cat /sys/class/fc_transport/target10:0:1/node_name
    0x5742b0f000754200
    $ cat /sys/class/fc_transport/*/node_name
    0x5742b0f000754200
    $ cat /sys/class/fc_host/*/node_name
    0x20000000c9952b4c
    $ cat /sys/class/fc_host/*/port_name
    0x10000000c9952b4c
    $ cat /sys/class/fc_transport/*/port_name
    0x5742b0f000754211
    $ cat /sys/class/fc_host/*/port_gname
* writing to /sys/class/scsi_host/%s/scan rescans that specific HBA,
  and since the method rescan_hosts doesn't get any info about the
  target except for the LUN, it needs to rescan all the targets
  that this HBA is connected to
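
To make the second point concrete, here is a minimal Python sketch built from the sysfs values quoted above. The dictionaries and the helper name find_targets_by_node_name are made up for illustration and are not os-brick code; the helper only imitates a substring search of a WWNN against each target's node_name, which is roughly what the grep does.

```python
# Values copied from the sysfs listing above; the dict layout is illustrative.
FC_HOST = {
    "node_name": "0x20000000c9952b4c",  # initiator (HBA) WWNN
    "port_name": "0x10000000c9952b4c",  # initiator (HBA) WWPN
}
FC_TRANSPORT = {
    "target10:0:1": {
        "node_name": "0x5742b0f000754200",  # target WWNN
        "port_name": "0x5742b0f000754211",  # target WWPN
    },
}


def find_targets_by_node_name(wwnn, targets):
    """Return target names whose node_name contains wwnn (hypothetical helper)."""
    needle = wwnn.lower()
    if needle.startswith("0x"):
        needle = needle[2:]
    return [name for name, attrs in targets.items()
            if needle in attrs["node_name"].lower()]


# Searching for the initiator WWNN among target node_names matches nothing.
print(find_targets_by_node_name(FC_HOST["node_name"], FC_TRANSPORT))  # []
# Searching for the target's own WWNN does match.
print(find_targets_by_node_name("0x5742b0f000754200", FC_TRANSPORT))  # ['target10:0:1']
```

The initiator and target identifiers share no digits in this listing, so a search keyed on the initiator WWNN comes back empty even when the path exists.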

So I removed the method _get_hba_channel_scsi_target from the code,
because it won't work anyway and it is not needed there,
and removed the tests that mock it.

Revision history for this message
Guy Rozendorn (guy-8) wrote :

Some references:
3par
http://54.201.44.218/71/433671/1/check/3par-fc-driver-master-client-pip-c8k01-dsvm/241d930/logs/screen-c-vol.txt.gz?level=ERROR
2017-02-14 16:04:54.527 10189 ERROR os_brick.initiator.linuxfc [-] Could not get HBA channel and SCSI target ID, path: /sys/class/fc_transport/target2:, reason: [Errno 2] No such file or directory
Command: grep 2000d4c9ef7671cd /sys/class/fc_transport/target2:*/node_name
Exit code: -
Stdout: None
Stderr: None
2017-02-14 16:06:12.323 10189 ERROR os_brick.initiator.linuxfc [-] Could not get HBA channel and SCSI target ID, path: /sys/class/fc_transport/target2:, reason: [Errno 2] No such file or directory
Command: grep 2000d4c9ef7671cd /sys/class/fc_transport/target2:*/node_name
Exit code: -
Stdout: None
Stderr: None
2017-02-14 16:07:01.347 10189 ERROR os_brick.initiator.linuxfc [-] Could not get HBA channel and SCSI target ID, path: /sys/class/fc_transport/target2:, reason: [Errno 2] No such file or directory
Command: grep 2000d4c9ef7671cd /sys/class/fc_transport/target2:*/node_name
Exit code: -
Stdout: None
Stderr: None

Revision history for this message
Guy Rozendorn (guy-8) wrote :

dell sc:
2017-02-14 10:55:54.334 2702 ERROR os_brick.initiator.linuxfc [-] Could not get HBA channel and SCSI target ID, path: /sys/class/fc_transport/target2:, reason: [Errno 2] No such file or directory
Command: grep 20000024ff59841b /sys/class/fc_transport/target2:*/node_name
Exit code: -
Stdout: None
Stderr: None
http://oslogs.compellent.com/dell-sc-fc-433671-1/logs/screen-c-vol.log.txt.gz

Revision history for this message
Guy Rozendorn (guy-8) wrote :

storwize
http://dal05.objectstorage.softlayer.net/v1/AUTH_0bd27569-6310-44fc-8a8d-36a112cac4ec/IBM-STORAGE-CI/storwize-tempest-dsvm-full-fc/44/screen-c-vol.txt.gz
2017-02-14 16:37:02.227 5406 ERROR os_brick.initiator.linuxfc [-] Could not get HBA channel and SCSI target ID, path: /sys/class/fc_transport/target2:, reason: [Errno 2] No such file or directory

Revision history for this message
Guy Rozendorn (guy-8) wrote :

fujitsu
http://openstackci.jp.fujitsu.com/Eternusci/89/433289/2/check/fujitsu-eternus-dx-fc/7829f14/logs/screen-c-vol.txt.gz?level=ERROR
2017-02-14 16:08:47.687 9113 ERROR os_brick.initiator.linuxfc [-] Could not get HBA channel and SCSI target ID, path: /sys/class/fc_transport/target2:, reason: [Errno 2] No such file or directory
Command: grep 20000090fa73317e /sys/class/fc_transport/target2:*/node_name
Exit code: -
Stdout: None
Stderr: None

Revision history for this message
Guy Rozendorn (guy-8) wrote :

kaminario
http://54.209.116.144/66/433166/2/check/kaminario-dsvm-tempest-full-FC/ac331a1/logs/screen-c-vol.txt.gz?level=ERROR
2017-02-14 06:21:21.026 10041 ERROR os_brick.initiator.linuxfc [-] Could not get HBA channel and SCSI target ID, path: /sys/class/fc_transport/target2:, reason: [Errno 2] No such file or directory
Command: grep 20000024ff34c92d /sys/class/fc_transport/target2:*/node_name
Exit code: -
Stdout: None
Stderr: None

affects: cinder → os-brick
Changed in os-brick:
assignee: nobody → Guy Rozendorn (guy-8)
Revision history for this message
Guy Rozendorn (guy-8) wrote :

There are arrays that implement the same wwnn for all ports, but the code looks for the *initiator wwnn*, while target*/node_name contains the *target wwnn*.
The grep will not find anything on any FC array, including the ones that use the same wwnn for all target ports.
Specifically, our (INFINIDAT) array uses the same wwnn for all the ports, and the code fails to find the target.

Revision history for this message
Chhavi Agarwal (chhagarw) wrote :

I faced similar issues while verifying on the SVC. Instead of removing get_hba_scsi_target, it would be better to make the changes below:
a. The code should look for the target wwn instead of the initiator wwn, because fc_transport only contains target info.
b. Instead of grep, it should use grep -l, because in order to find the CTL we need the matching file path, which is then split in the for loop over the returned output. For point b, I have one bug opened:
https://bugs.launchpad.net/os-brick/+bug/1684996
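
As a rough illustration of point b, the sketch below shows why the matching file path matters: the host, channel, and target numbers are encoded in the targetH:C:T directory name. The function names parse_ctl and scan_string and the LUN value 3 are hypothetical, chosen for this example, and are not taken from os-brick. The "channel target lun" text, with "-" as a wildcard, is the format the kernel accepts on a scsi_host scan file.

```python
import re

# Matches the "target<host>:<channel>:<target>" directory in an fc_transport path.
_TARGET_RE = re.compile(r"/target(\d+):(\d+):(\d+)(?:/|$)")


def parse_ctl(path):
    """Return (host, channel, target) as strings, or None without a full ID."""
    match = _TARGET_RE.search(path)
    return match.groups() if match else None


def scan_string(channel, target, lun):
    """Build the 'channel target lun' text written to a scsi_host scan file."""
    return "%s %s %s" % (channel, target, lun)


host, channel, target = parse_ctl("/sys/class/fc_transport/target10:0:1/port_name")
print(host, channel, target)            # 10 0 1
print(scan_string(channel, target, 3))  # 0 1 3
print(scan_string("-", "-", 3))         # - - 3 (wildcard: every channel and target)
# An unexpanded path such as the one in the error logs carries no channel/target.
print(parse_ctl("/sys/class/fc_transport/target10:"))  # None
```

With only file contents (plain grep) there is nothing to split, which is why listing the matching file names is needed before a narrowed scan string can be built.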

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Change abandoned on os-brick (master)

Change abandoned by Sean McGinnis (<email address hidden>) on branch: master
Review: https://review.openstack.org/433666
Reason: This review is > 4 weeks without comment and currently blocked by a core reviewer with a -2. We are abandoning this for now. Feel free to reactivate the review by pressing the restore button and contacting the reviewer with the -2 on this review to ensure you address their concerns.

Revision history for this message
Alexander Bozhenko (alexbozhenko) wrote :

Guy, did you solve the issue for your environment somehow?

Revision history for this message
Guy Rozendorn (guy-8) wrote : Re: [Bug 1664653] Re: Removing the failed attempt to rescan specific targets

I did not.
There’s no workaround for this: this piece of code doesn’t work and it’s
not necessary.

Revision history for this message
Gorka Eguileor (gorka) wrote :

Patch submitted to fix this: https://review.openstack.org/520052

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to os-brick (master)

Reviewed: https://review.openstack.org/520052
Committed: https://git.openstack.org/cgit/openstack/os-brick/commit/?id=4ee404466d3f2ced8bfdfb18927ef19a27967952
Submitter: Zuul
Branch: master

commit 4ee404466d3f2ced8bfdfb18927ef19a27967952
Author: Gorka Eguileor <email address hidden>
Date: Wed Nov 8 21:03:08 2017 +0100

    Fixing FC scanning

    Current FC tries to limit the scanning range by detecting the target and
    channel, unfortunately this code has a good number of implementation
    issues:

    - Matching uses local WWNN instead of target's WWPN.
    - Not using a shell to run the command, so the * glob won't expand.
    - Not using -l on grep command to list file names instead of contents.
    - Not making the search case insensitive.

    This patch fixes all these issues by using the target's WWPNs instead
    -taking into account FC Zone/Access control information if present- and
    supporting both possible connection information formats for the WWPNs
    (single value or list of values).

    Rescan tests have been modified to adhere to unit tests best practices,
    where each test case only tests the specific code in the method under
    test and mocks everything else.

    Closes-Bug: #1664653
    Closes-Bug: #1684996
    Closes-Bug: #1687607
    Change-Id: Ib539f6a3652bab4399c30cd90f326829e839ec02
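
The issues listed in this commit message can be illustrated with a small Python sketch. It is not the code from the patch: it swaps the shell grep for Python's glob module and file reads, and it runs against a throwaway directory that imitates /sys/class/fc_transport. The function name find_targets, the second WWPN ending in 10, and the assumption that connection info carries WWPNs without the 0x prefix are all illustrative.

```python
import glob
import os
import tempfile


def find_targets(fc_transport_dir, host_num, wwpns):
    """Return target names on host_num whose port_name is one of wwpns."""
    # Accept either a single WWPN string or a list of them.
    if isinstance(wwpns, str):
        wwpns = [wwpns]
    # Assumption: WWPNs arrive without "0x"; sysfs stores them with it.
    wanted = {"0x" + w.lower() for w in wwpns}
    pattern = os.path.join(fc_transport_dir, "target%s:*" % host_num, "port_name")
    matches = []
    # glob.glob expands the * itself, so no shell is involved.
    for path in sorted(glob.glob(pattern)):
        with open(path) as f:
            # Lowercasing both sides makes the comparison case insensitive.
            if f.read().strip().lower() in wanted:
                matches.append(os.path.basename(os.path.dirname(path)))
    return matches


# Build a throwaway directory that imitates /sys/class/fc_transport.
with tempfile.TemporaryDirectory() as root:
    for name, wwpn in [("target10:0:0", "0x5742b0f000754210"),
                       ("target10:0:1", "0x5742b0f000754211")]:
        os.makedirs(os.path.join(root, name))
        with open(os.path.join(root, name, "port_name"), "w") as f:
            f.write(wwpn + "\n")
    single = find_targets(root, 10, "5742B0F000754211")
    listed = find_targets(root, 10, ["5742b0f000754210", "5742b0f000754211"])

print(single)  # ['target10:0:1']
print(listed)  # ['target10:0:0', 'target10:0:1']
```

The sketch returns directory names rather than file contents, which is the same information grep -l provides and what a caller needs to derive the channel and target numbers.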

Changed in os-brick:
status: New → Fix Released
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/os-brick 2.3.0

This issue was fixed in the openstack/os-brick 2.3.0 release.

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to os-brick (stable/pike)

Fix proposed to branch: stable/pike
Review: https://review.openstack.org/622348

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to os-brick (stable/pike)

Reviewed: https://review.openstack.org/622348
Committed: https://git.openstack.org/cgit/openstack/os-brick/commit/?id=f821a87ef0fd59cb1b48a85eed16b1f3f16e7467
Submitter: Zuul
Branch: stable/pike

commit f821a87ef0fd59cb1b48a85eed16b1f3f16e7467
Author: Gorka Eguileor <email address hidden>
Date: Wed Nov 8 21:03:08 2017 +0100

    Fixing FC scanning

    Current FC tries to limit the scanning range by detecting the target and
    channel, unfortunately this code has a good number of implementation
    issues:

    - Matching uses local WWNN instead of target's WWPN.
    - Not using a shell to run the command, so the * glob won't expand.
    - Not using -l on grep command to list file names instead of contents.
    - Not making the search case insensitive.

    This patch fixes all these issues by using the target's WWPNs instead
    -taking into account FC Zone/Access control information if present- and
    supporting both possible connection information formats for the WWPNs
    (single value or list of values).

    Rescan tests have been modified to adhere to unit tests best practices,
    where each test case only tests the specific code in the method under
    test and mocks everything else.

    Closes-Bug: #1664653
    Closes-Bug: #1684996
    Closes-Bug: #1687607
    Change-Id: Ib539f6a3652bab4399c30cd90f326829e839ec02
    (cherry picked from commit 4ee404466d3f2ced8bfdfb18927ef19a27967952)

tags: added: in-stable-pike
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/os-brick 1.15.7

This issue was fixed in the openstack/os-brick 1.15.7 release.
