Failures to parse ssacli output on HP Gen9 with a few RAID controllers

Bug #1788434 reported by Anton Antonov
12
This bug affects 2 people
Affects Status Importance Assigned to Milestone
proliantutils
Fix Released
Undecided
Anton Antonov

Bug Description

When I tried to use Ironic to deploy Centos on HP Gen9 the deployment failed with an error:

```
| last_error | Agent returned error for clean step {u'interface': u'raid', u'priority': 0, u'step': u'delete_configuration', u'abortable': False} on node 17bc299e-84ef-4be0-bad1-c24ae3af33b5 : {u'message': u"Clean step failed: Error performing clean_step delete_configuration: 'Size'", u'code': 500, u'type': u'CleaningError', u'details': u"Error performing clean_step delete_configuration: 'Size'"}.
```

After checking IPA logs (attached) I discovered that the issue is caused by incorrect parsing of ssacli output. While fixing this issue I found a few other less significant problems. I'm going to create a pull request and add its details to this bug report.

This is a list of issues I fixed in the pull request. For each of the issue below I use a small part of the real output. My pull-request includes unittests which fails before the fix and pass with the fix.

1. This is the main problem. _get_dict function uses recursion to parse ssacli output, but it doesn't return correctly from multi-level depth when a few RAID controllers listed. For example, this output:

```
Smart HBA H240 in Slot 1 (RAID Mode)
   Slot: 1

   Unassigned
      physicaldrive 1I:1:1
         Port: 1I
         Box: 1
         Bay: 1

Smart HBA H240 in Slot 2 (RAID Mode)
   Slot: 2

   Array: H
      Interface Type: SAS
      physicaldrive 2I:2:8
         Port: 2I
         Box: 2
         Bay: 8

Smart HBA H240 in Slot 3 (RAID Mode)
   Slot: 3
```

will be parsed as:

```
{
  "Smart HBA H240 in Slot 1 (RAID Mode)": {
    "Slot": "1",
    "Unassigned": {
      "Smart HBA H240 in Slot 2 (RAID Mode)": {
        "Slot": "2",
        "Array: H": {
          "physicaldrive 2I:2:8": {
            "Box": "2",
            "Port": "2I",
            "Bay": "8"
          },
          "Smart HBA H240 in Slot 3 (RAID Mode)": {
            "Slot": "3"
          },
          "Interface Type": "SAS"
        }
      },
      "physicaldrive 1I:1:1": {
        "Box": "1",
        "Port": "1I",
        "Bay": "1"
      }
    }
  }
}
```

2. Parameters with ':' in name are not properly parsed:

```
Smart HBA H240 in Slot 1 (RAID Mode)
   Bus Interface: PCI
   Slot: 1
   PCI Address (Domain:Bus:Device.Function): 0000:05:00.0
   Controller Mode: RAID Mode
```

result:

```
{
  "Smart HBA H240 in Slot 1 (RAID Mode)": {
    "Slot": "1",
    "Controller Mode": "RAID Mode",
    "Bus Interface": "PCI"
  }
}
```

3. _get_dict function can return premature if it's not in recursion:

```
Smart HBA H240 in Slot 2 (RAID Mode)
   Slot: 2

Smart HBA H240 in Slot 3 (RAID Mode)
   Slot: 3
```

result:

```{
  "Smart HBA H240 in Slot 2 (RAID Mode)": {
    "Slot": "2"
  }
}
```

4. There is a special case in _get_key_value function for "physicaldrive" parameters. But, with the current implementation all drives except last will be lost ("physicaldrive" is used as a key for all of them):

```
Smart HBA H240 in Slot 2 (RAID Mode)
   Slot: 2
   Physical Drives
      physicaldrive 2I:1:5 (port 2I:box 1:bay 5, SAS HDD, 900 GB, OK)
      physicaldrive 2I:1:8 (port 2I:box 1:bay 8, SAS HDD, 900 GB, OK)
```

result:

```
{
  "Smart HBA H240 in Slot 2 (RAID Mode)": {
    "Slot": "2",
    "Physical Drives": {
      "physicaldrive": "2I:1:8"
    }
  }
}
```

5. Maps update done in way that new map replaces an existing one. For example, when physical drives are in different cages:

```Smart HBA H240 in Slot 1 (RAID Mode)
   Slot: 1

   Internal Drive Cage at Port 1I, Box 1, OK
      Drive Bays: 4
      Port: 1I
      Box: 1

   Physical Drives
      physicaldrive 1I:1:4 (port 1I:box 1:bay 4, SAS HDD, 900 GB, OK)
      physicaldrive 1I:1:3 (port 1I:box 1:bay 3, SAS HDD, 900 GB, OK)

   Internal Drive Cage at Port 2I, Box 1, OK
      Drive Bays: 4
      Port: 2I
      Box: 1

   Physical Drives
      physicaldrive 2I:1:5 (port 2I:box 1:bay 5, SAS HDD, 900 GB, OK)
      physicaldrive 2I:1:6 (port 2I:box 1:bay 6, SAS HDD, 900 GB, OK)
```

result:

```
{
  "Smart HBA H240 in Slot 1 (RAID Mode)": {
    "Slot": "1",
    "Physical Drives": {
      "physicaldrive": "2I:1:6"
    },
    "Internal Drive Cage at Port 2I, Box 1, OK": {
      "Box": "1",
      "Drive Bays": "4",
      "Port": "2I"
    },
    "Internal Drive Cage at Port 1I, Box 1, OK": {
      "Box": "1",
      "Drive Bays": "4",
      "Port": "1I"
    }
  }
}
```

Revision history for this message
Anton Antonov (anta-nok) wrote :
Revision history for this message
Anton Antonov (anta-nok) wrote :

Please also find full ssacli output ("ssacli controller all show config detail") from HP Gen9 I used for testing

Revision history for this message
Anton Antonov (anta-nok) wrote :

A pull request with all the fixes is created https://github.com/openstack/proliantutils/pull/1

Anton Antonov (anta-nok)
Changed in proliantutils:
status: New → In Progress
assignee: nobody → Anton Antonov (anta-nok)
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to proliantutils (master)

Fix proposed to branch: master
Review: https://review.openstack.org/595835

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to proliantutils (master)

Reviewed: https://review.openstack.org/595835
Committed: https://git.openstack.org/cgit/openstack/proliantutils/commit/?id=fa95cd0218485c420a05cf3bc055ddf559f73297
Submitter: Zuul
Branch: master

commit fa95cd0218485c420a05cf3bc055ddf559f73297
Author: Anton Antonov <email address hidden>
Date: Thu Aug 23 17:50:05 2018 +0100

    Fix ssacli output parsing. New untitest for the fixed issue added.

    The fixed issues are:
    1. _get_dict function doesn't correctly returns from multi-level recursion depth.
    2. Incorrect parsing parameters with ':' in name.
    3. _get_dict function can return premature if it's not in recursion.
    4. "Physical Drives" map si not correctly filled in and updated.
    5. Reduced complexity of _get_dict function

    Details of issues with examples: https://bugs.launchpad.net/proliantutils/+bug/1788434

    Change-Id: Ica8c6e7485e331b083f59eabf2b67e8043d7a90b
    Implements: Fix ssacli output parsing.
    Closes-Bug: #1788434

Changed in proliantutils:
status: In Progress → Fix Released
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to proliantutils (stable/rocky)

Fix proposed to branch: stable/rocky
Review: https://review.opendev.org/656545

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to proliantutils (stable/rocky)

Reviewed: https://review.opendev.org/656545
Committed: https://git.openstack.org/cgit/x/proliantutils/commit/?id=6fd6bef558d93d494654454b0029cf59aeddc7df
Submitter: Zuul
Branch: stable/rocky

commit 6fd6bef558d93d494654454b0029cf59aeddc7df
Author: Anton Antonov <email address hidden>
Date: Thu Aug 23 17:50:05 2018 +0100

    Fix ssacli output parsing. New untitest for the fixed issue added.

    The fixed issues are:
    1. _get_dict function doesn't correctly returns from multi-level recursion depth.
    2. Incorrect parsing parameters with ':' in name.
    3. _get_dict function can return premature if it's not in recursion.
    4. "Physical Drives" map si not correctly filled in and updated.
    5. Reduced complexity of _get_dict function

    Details of issues with examples: https://bugs.launchpad.net/proliantutils/+bug/1788434

    Change-Id: Ica8c6e7485e331b083f59eabf2b67e8043d7a90b
    Implements: Fix ssacli output parsing.
    Closes-Bug: #1788434
    (cherry picked from commit fa95cd0218485c420a05cf3bc055ddf559f73297)

tags: added: in-stable-rocky
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to proliantutils (stable/queens)

Fix proposed to branch: stable/queens
Review: https://review.opendev.org/675557

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to proliantutils (stable/pike)

Fix proposed to branch: stable/pike
Review: https://review.opendev.org/675558

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to proliantutils (stable/queens)

Reviewed: https://review.opendev.org/675557
Committed: https://git.openstack.org/cgit/x/proliantutils/commit/?id=28bb9191ea0819f3d00774501d911c5145a0db69
Submitter: Zuul
Branch: stable/queens

commit 28bb9191ea0819f3d00774501d911c5145a0db69
Author: Anton Antonov <email address hidden>
Date: Thu Aug 23 17:50:05 2018 +0100

    Fix ssacli output parsing. New untitest for the fixed issue added.

    The fixed issues are:
    1. _get_dict function doesn't correctly returns from multi-level recursion depth.
    2. Incorrect parsing parameters with ':' in name.
    3. _get_dict function can return premature if it's not in recursion.
    4. "Physical Drives" map si not correctly filled in and updated.
    5. Reduced complexity of _get_dict function

    Details of issues with examples: https://bugs.launchpad.net/proliantutils/+bug/1788434

    Change-Id: Ica8c6e7485e331b083f59eabf2b67e8043d7a90b
    Implements: Fix ssacli output parsing.
    Closes-Bug: #1788434
    (cherry picked from commit fa95cd0218485c420a05cf3bc055ddf559f73297)

tags: added: in-stable-queens
tags: added: in-stable-pike
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to proliantutils (stable/pike)

Reviewed: https://review.opendev.org/675558
Committed: https://git.openstack.org/cgit/x/proliantutils/commit/?id=cfd2686d35e3cb119b2973d7cc9a671bdd8b23a5
Submitter: Zuul
Branch: stable/pike

commit cfd2686d35e3cb119b2973d7cc9a671bdd8b23a5
Author: Anton Antonov <email address hidden>
Date: Thu Aug 23 17:50:05 2018 +0100

    Fix ssacli output parsing. New untitest for the fixed issue added.

    The fixed issues are:
    1. _get_dict function doesn't correctly returns from multi-level recursion depth.
    2. Incorrect parsing parameters with ':' in name.
    3. _get_dict function can return premature if it's not in recursion.
    4. "Physical Drives" map si not correctly filled in and updated.
    5. Reduced complexity of _get_dict function

    Details of issues with examples: https://bugs.launchpad.net/proliantutils/+bug/1788434

    Change-Id: Ica8c6e7485e331b083f59eabf2b67e8043d7a90b
    Implements: Fix ssacli output parsing.
    Closes-Bug: #1788434
    (cherry picked from commit fa95cd0218485c420a05cf3bc055ddf559f73297)

To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.