system host-disk-wipe does not update disk-partition-list inventory (only updated available_gib on the disk-list)

Bug #1844363 reported by Wendy Mitchell
This bug affects 1 person
Affects: StarlingX
Status: Confirmed
Importance: Low
Assigned to: Unassigned
Milestone: (none listed)

Bug Description

Brief Description
-----------------
system host-disk-wipe restores the disk's available size, but the partition on the wiped disk is still listed
(it does not report the disk change back to the conductor to update the inventory)

Severity
--------
Major: cmd doesn't work

Steps to Reproduce
------------------
1. Lock controller-1 (optional)
2. system host-disk-list controller-1
This is one of the disks, with 447.128 available_gib:

| 8f58bae7-b2ad-4dc6-b702-05ed982435ee | /dev/sdd | 2096 | SSD | 447.13 | 447.128 | N/A | PHWA542200SY480FGN | /dev/disk/by-path/pci-0000:00:1f.2-ata-5.0 |

3. Create a new partition on this disk, e.g. with size 40 GiB.
As expected, the available_gib is reduced to 407.128.

$ system host-disk-list controller-1
...
| 8f58bae7-b2ad-4dc6-b702-05ed982435ee | /dev/sdd | 2096 | SSD | 447. | 407.128 | N/A | PHWA542200SY | /dev/dis
| | | | | 13 | | | 480FGN | .2-ata-5

$ system host-disk-partition-list controller-1
| uuid | device_path | device_node | type_guid | type_name | size_gib | status |
| bb74e25a-5083-441c-b3b9-d768c1d18a84 | /dev/disk/by-path/pci-0000:00:1f.2-ata-5.0-part1 | /dev/sdd1 | ba5eba11-0000-1111-2222-000000000001 | LVM Physical Volume | 40.0 | Ready |

4. Perform a wipe disk operation on the disk
$ system host-disk-wipe controller-1 8f58bae7-b2ad-4dc6-b702-05ed982435ee
WARNING: This operation is irreversible and all data on the specified disk will be lost.
Continue [yes/N]: yes
None

5. Notice that the available_gib has returned to 447.128, as expected.

$ system host-disk-list controller-1
| uuid | device_node | device_num | device_type | size_gib | available_gib | rpm | serial_id | device_path
...
| 8f58bae7-b2ad-4dc6-b702-05ed982435ee | /dev/sdd | 2096 | SSD | 447.13 | 447.128 | N/A | PHWA542200SY480FGN | /dev/disk/by-path/pci-0000:00:1f.2-ata-5.0 |

6. However, the disk partition is still listed (with size_gib 40.0).

$ system host-disk-partition-list controller-1
+--------------------------------------+-----------------------------------+-------------+--------------------------------------+---------------------+----------+--------+
| uuid | device_path | device_node | type_guid | type_name | size_gib | status |
+--------------------------------------+-----------------------------------+-------------+--------------------------------------+---------------------+----------+--------+
| bb74e25a-5083-441c-b3b9-d768c1d18a84 | /dev/disk/by-path/pci-0000:00:1f. | /dev/sdd1 | ba5eba11-0000-1111-2222-000000000001 | LVM Physical Volume | 40.0 | Ready |
| | 2-ata-5.0-part1 | | | | | |
| | | | | | | |
+--------------------------------------+-----------------------------------+-------------+--------------------------------------+---------------------
[sysadmin@controller-0 ~(keystone_admin)]$ date
Tue Sep 17 14:36:42 UTC 2019
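
The CLI tables in the steps above wrap long cells over several lines, which makes them awkward to compare by eye or in a script. The following is a minimal sketch, not part of the original report, of how such output could be re-joined into one record per row. The function name `parse_cli_table` and the `space_joined` parameter are hypothetical, and the sketch assumes the table has a top border, that a body line with an empty first cell continues the previous row, and that only `type_name` was wrapped at a space.

```python
def parse_cli_table(text, space_joined=("type_name",)):
    """Parse a wrapped ASCII table into a list of dicts (one per logical row).

    Assumes: top border, header line(s), second border, then body lines;
    a body line whose first cell is empty continues the previous row.
    """
    header = []   # column names, re-joined across wrapped header lines
    rows = []     # each row is a list of per-column fragment lists
    borders = 0
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("+"):
            borders += 1
            continue
        if not (line.startswith("|") and line.endswith("|")):
            continue  # skip prompts, "..." and truncated lines
        cells = [c.strip() for c in line[1:-1].split("|")]
        if borders < 2:
            # Header names are wrapped mid-word, so join without a separator.
            header = [h + c for h, c in zip(header, cells)] if header else cells
        elif cells[0]:
            rows.append([[c] for c in cells])
        elif rows:
            for fragments, c in zip(rows[-1], cells):
                if c:
                    fragments.append(c)
    result = []
    for row in rows:
        record = {}
        for name, fragments in zip(header, row):
            sep = " " if name in space_joined else ""
            record[name] = sep.join(fragments)
        result.append(record)
    return result


# Sample modelled on the partition listing from step 3 of this report.
SAMPLE = """
+---+---+---+---+---+---+---+
| uuid | device_path | device_nod | type_guid | type_nam | size_ | status |
| | | e | | e | gib | |
+---+---+---+---+---+---+---+
| bb74e25a-5083-441c-b3b9-d768c1d18a84 | /dev/disk/by-path/pci-0000: | /dev/sdd1 | ba5eba11-0000-1111-2222-000000000001 | LVM | 40.0 | Ready |
| | 00:1f.2-ata-5.0-part1 | | | Physical | | |
| | | | | Volume | | |
+---+---+---+---+---+---+---+
"""

rows = parse_cli_table(SAMPLE)
```

Joining fragments without a separator keeps paths and numbers intact; columns whose values contain spaces have to be named explicitly because the wrapped output does not say where a space was dropped.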

Expected Behavior
------------------
The partition should no longer be listed in the inventory after the wipe
(the disk wipe and the resulting disk change should be reported back to the conductor so that the inventory is updated)
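
One way to express the mismatch numerically is sketched below, using only the figures from the steps to reproduce. It assumes that available_gib counts the space not taken by the partitions shown in host-disk-partition-list, so that available_gib plus the listed partition sizes should not exceed size_gib. The function name and the tolerance value are hypothetical.

```python
def inventory_is_consistent(size_gib, available_gib, partition_sizes_gib,
                            tolerance_gib=0.01):
    """Return True if free space plus listed partitions fits on the disk.

    Assumption: every partition passed in belongs to this disk and is
    counted against its available_gib.
    """
    return available_gib + sum(partition_sizes_gib) <= size_gib + tolerance_gib


# /dev/sdd after creating the 40 GiB partition (step 3).
before_wipe = inventory_is_consistent(447.13, 407.128, [40.0])

# /dev/sdd after the wipe (steps 5 and 6): available_gib is back to
# 447.128 but the 40 GiB partition is still listed.
after_wipe = inventory_is_consistent(447.13, 447.128, [40.0])
```

Under that assumption the inventory is consistent before the wipe and inconsistent after it, which matches the reported behavior.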

Actual Behavior
----------------
The disk is wiped, but the disk partition inventory is not updated.

Reproducibility
---------------
yes

System Configuration
--------------------
hw: 2+2+n
PV-1
wcp_113-121

Branch/Pull Time/Commit
-----------------------
master as of 2019-09-11_20-00-00

Last Pass
---------
n/a

Timestamp/Logs
--------------
~14:36:40

Test Activity
-------------

description: updated
Wendy Mitchell (wmitchellwr) wrote :
tags: added: stx.retestneeded
Ghada Khalil (gkhalil) wrote :

Marking as stx.3.0 / medium priority - disk-specific cmd not working.

description: updated
tags: added: stx.3.0 stx.storage
Changed in starlingx:
importance: Undecided → Medium
status: New → Triaged
assignee: nobody → Cindy Xie (xxie1)
Cindy Xie (xxie1)
Changed in starlingx:
assignee: Cindy Xie (xxie1) → Tingjie Chen (silverhandy)
Tingjie Chen (silverhandy) wrote :

I have tried the image 20191120T023000Z, but cannot reproduce the issue.
@Wendy, can you verify it with the latest image?

$ system host-disk-list compute-0
+--------------------------------------+-------------+------------+-------------+----------+---------------+--------------+-----------+--------------------------------------------+
| uuid | device_node | device_num | device_type | size_gib | available_gib | rpm | serial_id | device_path |
+--------------------------------------+-------------+------------+-------------+----------+---------------+--------------+-----------+--------------------------------------------+
| cb52cb30-fd73-4ec3-b852-7cde53794cb5 | /dev/sda | 2048 | HDD | 200.0 | 100.976 | Undetermined | QM00005 | /dev/disk/by-path/pci-0000:00:03.0-ata-1.0 |
| cd28624c-a15a-4b4a-9fa4-f65056d08667 | /dev/sdb | 2064 | HDD | 30.0 | 29.997 | Undetermined | QM00007 | /dev/disk/by-path/pci-0000:00:03.0-ata-2.0 |

$ system host-disk-partition-list compute-0
+--------------------------------------+--------------------------------------------------+-------------+--------------------------------------+---------------------+----------+--------+
| uuid | device_path | device_node | type_guid | type_name | size_gib | status |
+--------------------------------------+--------------------------------------------------+-------------+--------------------------------------+---------------------+----------+--------+
| abdd328d-a1ee-44de-91e1-08c28f93f1fe | /dev/disk/by-path/pci-0000:00:03.0-ata-1.0-part5 | /dev/sda5 | ba5eba11-0000-1111-2222-000000000001 | LVM Physical Volume | 10.0 | In-Use |
| 45316577-2aa3-4e61-8184-2144e3178265 | /dev/disk/by-path/pci-0000:00:03.0-ata-2.0-part1 | /dev/sdb1 | ba5eba11-0000-1111-2222-000000000001 | LVM Physical Volume | 10.0 | Ready |
+--------------------------------------+--------------------------------------------------+-------------+--------------------------------------+---------------------+----------+--------+

# After creating a 10 GiB partition, the available size of sdb decreases to 19.997 GiB.
[sysadmin@controller-0 ~(keystone_admin)]$ system host-disk-list compute-0
+--------------------------------------+-------------+------------+-------------+----------+---------------+--------------+-----------+--------------------------------------------+
| uuid | device_node | device_num | device_type | size_gib | available_gib | rpm | serial_id | device_path |
+--------------------------------------+-------------+------------+-------------+----------+---------------+--------------+-----------+--------------------------------------------+
| cb52cb30-fd73-4ec3-b852-7cde53794cb5 | /dev/sda | 2048 | HDD | 200.0 | 100.976 | Undetermined | QM00005 | /dev/disk/by-path/pci-0000:00:03.0-ata-1.0 |
| cd28624c-a15a-4b4a-9fa4-f65056d08667 | /dev/sdb | 2064 | HDD | 30.0 |...

Wendy Mitchell (wmitchellwr) wrote :

Retested on wcp 92-98 (2019-11-22_20-00-00) with NVMe disks (a system with 2 controllers, 2 storage nodes and 3 workers). It seems okay: an inventory update appears to happen within a minute of the disk wipe operation (see below).

I will also retry shortly with the same lab and disk types as reported above.

$ system host-disk-list controller-1
+--------------------------------------+--------------+---------+---------+-------+------------+-----+----------+-------------------------------------------+
| uuid | device_node | device_ | device_ | size_ | available_ | rpm | serial_i | device_path |
| | | num | type | gib | gib | | d | |
+--------------------------------------+--------------+---------+---------+-------+------------+-----+----------+-------------------------------------------+
| 8b612441-f692-4e3d-af8e-f7d56abeb8bc | /dev/nvme0n1 | 66304 | NVME | 372. | 0.0 | N/A | CVFT6092 | /dev/disk/by-path/pci-0000:85:00.0-nvme-1 |
| | | | | 611 | | | 003D400G | |
| | | | | | | | GN | |
| | | | | | | | | |
| 6bb534ea-b3b5-44e3-86c3-4bdfe88b51f6 | /dev/nvme1n1 | 66309 | NVME | 372. | 372.609 | N/A | CVFT6092 | /dev/disk/by-path/pci-0000:86:00.0-nvme-1 |
| | | | | 611 | | | 006W400G | |
| | | | | | | | GN | |
| | | | | | | | | |
+--------------------------------------+--------------+---------+---------+-------+------------+-----+----------+-------------------------------------------+
[sysadmin@controller-0 ~(keystone_admin)]$ system host-disk-partition-list controller-1
+--------------------------------------+-----------------------------+----------------+--------------------------------------+--------+----------+----------+
| uuid | device_path | device_node | type_guid | type_n | size_gib | status |
| | | | | ame | | |
+--------------------------------------+-----------------------------+----------------+--------------------------------------+--------+----------+----------+
| 18c97ebb-c890-486e-8ee0-024fac535d99 | /dev/disk/by-path/pci-00...

Wendy Mitchell (wmitchellwr) wrote :

Also verified that this is now working on PV-1: the partition is no longer listed in the inventory within a minute of the disk wipe.

$ system host-disk-list controller-1
+--------------------------------------+-----------+---------+---------+-------+------------+------+-------------------+--------------------------------------------+
| uuid | device_no | device_ | device_ | size_ | available_ | rpm | serial_id | device_path |
| | de | num | type | gib | gib | | | |
+--------------------------------------+-----------+---------+---------+-------+------------+------+-------------------+--------------------------------------------+
| 81643fea-84a0-4782-9620-150b7d2b120b | /dev/sda | 2048 | HDD | 931. | 931.51 | 7200 | Z1W4GJYT | /dev/disk/by-path/pci-0000:00:1f.2-ata-1.0 |
| | | | | 512 | | | | |
| | | | | | | | | |
| e4414140-298a-459a-aaf7-37f06e131718 | /dev/sdb | 2064 | SSD | 447. | 0.0 | N/A | PHWA60620031480FG | /dev/disk/by-path/pci-0000:00:1f.2-ata-2.0 |
| | | | | 13 | | | N | |
| | | | | | | | | |
| 28476e4b-7690-4fae-91ce-cee6f64870c1 | /dev/sdc | 2080 | SSD | 745. | 745.209 | N/A | BTWA542304ET800HG | /dev/disk/by-path/pci-0000:00:1f.2-ata-3.0 |
| | | | | 211 | | | N | |
| | | | | | | | | |
| 0e276541-5246-4ec8-8950-262750247e9a | /dev/sdd | 2096 | SSD | 447. | 407.128 | N/A | PHWA542200SY480FG | /dev/disk/by-path/pci-0000:00:1f.2-ata-5.0 |
| | | | | 13 | | | N | |
| | | | | | | | | |
+--------------------------------------+-----------+---------+---------+-------+------------+------+-------------------+--------------------------------------------+
[sysadmin@controller-0 ~(keystone_admin)]$ system host-disk-partition-list controller-1
+--------------------------------------+------------------------------------------+---------...
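
Because the retests found that the inventory catches up within about a minute of the wipe, a check run immediately afterwards can give a false failure. The following is a minimal sketch of polling with a timeout, assuming the caller supplies a function that returns the partition UUIDs currently listed (for example by parsing the host-disk-partition-list output). The names `wait_until_gone` and `list_partition_uuids`, the 60 second timeout and the 5 second interval are hypothetical choices, not values taken from StarlingX.

```python
import time


def wait_until_gone(list_partition_uuids, uuid, timeout_s=60.0,
                    interval_s=5.0, sleep=time.sleep, clock=time.monotonic):
    """Poll until `uuid` is no longer listed; return False on timeout.

    `list_partition_uuids` is a caller-supplied callable returning the
    UUIDs currently in the partition inventory.
    """
    deadline = clock() + timeout_s
    while True:
        if uuid not in list_partition_uuids():
            return True
        if clock() >= deadline:
            return False
        sleep(interval_s)


PART = "bb74e25a-5083-441c-b3b9-d768c1d18a84"

# Simulated inventory: the partition is listed twice, then disappears.
snapshots = iter([{PART}, {PART}, set()])
removed = wait_until_gone(lambda: next(snapshots), PART,
                          interval_s=0, sleep=lambda s: None)

# Simulated stuck inventory: the partition is never removed.
stuck = wait_until_gone(lambda: {PART}, PART, timeout_s=0,
                        interval_s=0, sleep=lambda s: None)
```

Passing `sleep` and `clock` as parameters keeps the sketch testable without real delays.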

tags: removed: stx.3.0 stx.retestneeded
Wendy Mitchell (wmitchellwr) wrote :

PV-1, 2019-11-22_20-00-00

Austin Sun (sunausti) wrote :

Wendy:

Thanks for confirming. Could we close this bug now?

Wendy Mitchell (wmitchellwr) wrote :

Yes, consider it closed now.

Ghada Khalil (gkhalil) wrote :

Closing. Issue is not reproducible in recent loads.

Changed in starlingx:
status: Triaged → Invalid
Wendy Mitchell (wmitchellwr) wrote :

I am now able to reproduce this issue.
2020-05-22_20-00-00

$ system host-list
+----+--------------+-------------+----------------+-------------+--------------+
| id | hostname | personality | administrative | operational | availability |
+----+--------------+-------------+----------------+-------------+--------------+
| 1 | controller-0 | controller | unlocked | enabled | available |
| 2 | compute-2 | worker | unlocked | enabled | available |
| 3 | compute-1 | worker | unlocked | enabled | available |
| 4 | storage-0 | storage | unlocked | enabled | available |
| 5 | compute-0 | worker | unlocked | enabled | available |
| 6 | storage-1 | storage | unlocked | enabled | available |
| 7 | controller-1 | controller | unlocked | enabled | available |

1. Created 2 partitions (one of 11 GiB, the other of 9 GiB) on the disk on controller-1

$ system host-disk-partition-list controller-1
+--------------------------------------+-----------------------------+----------------+--------------------------------------+----------+-------+--------+
| uuid | device_path | device_node | type_guid | type_nam | size_ | status |
| | | | | e | gib | |
+--------------------------------------+-----------------------------+----------------+--------------------------------------+----------+-------+--------+
| 45d21e67-6f8d-4cc2-abcf-f315454f9b61 | /dev/disk/by-path/pci-0000: | /dev/nvme1n1p1 | ba5eba11-0000-1111-2222-000000000001 | LVM | 11.0 | Ready |
| | 86:00.0-nvme-1-part1 | | | Physical | | |
| | | | | Volume | | |
| | | | | | | |
| e4610dc0-10be-447d-8e3c-0237c3667f5a | /dev/disk/by-path/pci-0000: | /dev/nvme1n1p2 | ba5eba11-0000-1111-2222-000000000001 | LVM | 9.0 | Ready |
| | 86:00.0-nvme-1-part2 | | | Physical | | |
| | | | | Volume | | |
| | | | | | |

The available_gib was reduced to 352.609, as expected.

[sysadmin@controller-1 ~(keystone_admin)]$ system host-disk-list controller-1
+--------------------------------------+--------------+---------+---------+-------+------------+-----+----------+--------------------------------------+
| uuid ...

Wendy Mitchell (wmitchellwr) wrote :
Changed in starlingx:
status: Invalid → Confirmed
tags: added: stx.retestneeded
Ghada Khalil (gkhalil) wrote :

Lowering the priority as a user would only use wipe-disk before re-installing the node. This may only be a display issue.

Changed in starlingx:
importance: Medium → Low
assignee: Tingjie Chen (silverhandy) → nobody
Ghada Khalil (gkhalil) wrote :

Unassigning since the current assignee has not been active since Nov 2019

Wendy Mitchell (wmitchellwr) wrote :

This is definitely not just a display issue, as noted in my comment above. The CLI output shows that, of the partitions created, only the most recent one was removed from the host-disk-partition-list.

Long after the disk wipe operation, the first of the two partitions created still has not been removed from the inventory.

$ system host-disk-partition-list controller-1
+--------------------------------------+-----------------------------+----------------+--------------------------------------+----------+-------+--------+
| uuid | device_path | device_node | type_guid | type_nam | size_ | status |
| | | | | e | gib | |
+--------------------------------------+-----------------------------+----------------+--------------------------------------+----------+-------+--------+
| 45d21e67-6f8d-4cc2-abcf-f315454f9b61 | /dev/disk/by-path/pci-0000: | /dev/nvme1n1p1 | ba5eba11-0000-1111-2222-000000000001 | LVM | 11.0 | Ready |
| | 86:00.0-nvme-1-part1 | | | Physical | | |
| | | | | Volume | | |
| | | | |
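
A sketch of how the leftover entry could be flagged automatically follows. It relies on the pattern visible in the listings in this report, where a partition's device_path is the disk's device_path followed by "-partN". The function name `stale_partitions` is hypothetical, the disk path used here is inferred from the partition path rather than taken from a disk listing, and the second record is an unrelated entry from an earlier comment included only as a negative example.

```python
def stale_partitions(disk_device_path, partitions):
    """Return inventory records that still point at a wiped disk.

    Assumption: partition paths have the form "<disk path>-part<N>".
    """
    prefix = disk_device_path + "-part"
    return [p for p in partitions if p["device_path"].startswith(prefix)]


# Disk path inferred from the partition paths above (assumption).
WIPED_DISK = "/dev/disk/by-path/pci-0000:86:00.0-nvme-1"

# Inventory after the wipe, as reported in this comment, plus one
# partition on a different disk for contrast.
after_wipe = [
    {"uuid": "45d21e67-6f8d-4cc2-abcf-f315454f9b61",
     "device_path": "/dev/disk/by-path/pci-0000:86:00.0-nvme-1-part1"},
    {"uuid": "45316577-2aa3-4e61-8184-2144e3178265",
     "device_path": "/dev/disk/by-path/pci-0000:00:03.0-ata-2.0-part1"},
]

leftover = [p["uuid"] for p in stale_partitions(WIPED_DISK, after_wipe)]
```

Matching on the "-part" suffix rather than on the bare disk path avoids confusing disks whose paths share a prefix.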

Ghada Khalil (gkhalil) wrote :

Added the stx.5.0 tag on this LP for now. Further investigation is required, but given that this was also seen in stx.3.0, it will not hold up the upcoming stx.4.0 release.

tags: added: stx.5.0
Ghada Khalil (gkhalil) wrote :

Lowering the priority as discussed with Frank Miller, storage prime.

tags: removed: stx.5.0