MPATH DC system upgrade: disk partition exceptions in sysinv.log

Bug #2048795 reported by Lucas Borges
Affects    Status        Importance  Assigned to   Milestone
StarlingX  Fix Released  Low         Lucas Borges

Bug Description

Brief Description

Noticed DiskNotFound exceptions in sysinv.log during the upgrade.

The upgrade appears to be left in this state:
sysinv 2023-03-28 17:09:52.533 158668 INFO sysinv.conductor.manager [-] Upgrade in progress - defer platform managed application activity
sysinv 2023-03-28 17:10:52.534 158668 INFO sysinv.conductor.manager [-] Upgrade in progress - defer platform managed application activity
...
sysinv 2023-03-28 19:30:53.004 158668 INFO sysinv.conductor.manager [-] Upgrade in progress - defer platform managed application activity

Severity

<Minor: System/Feature is usable with minor issue>

Steps to Reproduce

HPE

(Note: I believe partitions had been created on the controllers prior to the upgrade.)

[Upgrade controller-0 here]
2023-03-28T16:34:23.148 controller-1 -bash: info HISTORY: PID=166700 UID=42425 system host-upgrade controller-0

[Looks like unlock of controller here]
2023-03-28T17:00:30.377 controller-1 -bash: info HISTORY: PID=166700 UID=42425 system host-unlock controller-0

Expected Behavior

The upgrade should complete without errors in sysinv.log.

Actual Behavior

[Error in sysinv.log here]

96d24e24845f
sysinv 2023-03-28 17:08:47.247 103812 INFO sysinv.agent.manager [-] Sysinv Agent platform update by host:
{'availability': 'available', 'config_applied': '6e9e0d4a-0ea2-4b80-a46e-4470b9719df4', 'iscsi_initiator_name': 'iqn.1993-08.org.debian:01:96d24e24845f', 'first_report': True}

sysinv 2023-03-28 17:08:50.143 103812 ERROR sysinv.openstack.common.periodic_task [-] Error during AgentManager._inventory_audit: Remote error: DiskNotFound No disk with id 46fd21bf-4407-491d-9870-7fcf057dfe89
Traceback (most recent call last):
File "/usr/lib/python3/dist-packages/sysinv/db/sqlalchemy/api.py", line 2854, in _partition_get
result = query.one()
File "/usr/lib/python3/dist-packages/sqlalchemy/orm/query.py", line 3500, in one
raise orm_exc.NoResultFound("No row was found for one()")
sqlalchemy.orm.exc.NoResultFound: No row was found for one()

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "/usr/lib/python3/dist-packages/sysinv/conductor/manager.py", line 5003, in ipv_update_by_ihost
partition = self.dbapi.partition_get(
....

sysinv.log
sysinv 2023-03-28 17:08:49.781 158668 INFO sysinv.conductor.manager [-] Partition create on host: 1. Details:
{'forihostid': 1, 'status': 1, 'device_node': '/dev/mapper/mpathb-part1', 'device_path': '/dev/disk/by-id/wwn-0x60002ac0000000000000000800029aaa-part1', 'start_mib': '1', 'end_mib': '581633', 'size_mib': '581632', 'type_guid': 'ba5eba11-0000-1111-2222-000000000001', 'type_name': 'LVM Physical Volume', 'uuid': '8f5e7cb1-ccc1-4055-b997-f92bbadd812c', 'idisk_id': 11, 'idisk_uuid': '7d626936-6cf8-419d-8a9b-20fced07e07c'}

sysinv 2023-03-28 17:08:49.961 158668 INFO sysinv.conductor.manager [-] LVG uuid: 4e47e388-ce5b-4ca0-ab25-5ab2c142cdd5 changed UUID from hzFevZ-dY92-miB4-Z5Bl-GfU2-4znE-qZ4mmL to fpqOK8-Y92q-yEid-ZtGt-Ks8A-x3pn-D9498C
sysinv 2023-03-28 17:08:50.140 158668 ERROR zerorpc.core [-] : sysinv.common.exception.DiskNotFound: No disk with id 46fd21bf-4407-491d-9870-7fcf057dfe89
2023-03-28 17:08:50.140 158668 ERROR zerorpc.core Traceback (most recent call last):
2023-03-28 17:08:50.140 158668 ERROR zerorpc.core File "/usr/lib/python3/dist-packages/sysinv/db/sqlalchemy/api.py", line 2854, in _partition_get
2023-03-28 17:08:50.140 158668 ERROR zerorpc.core result = query.one()
2023-03-28 17:08:50.140 158668 ERROR zerorpc.core File "/usr/lib/python3/dist-packages/sqlalchemy/orm/query.py", line 3500, in one
2023-03-28 17:08:50.140 158668 ERROR zerorpc.core raise orm_exc.NoResultFound("No row was found for one()")
2023-03-28 17:08:50.140 158668 ERROR zerorpc.core sqlalchemy.orm.exc.NoResultFound: No row was found for one()
2023-03-28 17:08:50.140 158668 ERROR zerorpc.core
2023-03-28 17:08:50.140 158668 ERROR zerorpc.core During handling of the above exception, another exception occurred:
2023-03-28 17:08:50.140 158668 ERROR zerorpc.core
2023-03-28 17:08:50.140 158668 ERROR zerorpc.core Traceback (most recent call last):
2023-03-28 17:08:50.140 158668 ERROR zerorpc.core File "/usr/lib/python3/dist-packages/sysinv/conductor/manager.py", line 5003, in ipv_update_by_ihost
2023-03-28 17:08:50.140 158668 ERROR zerorpc.core partition = self.dbapi.partition_get(
2023-03-28 17:08:50.140 158668 ERROR zerorpc.core File "/usr/lib/python3/dist-packages/sysinv/db/sqlalchemy/objects.py", line 29, in wrapper
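
The "During handling of the above exception, another exception occurred" lines in the tracebacks above are what Python prints when a new exception is raised inside an `except` block. The traceback is truncated, so the exact place where DiskNotFound is raised is not shown here; the following is only a minimal sketch of that general pattern. The classes are local stand-ins (no SQLAlchemy or sysinv import), and `PARTITIONS`, `query_one` and `partition_get` are hypothetical names for illustration, not the sysinv implementation.

```python
class NoResultFound(Exception):
    """Stand-in for sqlalchemy.orm.exc.NoResultFound."""


class DiskNotFound(Exception):
    """Stand-in for sysinv.common.exception.DiskNotFound."""

    def __init__(self, disk_id):
        super().__init__("No disk with id %s" % disk_id)
        self.disk_id = disk_id


# Hypothetical in-memory lookup table; the real code queries a database.
PARTITIONS = {
    "8f5e7cb1-ccc1-4055-b997-f92bbadd812c": "/dev/mapper/mpathb-part1",
}


def query_one(partition_id):
    """Mimic query.one(): return the single match or raise NoResultFound."""
    try:
        return PARTITIONS[partition_id]
    except KeyError:
        raise NoResultFound("No row was found for one()") from None


def partition_get(partition_id):
    """Translate the low-level lookup failure into a domain-level error."""
    try:
        return query_one(partition_id)
    except NoResultFound:
        # Raising inside the except block records the original error in
        # __context__, which produces the "During handling ..." message.
        raise DiskNotFound(partition_id)


if __name__ == "__main__":
    try:
        partition_get("46fd21bf-4407-491d-9870-7fcf057dfe89")
    except DiskNotFound as exc:
        print(exc)
        print(type(exc.__context__).__name__)
```

In this sketch the id from the log is simply absent from the table, so the lookup fails in the same two-step way the log shows: first the "no row" error, then the DiskNotFound error that wraps it.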

Reproducibility

<Reproducible/Intermittent/Seen once>

State if the issue is 100% reproducible, intermittent or seen once. If it is intermittent, state the frequency of occurrence

Last Pass

TBD

Timestamp/Logs
see times inline above

Alarms
$ fm alarm-list
----------------------------------------------------------------------------------------------------------------------------------------------------------------------
Alarm ID  Reason Text                                                                        Entity ID                            Severity  Time Stamp
----------------------------------------------------------------------------------------------------------------------------------------------------------------------
280.002   subcloud1 load sync_status is out-of-sync                                          subcloud=subcloud1.resource=load     major     2023-03-28T17:32:23.376003
750.006   A configuration change requires a reapply of the cert-manager application.         k8s_application=cert-manager         warning   2023-03-28T17:09:01.352747
750.006   A configuration change requires a reapply of the platform-integ-apps application.  k8s_application=platform-integ-apps  warning   2023-03-28T17:09:01.133454
500.101   Developer patch certificate is enabled                                             host=controller                      critical  2023-03-28T16:32:15.886883
----------------------------------------------------------------------------------------------------------------------------------------------------------------------

$ system host-disk-list controller-0
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
uuid                                  device_node         device_num  device_type  size_gib  available_gib  rpm  serial_id                          device_path
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
18a50617-16bc-4640-9781-e0755c52e50a  /dev/mapper/mpatha  64768       SSD          1229.0    0.0            N/A  360002ac0000000000000000700029aaa  /dev/disk/by-id/wwn-0x60002ac0000000000000000700029aaa
7d626936-6cf8-419d-8a9b-20fced07e07c  /dev/mapper/mpathb  64773       SSD          2867.0    2298.981       N/A  360002ac0000000000000000800029aaa  /dev/disk/by-id/wwn-0x60002ac0000000000000000800029aaa
9e8aab0a-1f3a-4386-bde1-e9ade337372e  /dev/mapper/mpathc  64775       SSD          1024.0    0.0            N/A  360002ac0000000000000001d00029aaa  /dev/disk/by-id/wwn-0x60002ac0000000000000001d00029aaa
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
[sysadmin@controller-1 log(keystone_admin)]$ system host-disk-list controller-1
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
uuid                                  device_node         device_num  device_type  size_gib  available_gib  rpm  serial_id                          device_path
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
8c62f9e2-c49a-43d1-aa9f-c890b7c19ccc  /dev/mapper/mpatha  64768       SSD          1229.0    0.0            N/A  360002ac0000000000000000900029aaa  /dev/disk/by-id/wwn-0x60002ac0000000000000000900029aaa
166f228f-400a-4789-8536-83497f06293c  /dev/mapper/mpathb  64773       SSD          2867.0    2298.981       N/A  360002ac0000000000000000a00029aaa  /dev/disk/by-id/wwn-0x60002ac0000000000000000a00029aaa
a546f167-429b-4a25-a46c-87ba94ef0e81  /dev/mapper/mpathc  64775       SSD          1024.0    0.0            N/A  360002ac0000000000000001e00029aaa  /dev/disk/by-id/wwn-0x60002ac0000000000000001e00029aaa
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
[sysadmin@controller-1 log(keystone_admin)]$ system host-disk-partition-list controller-0
System platform upgrade is in progress.
The command may display the target configuration that has not yet been applied to the host.
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
uuid                                  device_path                                                   device_node               type_guid                             type_name            size_gib  status
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
baebfc2e-488e-49ed-9d33-0c413aed81ad  /dev/disk/by-id/wwn-0x60002ac0000000000000000700029aaa-part4  /dev/mapper/mpatha-part4  e6d6d379-f507-44c2-a23c-238f2a3df928  Linux LVM            1197.409  In-Use
8f5e7cb1-ccc1-4055-b997-f92bbadd812c  /dev/disk/by-id/wwn-0x60002ac0000000000000000800029aaa-part1  /dev/mapper/mpathb-part1  ba5eba11-0000-1111-2222-000000000001  LVM Physical Volume  568.0     Ready
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
[sysadmin@controller-1 log(keystone_admin)]$ system host-disk-partition-list controller-1
System platform upgrade is in progress.
The command may display the target configuration that has not yet been applied to the host.
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
uuid                                  device_path                                                   device_node               type_guid                             type_name            size_gib  status
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
f1567b03-f0ce-45e0-af08-f39a2a4c3e8c  /dev/disk/by-id/wwn-0x60002ac0000000000000000900029aaa-part4  /dev/mapper/mpatha-part4  e6d6d379-f507-44c2-a23c-238f2a3df928  Linux LVM            1197.409  In-Use
e1b5657a-9b51-4b08-bbcf-241663f583fd  /dev/disk/by-id/wwn-0x60002ac0000000000000000a00029aaa-part1  /dev/mapper/mpathb-part1  ba5eba11-0000-1111-2222-000000000001  LVM Physical Volume  568.0     In-Use
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

Test Activity

Feature Testing (Upgrade)

Workaround

Describe workaround if available

OpenStack Infra (hudson-openstack) wrote : Fix proposed to config (master)

Fix proposed to branch: master
Review: https://review.opendev.org/c/starlingx/config/+/905140

Changed in starlingx:
status: New → In Progress
OpenStack Infra (hudson-openstack) wrote : Fix merged to config (master)

Reviewed: https://review.opendev.org/c/starlingx/config/+/905140
Committed: https://opendev.org/starlingx/config/commit/f359cd29ffdf8c84da3fbad3bbc28aa94ad2c61d
Submitter: "Zuul (22348)"
Branch: master

commit f359cd29ffdf8c84da3fbad3bbc28aa94ad2c61d
Author: Lucas Borges <email address hidden>
Date: Tue Jan 9 14:54:53 2024 -0300

    Update conductor mpath partitions check

    The device node and device path for multipath differ between CentOS and Debian. On CentOS the device path used is /dev/disk/by-id/dm-uuid-*; on Debian the
    correct path is /dev/disk/by-id/wwn-*. This change updates the partition checking for Debian.

    Test Plan:

    PASS: After upgrade on DC with mpath, check
          that the partition table in sysinv has
          the correct inventory
    PASS: Fresh install AIO-DX DC

    Closes-bug: 2048795
    Signed-off-by: Lucas Borges <email address hidden>
    Change-Id: Id2b28505afe401c2dfc22e0a621c1836d3a16ee0
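
To make the path difference described in the commit message concrete, here is a minimal sketch that classifies a multipath device path by the two by-id prefixes named above and splits off the `-partN` suffix seen in the partition listings. It is not the code from the review: `classify_mpath_device_path`, `partition_number` and `parent_disk_path` are hypothetical helpers, and the assumption that a `wwn-` or `dm-uuid-` prefix alone identifies the OS style is a simplification for illustration.

```python
import re

# The two prefixes come from the commit message; everything else is hypothetical.
CENTOS_MPATH_PREFIX = "/dev/disk/by-id/dm-uuid-"
DEBIAN_MPATH_PREFIX = "/dev/disk/by-id/wwn-"

# Partition paths in the listings above end in "-part<N>".
_PART_SUFFIX = re.compile(r"-part(\d+)$")


def classify_mpath_device_path(device_path):
    """Return 'debian', 'centos', or None based on the by-id prefix."""
    if device_path.startswith(DEBIAN_MPATH_PREFIX):
        return "debian"
    if device_path.startswith(CENTOS_MPATH_PREFIX):
        return "centos"
    return None


def partition_number(device_path):
    """Return the partition number from a '-part<N>' suffix, or None."""
    match = _PART_SUFFIX.search(device_path)
    return int(match.group(1)) if match else None


def parent_disk_path(device_path):
    """Strip a trailing '-part<N>' to get the path of the owning disk."""
    return _PART_SUFFIX.sub("", device_path)


if __name__ == "__main__":
    part = "/dev/disk/by-id/wwn-0x60002ac0000000000000000800029aaa-part1"
    print(classify_mpath_device_path(part))
    print(partition_number(part))
    print(parent_disk_path(part))
```

Applied to the mpathb-part1 partition from the report, the parent path produced by this sketch matches the device_path of the mpathb disk in the host-disk-list output, which is the kind of disk-to-partition match an inventory check depends on.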

Changed in starlingx:
status: In Progress → Fix Released
Ghada Khalil (gkhalil)
Changed in starlingx:
importance: Undecided → Low
tags: added: stx.9.0 stx.storage
Changed in starlingx:
assignee: nobody → Lucas Borges (lborges)
description: updated