[MultiOS][Yocto] AIO simplex sometimes fails to unlock

Bug #1899648 reported by Jackie Huang
This bug affects 1 person
Affects: StarlingX
Status: Fix Committed
Importance: Low
Assigned to: Jackie Huang

Bug Description

Brief Description
-----------------
 AIO simplex sometimes fails to unlock

Severity
--------
<Major: System/Feature is usable but degraded>

Steps to Reproduce
------------------
1. Build the image according to https://opendev.org/starlingx/meta-starlingx/src/branch/master/README.md
2. Install AIO simplex with the built image
3. Run the Ansible playbook and unlock the host

Expected Behavior
------------------
controller-0 should be unlocked successfully.

Actual Behavior
----------------
[sysadmin@controller-0 ~(keystone_admin)]$ system host-unlock controller-0
Operation Rejected: no response. retry.
[sysadmin@controller-0 ~(keystone_admin)]$ system host-unlock controller-0
Interface enp7s1 on host controller-0 is configured for IPv4 static address but has no configured IPv4 address

Reproducibility
---------------
intermittent

System Configuration
--------------------
One node system, All-in-one simplex

Branch/Pull Time/Commit
-----------------------
Branch: master
Time: Sep 25 2020
Commit: 595181564502bc04028132b3af68ff551b6cbe47

Last Pass
---------

Timestamp/Logs
--------------

Test Activity
-------------

Workaround
----------

Changed in starlingx:
assignee: nobody → Jackie Huang (jackie-huang)
status: New → Confirmed
Ghada Khalil (gkhalil) wrote :

Low / doesn't gate the next release as this appears to be prep work for multi-os support

Changed in starlingx:
importance: Undecided → Low
Jackie Huang (jackie-huang) wrote :

No addresses found:
system -v --debug host-addr-list controller-0

reply: 'HTTP/1.1 200 OK\r\n'
header: Content-Length: 17
header: Content-Type: application/json
header: Date: Fri, 23 Oct 2020 01:06:07 GMT
DEBUG (http:157) RESP: {"addresses": []}
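
The empty result can also be detected from a script. This is a minimal sketch, assuming the JSON body from the RESP line above is available on standard input. The helper name `addresses_empty` is hypothetical and not part of the `system` CLI, and it does a plain text match rather than real JSON parsing.

```shell
# addresses_empty: exit 0 if the JSON read from stdin contains an empty
# "addresses" list, exit non-zero otherwise.
# Hypothetical helper: strips whitespace, then looks for "addresses":[]
addresses_empty() {
    tr -d '[:space:]' | grep -q '"addresses":\[\]'
}

# Example with the response body reported above:
if printf '%s' '{"addresses": []}' | addresses_empty; then
    echo "no addresses configured"
fi
```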

Changed in starlingx:
status: Confirmed → In Progress
Jackie Huang (jackie-huang) wrote :

Possible root cause:

user.log:2020-10-23T01:31:30.759 localhost systemd-coredump[669905]: crit Process 669903 (mtcAgent) of user 0 dumped core.
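
To check whether a given system hit the same crash, the log can be searched for similar entries. A rough sketch, assuming the message format matches the line quoted above; the helper name `mtcagent_coredumps` and the default path `/var/log/user.log` are assumptions, so pass the actual log path as the first argument.

```shell
# mtcagent_coredumps: print systemd-coredump entries for mtcAgent found in a
# log file. The default path is an assumption; pass the real one as $1.
mtcagent_coredumps() {
    log_file=${1:-/var/log/user.log}
    grep -E 'systemd-coredump\[[0-9]+\]: .*\(mtcAgent\).*dumped core' "$log_file"
}
```

For example, `mtcagent_coredumps /var/log/user.log | tail -n 1` would show the most recent matching entry, if there is one.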

Jackie Huang (jackie-huang) wrote :

Workaround:

export OCF_ROOT=/usr/lib/ocf
export OCF_RESKEY_state=active
/usr/lib/ocf/resource.d/platform/mtcAgent start
/usr/lib/ocf/resource.d/platform/mtcAgent status

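OAM_IF=enp7s1
MGMT_IF=enp7s2
# Collect the UUIDs of the interface-network entries bound to the two interfaces
IFNET_UUIDS=$(system interface-network-list controller-0 | awk -v oam="$OAM_IF" -v mgmt="$MGMT_IF" '$6 == oam || $6 == mgmt {print $4}')
for UUID in $IFNET_UUIDS; do
    system interface-network-remove "${UUID}"
done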
system interface-network-assign controller-0 $OAM_IF oam
system interface-network-assign controller-0 $MGMT_IF mgmt
system interface-network-assign controller-0 $MGMT_IF cluster-host
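
The UUID lookup in the workaround depends on whitespace-separated field positions. A slightly more defensive sketch splits the table on `|` instead. It assumes the column order hostname, uuid, ifname, network_name (inferred from the field numbers used above, not verified against the CLI), and the helper name `ifnet_uuids` is hypothetical.

```shell
# ifnet_uuids: read `system interface-network-list` table output on stdin and
# print the uuid of every row whose ifname equals one of the arguments.
# Assumed column order: hostname | uuid | ifname | network_name
ifnet_uuids() {
    awk -F'|' -v names="$*" '
        BEGIN { n = split(names, want, " ") }
        NF >= 5 {
            uuid = $3; ifname = $4
            gsub(/^ +| +$/, "", uuid)     # trim padding around the cell
            gsub(/^ +| +$/, "", ifname)
            for (i = 1; i <= n; i++)
                if (ifname == want[i]) print uuid
        }
    '
}
```

Usage would then be `system interface-network-list controller-0 | ifnet_uuids "$OAM_IF" "$MGMT_IF"`, with the output fed to the same removal loop.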

Jackie Huang (jackie-huang) wrote :

This can be reproduced with the latest:
Branch: master
Time: Nov 2 2020
Commit: 29d7ea1b698e5b38d4876a2a11555c92e8e85e4f

Changed in starlingx:
status: In Progress → Fix Committed