All-in-one Simplex: During application-upload, stx-openstack is stuck.

Bug #1828899 reported by Maria Guadalupe Perez Ibara
This bug affects 1 person
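+-----------+--------------+------------+--------------+-----------+
| Affects   | Status       | Importance | Assigned to  | Milestone |
+-----------+--------------+------------+--------------+-----------+
| StarlingX | Fix Released | Critical   | Daniel Badea |           |
+-----------+--------------+------------+--------------+-----------+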

Bug Description

Brief Description
-----------------
During application-upload, stx-openstack is stuck, not moving forward.

Severity
--------
Major

Steps to Reproduce
------------------
system application-list

Expected Behavior
------------------
Applied

Actual Behavior
----------------
+---------------------+-----------------------------+-------------------------------+---------------+-----------+-----------+
| application | version | manifest name | manifest file | status | progress |
+---------------------+-----------------------------+-------------------------------+---------------+-----------+-----------+
| platform-integ-apps | 1.0-4 | platform-integration-manifest | manifest.yaml | applied | completed |
| stx-openstack | 1.0-13-centos-stable-latest | armada-manifest | manifest.yaml | uploading | None |
+---------------------+-----------------------------+-------------------------------+---------------+-----------+-----------+
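
A stuck upload like this can also be spotted programmatically. The following is a minimal Python sketch, not part of the original report, that parses the table text printed by `system application-list` and lists every application whose status is not `applied`. The helper names `parse_table` and `not_applied` are hypothetical and introduced only for illustration; the sample text is the output pasted above.

```python
# Sample text copied from the "Actual Behavior" output in this report.
SAMPLE = """\
+---------------------+-----------------------------+-------------------------------+---------------+-----------+-----------+
| application | version | manifest name | manifest file | status | progress |
+---------------------+-----------------------------+-------------------------------+---------------+-----------+-----------+
| platform-integ-apps | 1.0-4 | platform-integration-manifest | manifest.yaml | applied | completed |
| stx-openstack | 1.0-13-centos-stable-latest | armada-manifest | manifest.yaml | uploading | None |
+---------------------+-----------------------------+-------------------------------+---------------+-----------+-----------+
"""


def parse_table(text):
    """Parse an ASCII table (hypothetical helper) into a list of dicts."""
    header = None
    rows = []
    for line in text.splitlines():
        line = line.strip()
        # Border lines (+----+----+) and blank lines carry no data.
        if not line.startswith("|"):
            continue
        cells = [cell.strip() for cell in line.strip("|").split("|")]
        if header is None:
            header = cells  # first data line holds the column names
        else:
            rows.append(dict(zip(header, cells)))
    return rows


def not_applied(rows):
    """Return (application, status) pairs whose status is not 'applied'."""
    return [(r["application"], r["status"]) for r in rows if r["status"] != "applied"]


rows = parse_table(SAMPLE)
stuck = not_applied(rows)
print(stuck)
```

Run against the sample, this reports only stx-openstack, which is still in the `uploading` state. On a live system the same text could be captured from the command output and polled until the status changes or a timeout expires.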

Reproducibility
---------------
100% reproducible.

System Configuration
--------------------
Baremetal - Simplex

Branch/Pull Time/Commit
-----------------------
OS="centos"
SW_VERSION="19.01"
BUILD_TARGET="Host Installer"
BUILD_TYPE="Formal"
BUILD_ID="20190512T233000Z"

JOB="STX_build_master_master"
BUILD_BY="<email address hidden>"
BUILD_NUMBER="99"
BUILD_HOST="starlingx_mirror"
BUILD_DATE="2019-05-12 23:30:00 +0000"

Timestamp/Logs
--------------
Found the following messages (from sysinv.conductor.manager) in /var/log/sysinv.log:

2019-05-13 12:10:30.841 100153 INFO sysinv.conductor.manager [req-874696ac-e5f5-4b6b-888c-d9f31b438fb4 admin None] SYS_I Clear system config alarm: controller-0 target config 37cf9a0c-40b2-4663-aa69-1f7c5d3f109b
2019-05-13 12:15:31.438 100153 INFO sysinv.conductor.manager [req-97574225-d359-406e-82f1-60b1d3913a5b None None] Attempting to create new lldp neighbour {'host_id': 1, 'msap': u'02:04:96:a1:ca:aa,1:43', 'port_id': 3} on host 1
2019-05-13 12:15:31.633 100153 INFO sysinv.conductor.manager [req-97574225-d359-406e-82f1-60b1d3913a5b None None] Attempting to create new LLDP agent {'status': u'rx=enabled,tx=enabled', 'host_id': 1, 'port_id': 2} on host 1
2019-05-13 12:15:31.965 100153 INFO sysinv.conductor.manager [req-97574225-d359-406e-82f1-60b1d3913a5b None None] Attempting to create new LLDP agent {'status': u'rx=enabled,tx=enabled', 'host_id': 1, 'port_id': 3} on host 1
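
When triaging an excerpt like this, it helps to check the severity of each line first. Below is a minimal Python sketch, assuming the line layout visible in the excerpt (timestamp, PID, level, logger name, then the message). The regular expression and the helper name `parse_line` are illustrative assumptions and are not part of sysinv.

```python
import re

# Assumed layout: "<date> <time.ms> <pid> <LEVEL> <logger> <message>".
LINE_RE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{3}) "
    r"(?P<pid>\d+) (?P<level>[A-Z]+) (?P<logger>\S+) (?P<msg>.*)$"
)


def parse_line(line):
    """Split one log line into its fields, or return None if it does not match."""
    match = LINE_RE.match(line)
    return match.groupdict() if match else None


# Two lines copied from the excerpt in this report.
sample = [
    "2019-05-13 12:10:30.841 100153 INFO sysinv.conductor.manager "
    "[req-874696ac-e5f5-4b6b-888c-d9f31b438fb4 admin None] SYS_I Clear system "
    "config alarm: controller-0 target config 37cf9a0c-40b2-4663-aa69-1f7c5d3f109b",
    "2019-05-13 12:15:31.438 100153 INFO sysinv.conductor.manager "
    "[req-97574225-d359-406e-82f1-60b1d3913a5b None None] Attempting to create new "
    "lldp neighbour {'host_id': 1, 'msap': u'02:04:96:a1:ca:aa,1:43', 'port_id': 3} on host 1",
]

parsed = [parse_line(line) for line in sample]
levels = [entry["level"] for entry in parsed]
# Keep only entries logged at ERROR or CRITICAL.
errors = [entry for entry in parsed if entry["level"] in ("ERROR", "CRITICAL")]
print(levels, len(errors))
```

Both sample lines parse as INFO entries from `sysinv.conductor.manager`, so the filter finds no ERROR or CRITICAL entries among them.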

Two pods are in error status; this seems to be unrelated to the log messages above.

Test Activity
-------------
Sanity

Maria Guadalupe Perez Ibara (maria-gp) wrote :
summary: - During application-upload, stx-openstack is stuck, not moving forward
+ During application-upload, stx-openstack is stuck.
Ghada Khalil (gkhalil) wrote :

Changing the bug title. Based on the sanity details, this issue was encountered on All-in-one simplex baremetal systems only. The issue was not encountered on Duplex or multi-node.

summary: - During application-upload, stx-openstack is stuck.
+ All-in-one Simplex: During application-upload, stx-openstack is stuck.
tags: added: stx.containers
Ghada Khalil (gkhalil) wrote :

@Maria, when was the last time this test case passed on simplex? Which load was used?

Changed in starlingx:
status: New → Incomplete
Ghada Khalil (gkhalil) wrote :

As per Bob Church, this is related to the ceph mgr-restful-plugin synchronization issue. This issue is also reported in https://bugs.launchpad.net/starlingx/+bug/1827521

tags: added: stx.2.0 stx.distro.other stx.storage
removed: stx.containers
Changed in starlingx:
importance: Undecided → Critical
status: Incomplete → Triaged
Ghada Khalil (gkhalil) wrote :

Marking as release gating with critical priority, as this issue is blocking simplex sanity. The issue is related to the ceph upversion.

Ghada Khalil (gkhalil) wrote :

This is the same root-cause as https://bugs.launchpad.net/starlingx/+bug/1827521 - marking as duplicate.

The fix was cherry-picked and merged in starlingx by pull request:
https://github.com/starlingx-staging/stx-ceph/pull/30

The fix should be available in May 15 loads.

Changed in starlingx:
assignee: nobody → Daniel Badea (daniel.badea)
Ghada Khalil (gkhalil)
Changed in starlingx:
status: Triaged → Fix Released
Ghada Khalil (gkhalil)
tags: added: stx.sanity