All-in-one Simplex: During application-upload, stx-openstack is stuck.

Bug #1828899 reported by Maria Guadalupe Perez Ibara
This bug affects 1 person
Affects: StarlingX
Status: Fix Released
Importance: Critical
Assigned to: Daniel Badea
Milestone: (none)

Bug Description

Brief Description
-----------------
During application-upload, stx-openstack is stuck, not moving forward.

Severity
--------
Major

Steps to Reproduce
------------------
system application-list

Expected Behavior
------------------
Applied

Actual Behavior
----------------
+---------------------+-----------------------------+-------------------------------+---------------+-----------+-----------+
| application         | version                     | manifest name                 | manifest file | status    | progress  |
+---------------------+-----------------------------+-------------------------------+---------------+-----------+-----------+
| platform-integ-apps | 1.0-4                       | platform-integration-manifest | manifest.yaml | applied   | completed |
| stx-openstack       | 1.0-13-centos-stable-latest | armada-manifest               | manifest.yaml | uploading | None      |
+---------------------+-----------------------------+-------------------------------+---------------+-----------+-----------+
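
For anyone scripting this check during sanity runs, the table above can be parsed to flag applications still in the "uploading" state. The following is only a minimal sketch: the function names are hypothetical, the parser assumes the pipe-delimited layout shown in this report, and a single "uploading" reading does not by itself prove the upload is stuck (a caller would need to poll over time).

```python
# Sample output copied from this bug report (`system application-list`).
SAMPLE = """\
+---------------------+-----------------------------+-------------------------------+---------------+-----------+-----------+
| application | version | manifest name | manifest file | status | progress |
+---------------------+-----------------------------+-------------------------------+---------------+-----------+-----------+
| platform-integ-apps | 1.0-4 | platform-integration-manifest | manifest.yaml | applied | completed |
| stx-openstack | 1.0-13-centos-stable-latest | armada-manifest | manifest.yaml | uploading | None |
+---------------------+-----------------------------+-------------------------------+---------------+-----------+-----------+
"""


def parse_application_list(text):
    """Parse the pipe-delimited table into a list of dicts (hypothetical helper)."""
    header = None
    rows = []
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and the +-----+ border lines.
        if not line.startswith("|"):
            continue
        cells = [cell.strip() for cell in line.strip("|").split("|")]
        if header is None:
            header = cells  # first pipe-delimited line holds the column names
        else:
            rows.append(dict(zip(header, cells)))
    return rows


def uploading_applications(rows):
    """Return the names of applications whose status is still 'uploading'."""
    return [row["application"] for row in rows if row["status"] == "uploading"]


if __name__ == "__main__":
    print(uploading_applications(parse_application_list(SAMPLE)))
```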

Reproducibility
---------------
100% reproducible.

System Configuration
--------------------
Baremetal - Simplex

Branch/Pull Time/Commit
-----------------------
OS="centos"
SW_VERSION="19.01"
BUILD_TARGET="Host Installer"
BUILD_TYPE="Formal"
BUILD_ID="20190512T233000Z"

JOB="STX_build_master_master"
<email address hidden>"
BUILD_NUMBER="99"
BUILD_HOST="starlingx_mirror"
BUILD_DATE="2019-05-12 23:30:00 +0000"

Timestamp/Logs
--------------
Got this error message (related to sysinv.conductor.manager) in /var/log/sysinv.log:

2019-05-13 12:10:30.841 100153 INFO sysinv.conductor.manager [req-874696ac-e5f5-4b6b-888c-d9f31b438fb4 admin None] SYS_I Clear system config alarm: controller-0 target config 37cf9a0c-40b2-4663-aa69-1f7c5d3f109b
2019-05-13 12:15:31.438 100153 INFO sysinv.conductor.manager [req-97574225-d359-406e-82f1-60b1d3913a5b None None] Attempting to create new lldp neighbour {'host_id': 1, 'msap': u'02:04:96:a1:ca:aa,1:43', 'port_id': 3} on host 1
2019-05-13 12:15:31.633 100153 INFO sysinv.conductor.manager [req-97574225-d359-406e-82f1-60b1d3913a5b None None] Attempting to create new LLDP agent {'status': u'rx=enabled,tx=enabled', 'host_id': 1, 'port_id': 2} on host 1
2019-05-13 12:15:31.965 100153 INFO sysinv.conductor.manager [req-97574225-d359-406e-82f1-60b1d3913a5b None None] Attempting to create new LLDP agent {'status': u'rx=enabled,tx=enabled', 'host_id': 1, 'port_id': 3} on host 1
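
Note that every line in the excerpt above is logged at INFO level. When triaging, it can help to split sysinv.log entries by level so that WARNING or ERROR entries stand out. The sketch below is an assumption-laden illustration: the regular expression is inferred only from the lines quoted in this report, not from any sysinv documentation, and the helper names are hypothetical.

```python
import re

# Pattern inferred from the excerpt in this report:
# "<date> <time> <pid> <LEVEL> <module> [<request context>] <message>"
LOG_LINE = re.compile(
    r"^(?P<timestamp>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d+) "
    r"(?P<pid>\d+) (?P<level>[A-Z]+) (?P<module>\S+) "
    r"\[(?P<context>[^\]]*)\] (?P<message>.*)$"
)

# Two lines copied from the excerpt in this bug report.
SAMPLE_LINES = [
    "2019-05-13 12:10:30.841 100153 INFO sysinv.conductor.manager "
    "[req-874696ac-e5f5-4b6b-888c-d9f31b438fb4 admin None] "
    "SYS_I Clear system config alarm: controller-0 target config "
    "37cf9a0c-40b2-4663-aa69-1f7c5d3f109b",
    "2019-05-13 12:15:31.633 100153 INFO sysinv.conductor.manager "
    "[req-97574225-d359-406e-82f1-60b1d3913a5b None None] "
    "Attempting to create new LLDP agent "
    "{'status': u'rx=enabled,tx=enabled', 'host_id': 1, 'port_id': 2} on host 1",
]


def parse_line(line):
    """Return the fields of one log line as a dict, or None if it does not match."""
    match = LOG_LINE.match(line)
    return match.groupdict() if match else None


def count_levels(lines):
    """Count how many parsed lines appear at each log level."""
    counts = {}
    for line in lines:
        record = parse_line(line)
        if record is not None:
            counts[record["level"]] = counts.get(record["level"], 0) + 1
    return counts


if __name__ == "__main__":
    print(count_levels(SAMPLE_LINES))
```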

Two pods are in error status; this seems to be unrelated to the above error.

Test Activity
-------------
Sanity

Maria Guadalupe Perez Ibara (maria-gp) wrote :
summary: - During application-upload, stx-openstack is stuck, not moving forward
+ During application-upload, stx-openstack is stuck.
Ghada Khalil (gkhalil) wrote :

Changing the bug title. Based on the sanity details, this issue was encountered on All-in-one simplex baremetal systems only. The issue was not encountered on Duplex or multi-node.

summary: - During application-upload, stx-openstack is stuck.
+ All-in-one Simplex: During application-upload, stx-openstack is stuck.
tags: added: stx.containers
Ghada Khalil (gkhalil) wrote :

@Maria, when was the last time this test case passed on simplex? Which load was used?

Changed in starlingx:
status: New → Incomplete
Ghada Khalil (gkhalil) wrote :

As per Bob Church, this is related to the ceph mgr-restful-plugin synchronization issue. This issue is also reported in https://bugs.launchpad.net/starlingx/+bug/1827521

tags: added: stx.2.0 stx.distro.other stx.storage
removed: stx.containers
Changed in starlingx:
importance: Undecided → Critical
status: Incomplete → Triaged
Ghada Khalil (gkhalil) wrote :

Marking as release gating; critical priority as this issue is blocking simplex sanity. Issue is related to ceph upversion.

Ghada Khalil (gkhalil) wrote :

This is the same root-cause as https://bugs.launchpad.net/starlingx/+bug/1827521 - marking as duplicate.

The fix was cherry-picked and merged in starlingx by pull request:
https://github.com/starlingx-staging/stx-ceph/pull/30

The fix should be available in the May 15 loads.

Changed in starlingx:
assignee: nobody → Daniel Badea (daniel.badea)
Ghada Khalil (gkhalil)
Changed in starlingx:
status: Triaged → Fix Released
Ghada Khalil (gkhalil)
tags: added: stx.sanity