host status is degraded after host swact due to drbd-extension

Bug #1796124 reported by Peng Peng
Affects: StarlingX
Status: Invalid
Importance: Medium
Assigned to: Erich Cordoba

Bug Description

Brief Description
-----------------
About 6 minutes after the host swact, the host could be connected to again, but it was in degraded status.

Severity
--------
Major

Steps to Reproduce
------------------
Run host-swact (the log below uses: system host-swact controller-1).

Expected Behavior
------------------
The swact succeeds and both hosts are in available status.

Actual Behavior
----------------
The host is in degraded status after the swact.

Reproducibility
---------------
Intermittent

System Configuration
--------------------
Two-node system

Branch/Pull Time/Commit
-----------------------
stx.2018.10 release branch build as of 2018-10-03_09-47-10

Timestamp/Logs
--------------
[2018-10-04 12:12:16,184] 262 DEBUG MainThread ssh.send :: Send 'system --os-username 'admin' --os-password 'Li69nux*' --os-project-name admin --os-auth-url http://192.168.204.2:5000/v3 --os-user-domain-name Default --os-project-domain-name Default --os-region-name RegionOne host-swact controller-1'

[2018-10-04 12:18:03,952] 767 INFO MainThread ssh.close :: connection closed. host: 128.224.150.254, user: wrsroot. Object ID: 139932073271368
[2018-10-04 12:18:03,952] 222 DEBUG MainThread ssh.connect :: Retry in 3 seconds
[2018-10-04 12:18:06,955] 138 INFO MainThread ssh.connect :: Attempt to connect to host - 128.224.150.254
[2018-10-04 12:18:08,326] 262 DEBUG MainThread ssh.send :: Send ''
[2018-10-04 12:18:08,443] 382 DEBUG MainThread ssh.expect :: Output:
controller-0:~$
[2018-10-04 12:18:08,443] 161 INFO MainThread ssh.connect :: Login successful!

[2018-10-04 12:18:26,341] 262 DEBUG MainThread ssh.send :: Send 'openstack --os-username 'admin' --os-password 'Li69nux*' --os-project-name admin --os-auth-url http://192.168.204.2:5000/v3 --os-user-domain-name Default --os-project-domain-name Default --os-identity-api-version 3 --os-region-name RegionOne role assignment list --names --user tenant2 --role admin --project tenant2'
[2018-10-04 12:18:27,928] 382 DEBUG MainThread ssh.expect :: Output:
Failed to discover available identity versions when contacting http://192.168.204.2:5000/v3. Attempting to parse version from URL.
Unable to establish connection to http://192.168.204.2:5000/v3/auth/tokens: HTTPConnectionPool(host='192.168.204.2', port=5000): Max retries exceeded with url: /v3/auth/tokens (Caused by NewConnectionError('<requests.packages.urllib3.connection.HTTPConnection object at 0x7f168b6c9e10>: Failed to establish a new connection: [Errno 111] Connection refused',))
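
One possible reading of the error above is that the command was sent before the identity service at 192.168.204.2:5000 was accepting connections again after the swact. A test harness could wait for the port before issuing further commands. The following is a minimal sketch of that idea, not part of the reported test framework; the function name `wait_for_port` and its default timeout and interval are assumptions chosen for illustration.

```python
import socket
import time


def wait_for_port(host: str, port: int, timeout: float = 300.0,
                  interval: float = 3.0) -> bool:
    """Poll until a TCP connection to host:port succeeds or timeout expires.

    Hypothetical helper: returns True as soon as one connection attempt
    succeeds, and False if the deadline passes without a success.
    """
    deadline = time.monotonic() + timeout
    while True:
        try:
            # A successful connect means something is listening on the port.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            # Covers "connection refused" (Errno 111) and connect timeouts.
            pass
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)


# Example call (address and port taken from the log above):
# wait_for_port("192.168.204.2", 5000)
```

A successful TCP connect only shows that the port is open; it does not prove that the service behind it is fully functional, so a real check would probably also issue an authenticated request.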

[2018-10-04 12:18:28,094] 262 DEBUG MainThread ssh.send :: Send 'fm --os-username 'admin' --os-password 'Li69nux*' --os-project-name admin --os-auth-url http://192.168.204.2:5000/v3 --os-user-domain-name Default --os-project-domain-name Default --os-region-name RegionOne alarm-list --nowrap --uuid'
[2018-10-04 12:18:59,251] 382 DEBUG MainThread ssh.expect :: Output:
+--------------------------------------+----------+-------------------------------------------------------------------------------------------------------+-------------------------------------------------------------------------------+----------+----------------------------+
| UUID | Alarm ID | Reason Text | Entity ID | Severity | Time Stamp |
+--------------------------------------+----------+-------------------------------------------------------------------------------------------------------+-------------------------------------------------------------------------------+----------+----------------------------+
| 82ca1379-6505-4e9c-93b8-6b72543c9b26 | 400.001 | Service group controller-services degraded; drbd-extension(enabled-active, degraded, data-standalone) | service_domain=controller.service_group=controller-services.host=controller-0 | major | 2018-10-04T12:18:47.987614 |
| 9fbfdd5b-c9fa-40ab-bf0b-6f1a567659ba | 400.002 | Service group cloud-services has no active members available; expected 1 active member | service_domain=controller.service_group=cloud-services | critical | 2018-10-04T12:18:20.120716 |
+--------------------------------------+----------+-------------------------------------------------------------------------------------------------------+-------------------------------------------------------------------------------+----------+----------------------------+
controller-0:~$
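
The alarm table above can be checked automatically for degraded service groups. The sketch below parses pipe-delimited output of this shape and picks out alarm ID 400.001, which in this log carries the "Service group ... degraded" reason text. The function names `parse_alarm_table` and `degraded_services` are hypothetical, the sample borders are shortened, and the parser assumes that no cell contains a literal `|`.

```python
from typing import Dict, List, Tuple


def parse_alarm_table(output: str) -> List[Dict[str, str]]:
    """Turn an ASCII table (as printed by 'fm alarm-list') into row dicts."""
    rows: List[Dict[str, str]] = []
    header = None
    for line in output.splitlines():
        line = line.strip()
        # Skip borders (+----+), shell prompts, and blank lines.
        if not line.startswith("|"):
            continue
        cells = [cell.strip() for cell in line.strip("|").split("|")]
        if header is None:
            header = cells  # the first pipe-delimited row holds column names
        else:
            rows.append(dict(zip(header, cells)))
    return rows


def degraded_services(alarms: List[Dict[str, str]]) -> List[Tuple[str, str]]:
    """Return (entity, reason) pairs for alarms with ID 400.001."""
    return [(a["Entity ID"], a["Reason Text"])
            for a in alarms if a["Alarm ID"] == "400.001"]


# Rows copied from the log above; border lines shortened for readability.
SAMPLE = """\
+------+----------+-------------+-----------+----------+------------+
| UUID | Alarm ID | Reason Text | Entity ID | Severity | Time Stamp |
+------+----------+-------------+-----------+----------+------------+
| 82ca1379-6505-4e9c-93b8-6b72543c9b26 | 400.001 | Service group controller-services degraded; drbd-extension(enabled-active, degraded, data-standalone) | service_domain=controller.service_group=controller-services.host=controller-0 | major | 2018-10-04T12:18:47.987614 |
| 9fbfdd5b-c9fa-40ab-bf0b-6f1a567659ba | 400.002 | Service group cloud-services has no active members available; expected 1 active member | service_domain=controller.service_group=cloud-services | critical | 2018-10-04T12:18:20.120716 |
+------+----------+-------------+-----------+----------+------------+
controller-0:~$
"""

for entity, reason in degraded_services(parse_alarm_table(SAMPLE)):
    print(entity, "->", reason)
```

Filtering on the alarm ID rather than on the reason text keeps the check independent of which service (here drbd-extension) caused the degrade.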

Ghada Khalil (gkhalil) wrote :

Assigning to Kevin Smith to triage

Changed in starlingx:
assignee: nobody → Kevin Smith (kevin.smith.wrs)
Ghada Khalil (gkhalil) wrote :

Targeting stx.2019.03 as the issue is intermittent

tags: added: stx.2019.03 stx.distro.other
Ghada Khalil (gkhalil)
Changed in starlingx:
importance: Undecided → Medium
status: New → Triaged
Ghada Khalil (gkhalil)
summary: - STX: host status is degraded after host swact
+ host status is degraded after host swact
Ghada Khalil (gkhalil)
summary: - host status is degraded after host swact
+ host status is degraded after host swact due to drbd-extension
Changed in starlingx:
assignee: Kevin Smith (kevin.smith.wrs) → Bruce Jones (brucej)
Bruce Jones (brucej)
Changed in starlingx:
assignee: Bruce Jones (brucej) → Cesar Lara (clara1)
Ken Young (kenyis)
tags: added: stx.2019.05
removed: stx.2019.03
Changed in starlingx:
assignee: Cesar Lara (clara1) → Victor Manuel Rodriguez Bahena (vm-rod25)
Cindy Xie (xxie1)
Changed in starlingx:
assignee: Victor Manuel Rodriguez Bahena (vm-rod25) → Cesar Lara (clara1)
Cesar Lara (clara1)
Changed in starlingx:
assignee: Cesar Lara (clara1) → Erich Cordoba (ericho)
Erich Cordoba (ericho) wrote :

I will try to reproduce this by setting up a duplex system and performing a host-swact.

Ken Young (kenyis)
tags: added: stx.2.0
removed: stx.2019.05
Ghada Khalil (gkhalil)
tags: added: stx.retestneeded
Cindy Xie (xxie1) wrote :

@peng, can you retest and see if this still reproduces in the latest build?

Cindy Xie (xxie1) wrote :

If you see the bug again in the latest ISO, please reopen it.

Changed in starlingx:
status: Triaged → Invalid
Peng Peng (ppeng) wrote :

The issue could not be reproduced.

tags: removed: stx.retestneeded