Containers: stx-openstack reapply gets stuck at applying armada-manifest with connection timeout error when retrieving release info

Bug #1817770 reported by Yang Liu
Affects: StarlingX
Status: Fix Released
Importance: Medium
Assigned to: John Kung

Bug Description

Brief Description
-----------------
When reapplying stx-openstack, the apply gets stuck at applying the armada-manifest due to a connection timeout when retrieving release info to determine whether a re-apply is needed.

Severity
--------
Minor

Steps to Reproduce
------------------
- Install and configure system
- Apply/reapply the stx-openstack application (see the example commands below)
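
For reference, the apply in this run was triggered through the system CLI (the full command, including authentication options, appears in the logs below). A minimal sequence to drive a reproduction attempt, assuming the application has already been uploaded, is:

    # Apply (or reapply) the stx-openstack application
    system application-apply stx-openstack

    # Watch the apply progress; when the issue hits, it stalls at the armada-manifest step
    system application-list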

Expected Behavior
------------------
- stx-openstack application is successfully applied

Actual Behavior
----------------
- The stx-openstack apply gets stuck at the armada-manifest step due to a connection timeout when retrieving release info to determine whether a re-apply is needed.

Reproducibility
---------------
Intermittent

System Configuration
--------------------
Multi-node system

Branch/Pull Time/Commit
-----------------------
f/stein as of 2019-02-25

Timestamp/Logs
--------------
[2019-02-26 16:20:37,367] 262 DEBUG MainThread ssh.send :: Send 'system --os-username 'admin' --os-password 'Li69nux*' --os-project-name admin --os-auth-url http://192.168.204.2:5000/v3 --os-user-domain-name Default --os-project-domain-name Default --os-endpoint-type internalURL --os-region-name RegionOne application-apply stx-openstack'

2019-02-26 16:21:50.884 18 DEBUG armada.handlers.tiller [-] Getting known releases from Tiller... list_charts /usr/local/lib/python3.5/site-packages/armada/handlers/tiller.py:286
2019-02-26 16:21:50.885 18 DEBUG armada.handlers.tiller [-] Tiller ListReleases() with timeout=300 list_releases /usr/local/lib/python3.5/site-packages/armada/handlers/tiller.py:205
2019-02-26 16:26:50.886 18 ERROR armada.cli [-] Caught unexpected exception: grpc._channel._Rendezvous: <_Rendezvous of RPC that terminated with (StatusCode.DEADLINE_EXCEEDED, Deadline Exceeded)>
2019-02-26 16:26:50.886 18 ERROR armada.cli Traceback (most recent call last):
2019-02-26 16:26:50.886 18 ERROR armada.cli File "/usr/local/lib/python3.5/site-packages/armada/cli/__init__.py", line 39, in safe_invoke
2019-02-26 16:26:50.886 18 ERROR armada.cli self.invoke()
2019-02-26 16:26:50.886 18 ERROR armada.cli File "/usr/local/lib/python3.5/site-packages/armada/cli/apply.py", line 217, in invoke
2019-02-26 16:26:50.886 18 ERROR armada.cli resp = armada.sync()
2019-02-26 16:26:50.886 18 ERROR armada.cli File "/usr/local/lib/python3.5/site-packages/armada/handlers/armada.py", line 234, in sync
2019-02-26 16:26:50.886 18 ERROR armada.cli deployed_releases, failed_releases = self._get_releases_by_status()
2019-02-26 16:26:50.886 18 ERROR armada.cli File "/usr/local/lib/python3.5/site-packages/armada/handlers/armada.py", line 200, in _get_releases_by_status
2019-02-26 16:26:50.886 18 ERROR armada.cli known_releases = self.tiller.list_charts()
2019-02-26 16:26:50.886 18 ERROR armada.cli File "/usr/local/lib/python3.5/site-packages/armada/handlers/tiller.py", line 288, in list_charts
2019-02-26 16:26:50.886 18 ERROR armada.cli for latest_release in self.list_releases():
2019-02-26 16:26:50.886 18 ERROR armada.cli File "/usr/local/lib/python3.5/site-packages/armada/handlers/tiller.py", line 209, in list_releases
2019-02-26 16:26:50.886 18 ERROR armada.cli for y in release_list:
2019-02-26 16:26:50.886 18 ERROR armada.cli File "/usr/local/lib/python3.5/site-packages/grpc/_channel.py", line 347, in _next_
2019-02-26 16:26:50.886 18 ERROR armada.cli return self._next()
2019-02-26 16:26:50.886 18 ERROR armada.cli File "/usr/local/lib/python3.5/site-packages/grpc/_channel.py", line 341, in _next
2019-02-26 16:26:50.886 18 ERROR armada.cli raise self
2019-02-26 16:26:50.886 18 ERROR armada.cli grpc._channel._Rendezvous: <_Rendezvous of RPC that terminated with (StatusCode.DEADLINE_EXCEEDED, Deadline Exceeded)>
2019-02-26 16:26:50.886 18 ERROR armada.cli
armada@f91232ac6052:~$

[storage/driver] 2019/02/26 16:37:18 list: failed to list: Get https://10.96.0.1:443/api/v1/namespaces/kube-system/configmaps?labelSelector=OWNER%3DTILLER: read tcp 172.16.0.112:49838->10.96.0.1:443: read: connection timed out
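
The failing request above is Tiller reading its release records (ConfigMaps labelled OWNER=TILLER in kube-system) from the API server. A couple of diagnostic commands to correlate this with where Tiller landed after the swact (a sketch; the exact tiller pod name and output depend on the Helm v2 deployment in this lab):

    # Which node is the tiller pod on, and is it the controller that was swacted away from?
    kubectl -n kube-system get pods -o wide | grep tiller

    # The same ConfigMap query that is timing out in the log above, issued through kubectl
    kubectl -n kube-system get configmaps -l OWNER=TILLER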

Ghada Khalil (gkhalil)
tags: added: stx.containers
Ghada Khalil (gkhalil)
Changed in starlingx:
importance: Undecided → Medium
assignee: nobody → Tee Ngo (teewrs)
status: New → Triaged
tags: added: stx.2019.05
Dariush Eslimi (deslimi)
Changed in starlingx:
assignee: Tee Ngo (teewrs) → John Kung (john-kung)
Ken Young (kenyis)
tags: added: stx.2.0
removed: stx.2019.05
Ghada Khalil (gkhalil)
tags: added: stx.retestneeded
Revision history for this message
John Kung (john-kung) wrote :

The following error occurs on the helm list attempt triggered after a host-swact away from the controller running the tiller pod:
(see: https://bugs.launchpad.net/starlingx/+bug/1817941)

"[storage/driver] 2019/02/26 16:37:18 list: failed to list: Get https://10.96.0.1:443/api/v1/namespaces/kube-system/configmaps?labelSelector=OWNER%3DTILLER: read tcp 172.16.0.112:49838->10.96.0.1:443: read: connection timed out"

Therefore, this bug can be tracked as a duplicate of bug 1817941.
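
For context, the same symptom can be exercised directly from the CLI (a sketch assuming a standard two-controller lab; substitute the controller that is currently hosting the tiller pod):

    # Swact away from the controller running the tiller pod
    system host-swact controller-0

    # From the now-active controller, list Helm releases; with the duplicate bug present this times out
    helm list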

Revision history for this message
Ghada Khalil (gkhalil) wrote :

Duplicate bug was fixed on 2019-05-07
https://review.opendev.org/657087

Marking as Fix Released

Changed in starlingx:
status: Triaged → Fix Released
Revision history for this message
Yang Liu (yliu12) wrote :

Test passed on the following load: 2019-06-03_18-34-53.
helm list worked and the reapply completed shortly after a swact.
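
The verification amounts to confirming that Tiller stays reachable and the application apply completes after a swact (a sketch of the checks; the exact status string is as reported by the system CLI in this lab):

    # Tiller should respond promptly rather than hitting the gRPC deadline
    helm list

    # stx-openstack should finish the reapply and report an applied status
    system application-list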

tags: removed: stx.retestneeded