applying custom system application on controller-1 failed

Bug #1883555 reported by Difu Hu
This bug affects 1 person
Affects: StarlingX
Status: Won't Fix
Importance: Medium
Assigned to: Dan Voiculeasa

Bug Description

Brief Description
-----------------
Applying the hello-kitty application on controller-1 failed with the armada error "Name resolution failure".

Severity
--------
Major

Steps to Reproduce
------------------
apply hello-kitty on controller-1

Expected Behavior
------------------
hello-kitty is applied

Actual Behavior
----------------
hello-kitty is in apply-failed status

Reproducibility
---------------
Intermittent - happened 3/3 times on one specific system

System Configuration
--------------------
Lab-name: DC-3 controller

Branch/Pull Time/Commit
-----------------------
2020-06-12_20-00-00

Last Pass
---------
2020-06-05_20-00-00

Timestamp/Logs
--------------
+-------------+---------+---------------+---------------+--------------+-----------------------------------+
| application | version | manifest name | manifest file | status       | progress                          |
+-------------+---------+---------------+---------------+--------------+-----------------------------------+
| hello-kitty | 1.16    | hello-kitty   | manifest.yaml | apply-failed | operation aborted, check logs for |
|             |         |               |               |              | detail                            |
+-------------+---------+---------------+---------------+--------------+-----------------------------------+

2020-06-14 18:03:49.504 46 DEBUG armada.handlers.tiller [-] Tiller ListReleases() with timeout=300, request=limit: 32
status_codes: UNKNOWN
status_codes: DEPLOYED
status_codes: DELETED
status_codes: DELETING
status_codes: FAILED
status_codes: PENDING_INSTALL
status_codes: PENDING_UPGRADE
status_codes: PENDING_ROLLBACK
 get_results /usr/local/lib/python3.6/dist-packages/armada/handlers/tiller.py:215
2020-06-14 18:04:00.504 46 INFO armada.handlers.lock [-] Releasing lock
2020-06-14 18:04:00.508 46 ERROR armada.cli [-] Caught unexpected exception: grpc._channel._Rendezvous: <_Rendezvous of RPC that terminated with:
 status = StatusCode.UNAVAILABLE
 details = "Name resolution failure"
 debug_error_string = "{"created":"@1592157839.546890865","description":"Failed to create subchannel","file":"src/core/ext/filters/client_channel/client_channel.cc","file_line":2721,"referenced_errors":[{"created":"@1592157839.546887131","description":"Name resolution failure","file":"src/core/ext/filters/client_channel/client_channel.cc","file_line":3026,"grpc_status":14}]}"
>
2020-06-14 18:04:00.508 46 ERROR armada.cli Traceback (most recent call last):
2020-06-14 18:04:00.508 46 ERROR armada.cli File "/usr/local/lib/python3.6/dist-packages/armada/cli/__init__.py", line 38, in safe_invoke
2020-06-14 18:04:00.508 46 ERROR armada.cli self.invoke()
2020-06-14 18:04:00.508 46 ERROR armada.cli File "/usr/local/lib/python3.6/dist-packages/armada/cli/apply.py", line 213, in invoke
2020-06-14 18:04:00.508 46 ERROR armada.cli resp = self.handle(documents, tiller)
2020-06-14 18:04:00.508 46 ERROR armada.cli File "/usr/local/lib/python3.6/dist-packages/armada/handlers/lock.py", line 81, in func_wrapper
2020-06-14 18:04:00.508 46 ERROR armada.cli return future.result()
2020-06-14 18:04:00.508 46 ERROR armada.cli File "/usr/lib/python3.6/concurrent/futures/_base.py", line 425, in result
2020-06-14 18:04:00.508 46 ERROR armada.cli return self.__get_result()
2020-06-14 18:04:00.508 46 ERROR armada.cli File "/usr/lib/python3.6/concurrent/futures/_base.py", line 384, in __get_result
2020-06-14 18:04:00.508 46 ERROR armada.cli raise self._exception
2020-06-14 18:04:00.508 46 ERROR armada.cli File "/usr/lib/python3.6/concurrent/futures/thread.py", line 56, in run
2020-06-14 18:04:00.508 46 ERROR armada.cli result = self.fn(*self.args, **self.kwargs)
2020-06-14 18:04:00.508 46 ERROR armada.cli File "/usr/local/lib/python3.6/dist-packages/armada/cli/apply.py", line 256, in handle
2020-06-14 18:04:00.508 46 ERROR armada.cli return armada.sync()
2020-06-14 18:04:00.508 46 ERROR armada.cli File "/usr/local/lib/python3.6/dist-packages/armada/handlers/armada.py", line 189, in sync
2020-06-14 18:04:00.508 46 ERROR armada.cli known_releases = self.tiller.list_releases()
2020-06-14 18:04:00.508 46 ERROR armada.cli File "/usr/local/lib/python3.6/dist-packages/armada/handlers/tiller.py", line 252, in list_releases
2020-06-14 18:04:00.508 46 ERROR armada.cli releases = get_results()
2020-06-14 18:04:00.508 46 ERROR armada.cli File "/usr/local/lib/python3.6/dist-packages/armada/handlers/tiller.py", line 220, in get_results
2020-06-14 18:04:00.508 46 ERROR armada.cli for message in response:
2020-06-14 18:04:00.508 46 ERROR armada.cli File "/usr/local/lib/python3.6/dist-packages/grpc/_channel.py", line 364, in __next__
2020-06-14 18:04:00.508 46 ERROR armada.cli return self._next()
2020-06-14 18:04:00.508 46 ERROR armada.cli File "/usr/local/lib/python3.6/dist-packages/grpc/_channel.py", line 358, in _next
2020-06-14 18:04:00.508 46 ERROR armada.cli raise self
2020-06-14 18:04:00.508 46 ERROR armada.cli grpc._channel._Rendezvous: <_Rendezvous of RPC that terminated with:
2020-06-14 18:04:00.508 46 ERROR armada.cli status = StatusCode.UNAVAILABLE
2020-06-14 18:04:00.508 46 ERROR armada.cli details = "Name resolution failure"
2020-06-14 18:04:00.508 46 ERROR armada.cli debug_error_string = "{"created":"@1592157839.546890865","description":"Failed to create subchannel","file":"src/core/ext/filters/client_channel/client_channel.cc","file_line":2721,"referenced_errors":[{"created":"@1592157839.546887131","description":"Name resolution failure","file":"src/core/ext/filters/client_channel/client_channel.cc","file_line":3026,"grpc_status":14}]}"
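
The debug_error_string in the trace above is a JSON document, so it can be decoded to line the gRPC failure up with the surrounding log entries. The following is a minimal sketch using only the Python standard library; the helper name summarize is illustrative and is not part of armada or grpc. The "created" fields are treated as Unix epoch seconds, and grpc_status 14 corresponds to StatusCode.UNAVAILABLE.

```python
import json
from datetime import datetime, timezone

# Payload copied verbatim from the armada log above.
DEBUG_ERROR_STRING = (
    '{"created":"@1592157839.546890865",'
    '"description":"Failed to create subchannel",'
    '"file":"src/core/ext/filters/client_channel/client_channel.cc",'
    '"file_line":2721,'
    '"referenced_errors":[{"created":"@1592157839.546887131",'
    '"description":"Name resolution failure",'
    '"file":"src/core/ext/filters/client_channel/client_channel.cc",'
    '"file_line":3026,"grpc_status":14}]}'
)


def summarize(debug_error_string):
    """Return (UTC time, description, grpc_status) for the top-level
    error and each entry in its referenced_errors list."""
    err = json.loads(debug_error_string)
    rows = []
    for entry in [err] + err.get("referenced_errors", []):
        # "created" looks like "@<epoch seconds>"; drop the leading "@".
        ts = float(entry["created"].lstrip("@"))
        when = datetime.fromtimestamp(ts, tz=timezone.utc)
        rows.append((when.strftime("%Y-%m-%d %H:%M:%S"),
                     entry["description"],
                     entry.get("grpc_status")))
    return rows


for row in summarize(DEBUG_ERROR_STRING):
    print(row)
```

Both entries decode to 2020-06-14 18:03:59 UTC. Assuming the armada log timestamps are also in UTC, that places the name resolution failure about ten seconds after the ListReleases() request logged at 18:03:49 and one second before the exception was reported at 18:04:00.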

Test Activity
-------------
Regression Testing

Revision history for this message
Difu Hu (difuhu) wrote :

This seems to have a similar root cause to https://bugs.launchpad.net/starlingx/+bug/1882485

Revision history for this message
Ghada Khalil (gkhalil) wrote :

stx.4.0 / medium priority - issue seems to be reproducible if controller-1 is active

tags: added: stx.apps stx.containers
Changed in starlingx:
importance: Undecided → Medium
assignee: nobody → Dan Voiculeasa (dvoicule)
status: New → Triaged
tags: added: stx.4.0
Yang Liu (yliu12)
summary: - applying hello-kitty on controller-1 failed
+ applying custom system application on controller-1 failed
Revision history for this message
Nimalini Rasa (nrasa) wrote :

Also seen in a DC lab for oidc-auth-app: the app went to the apply-failed state after a swact, with the following error (name resolution):
get_results /usr/local/lib/python3.6/dist-packages/armada/handlers/tiller.py:215
2020-06-17 14:15:37.255 16 INFO armada.handlers.lock [-] Releasing lock
2020-06-17 14:15:37.259 16 ERROR armada.cli [-] Caught unexpected exception: grpc._channel._Rendezvous: <_Rendezvous of RPC that terminated with:
        status = StatusCode.UNAVAILABLE
        details = "Name resolution failure"
        debug_error_string = "{"created":"@1592403336.274257011","description":"Failed to create subchannel","file":"src/core/ext/filters/client_channel/client_channel.cc","file_line":2721,"referenced_errors":[{"created":"@1592403336.274254598","description":"Name resolution failure","file":"src/core/ext/filters/client_channel/client_channel.cc","file_line":3026,"grpc_status":14}]}"
>
2020-06-17 14:15:37.259 16 ERROR armada.cli Traceback (most recent call last):
2020-06-17 14:15:37.259 16 ERROR armada.cli File "/usr/local/lib/python3.6/dist-packages/armada/cli/__init__.py", line 38, in safe_invoke
2020-06-17 14:15:37.259 16 ERROR armada.cli self.invoke()
2020-06-17 14:15:37.259 16 ERROR armada.cli File "/usr/local/lib/python3.6/dist-packages/armada/cli/apply.py", line 213, in invoke
2020-06-17 14:15:37.259 16 ERROR armada.cli resp = self.handle(documents, tiller)
2020-06-17 14:15:37.259 16 ERROR armada.cli File "/usr/local/lib/python3.6/dist-packages/armada/handlers/lock.py", line 81, in func_wrapper
2020-06-17 14:15:37.259 16 ERROR armada.cli return future.result()
2020-06-17 14:15:37.259 16 ERROR armada.cli File "/usr/lib/python3.6/concurrent/futures/_base.py", line 425, in result
2020-06-17 14:15:37.259 16 ERROR armada.cli return self.__get_result()
2020-06-17 14:15:37.259 16 ERROR armada.cli File "/usr/lib/python3.6/concurrent/futures/_base.py", line 384, in __get_result
2020-06-17 14:15:37.259 16 ERROR armada.cli raise self._exception
2020-06-17 14:15:37.259 16 ERROR armada.cli File "/usr/lib/python3.6/concurrent/futures/thread.py", line 56, in run
2020-06-17 14:15:37.259 16 ERROR armada.cli result = self.fn(*self.args, **self.kwargs)
2020-06-17 14:15:37.259 16 ERROR armada.cli File "/usr/local/lib/python3.6/dist-packages/armada/cli/apply.py", line 256, in handle
2020-06-17 14:15:37.259 16 ERROR armada.cli return armada.sync()
2020-06-17 14:15:37.259 16 ERROR armada.cli File "/usr/local/lib/python3.6/dist-packages/armada/handlers/armada.py", line 189, in sync
2020-06-17 14:15:37.259 16 ERROR armada.cli known_releases = self.tiller.list_releases()
2020-06-17 14:15:37.259 16 ERROR armada.cli File "/usr/local/lib/python3.6/dist-packages/armada/handlers/tiller.py", line 252, in list_releases
2020-06-17 14:15:37.259 16 ERROR armada.cli releases = get_results()
2020-06-17 14:15:37.259 16 ERROR armada.cli File "/usr/local/lib/python3.6/dist-packages/armada/handlers/tiller.py", line 220, in get_results
2020-06-17 14:15:37.259 16 ERROR armada.cli for message in response:
2020-06-17 14:15:37.259 16 ERROR armada.cli File "/usr/local/lib/python3.6/dist-packages/grpc/_channel.py", line 364, in __next__
2020-06-17 14:15:37.259 16 ERROR a...
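
Both traces end in the same "Name resolution failure", so one quick check is whether the host name the gRPC client is given actually resolves from the affected controller. The following is a minimal sketch using only the Python standard library. The helper name resolve is illustrative, and the report does not state which host name armada uses to reach Tiller, so that name has to be supplied by whoever runs the check.

```python
import socket


def resolve(host, port=None):
    """Return the sorted unique IP addresses for host,
    or an empty list if name resolution fails."""
    try:
        infos = socket.getaddrinfo(host, port, proto=socket.IPPROTO_TCP)
    except socket.gaierror:
        # The OS resolver could not map the name to an address.
        return []
    # Each entry is (family, type, proto, canonname, sockaddr);
    # sockaddr[0] is the IP address for both IPv4 and IPv6.
    return sorted({info[4][0] for info in infos})
```

Calling resolve() with the Tiller endpoint's host name on the active controller, before and after a swact, would show whether the name stops resolving when controller-1 becomes active. An empty list corresponds to the resolver error that gRPC surfaces as StatusCode.UNAVAILABLE in the traces.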


Revision history for this message
Frank Miller (sensfan22) wrote :

Moved the tag to stx.5.0, as this issue is with a customer app called hello-kitty, which is not part of the STX platform. If it turns out that all apps show a similar failure signature, we'll reconsider whether to port a fix back to stx.4.0.

tags: added: stx.5.0
removed: stx.4.0
Revision history for this message
Frank Miller (sensfan22) wrote :

This issue is no longer seen. If it starts to recur, please open a new LP with a recent load.

Changed in starlingx:
status: Triaged → Won't Fix
Ghada Khalil (gkhalil)
tags: removed: stx.retestneeded