Comment 2 for bug 1877582

Bob Church (rchurch) wrote :

The system was unlocked and the platform-managed applications tried to upload. The armada container attempts to start but fails. Docker reports it in an Exited status, yet it appears to still be running after the reboot (a containerd-shim process for it is present, see below), and we hit an address conflict: "bind: address already in use: unknown".
The workaround is to run the following, after which the apps auto-upload and auto-apply:
- docker rm 4435db3ceb9f
- system application-delete platform-integ-apps; system application-delete oidc-auth-apps
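
The two workaround steps above could be scripted roughly as follows. This is only a sketch under assumptions: the helper name `workaround_commands` is hypothetical, the container ID and application names are the ones from this report, and the commands are assembled and printed rather than executed, so they can be reviewed first.

```python
import shlex


def workaround_commands(container_id, apps):
    """Build the workaround as a list of argv lists (hypothetical helper).

    Nothing is executed here; the caller decides whether to run the commands.
    """
    # Step 1: remove the stale armada container.
    commands = [["docker", "rm", container_id]]
    # Step 2: delete each application so it gets re-uploaded and re-applied.
    for app in apps:
        commands.append(["system", "application-delete", app])
    return commands


if __name__ == "__main__":
    # Values taken from this report; adjust for the affected system.
    for argv in workaround_commands(
        "4435db3ceb9f", ["platform-integ-apps", "oidc-auth-apps"]
    ):
        print(" ".join(shlex.quote(arg) for arg in argv))
```

Printing instead of running keeps the sketch safe to try on a live controller; the printed lines can be pasted into a shell once they look right.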

$ sudo docker ps -a
Password:
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
4435db3ceb9f registry.local:9001/quay.io/airshipit/armada:8a1638098f88d92bf799ef4934abe569789b885e-ubuntu_bionic "./entrypoint.sh ser…" 3 hours ago Exited (128) 3 hours ago armada_service

root 88294 1 0 11:55 ? 00:00:00 containerd-shim -namespace moby -workdir /var/lib/docker/io.containerd.runtime.v1.linux/moby/4435db3ceb9f5e72676dfb60ca6285751a69f5306dad457a64c6c23a412eac64 -address /var/run/containerd/containerd.sock -containerd-binary /usr/bin/containerd -runtime-root /var/run/docker/runtime-runc

2020-05-08T11:49:20.231 + Host Info +--------------------------------------+
2020-05-08T11:49:20.231 | action : unlock
2020-05-08T11:49:20.231 | personality: controller
2020-05-08T11:49:20.231 | hostname : controller-0
2020-05-08T11:49:20.231 | task : none
2020-05-08T11:49:20.231 | info : none
2020-05-08T11:49:20.231 | ip : fd01:14::3
2020-05-08T11:49:20.231 | mac : 48:df:37:22:c5:f0
2020-05-08T11:49:20.231 | uuid : 25fed003-05ff-44c6-ac1e-88d67a8cc808
2020-05-08T11:49:20.232 | adminState: locked
2020-05-08T11:49:20.232 | operState: disabled
2020-05-08T11:49:20.232 | availStatus: online
2020-05-08T11:49:20.232 | bm ip : none
2020-05-08T11:49:20.232 | bm un : none
2020-05-08T11:49:20.232 | bm type : none
2020-05-08T11:49:20.232 | subFunction: controller,worker
2020-05-08T11:49:20.232 | operState: disabled
2020-05-08T11:49:20.232 | availStatus: online
2020-05-08T11:49:20.232 +------------+--------------------------------------+

2020-05-08T11:49:54.404 subcloud7 containerd[116275]: info time="2020-05-08T11:49:54.404297821Z" level=info msg="shim reaped" id=4435db3ceb9f5e72676dfb60ca6285751a69f5306dad457a64c6c23a412eac64
2020-05-08T11:49:54.433 subcloud7 systemd[1]: info Unmounted /var/lib/docker/containers/4435db3ceb9f5e72676dfb60ca6285751a69f5306dad457a64c6c23a412eac64/mounts/shm.

reboot system boot 3.10.0-1127.el7. Fri May 8 11:52 - 14:44 (02:51)

2020-05-08T11:55:55.709 controller-0 containerd[1962]: info time="2020-05-08T11:55:55.709515678Z" level=info msg="shim reaped" id=4435db3ceb9f5e72676dfb60ca6285751a69f5306dad457a64c6c23a412eac64
2020-05-08T11:55:56.452 controller-0 dockerd[88042]: info time="2020-05-08T11:55:56.451978430Z" level=error msg="4435db3ceb9f5e72676dfb60ca6285751a69f5306dad457a64c6c23a412eac64 cleanup: failed to delete container from containerd: no such container"
2020-05-08T11:55:56.452 controller-0 dockerd[88042]: info time="2020-05-08T11:55:56.452008031Z" level=error msg="Failed to start container 4435db3ceb9f5e72676dfb60ca6285751a69f5306dad457a64c6c23a412eac64: transport is closing: unavailable"
2020-05-08T11:57:44.573 controller-0 dockerd[92586]: info time="2020-05-08T11:57:44.573919610Z" level=error msg="4435db3ceb9f5e72676dfb60ca6285751a69f5306dad457a64c6c23a412eac64 cleanup: failed to delete container from containerd: no such container"
2020-05-08T11:57:44.574 controller-0 dockerd[92586]: info time="2020-05-08T11:57:44.573946897Z" level=error msg="Failed to start container 4435db3ceb9f5e72676dfb60ca6285751a69f5306dad457a64c6c23a412eac64: failed to listen to abstract unix socket \"/containerd-shim/589f4f63f03c1a4a7664785094eb95bbb49ec3d4fb950a96bb994057abe8e7fe.sock\": listen unix \x00/containerd-shim/589f4f63f03c1a4a7664785094eb95bbb49ec3d4fb950a96bb994057abe8e7fe.sock: bind: address already in use: unknown"

2020-05-08 11:58:26.686 99806 INFO sysinv.conductor.manager [-] sysinv-conductor start committed system={'system_mode': u'simplex', 'created_at': datetime.datetime(2020, 5, 8, 11, 21, 25, 446979, tzinfo=<iso8601.Utc>), 'uuid': u'd57062fe-6d45-40c5-8424-ca1be32a5c15', 'software_version': u'20.04', 'service_project_name': u'services', 'system_type': u'All-in-one', 'name': u'dc-subcloud7', 'description': None, 'location': None, 'updated_at': datetime.datetime(2020, 5, 8, 11, 46, 3, 340420, tzinfo=<iso8601.Utc>), 'capabilities': {u'https_enabled': False, u'vswitch_type': u'none', u'sdn_enabled': False, u'region_config': True, u'shared_services': u"['identity', ]"}, 'id': 1, 'contact': None, 'security_feature': u'nopti nospectre_v2 nospectre_v1', 'services': 72, 'timezone': u'UTC', 'security_profile': u'standard', 'distributed_cloud_role': u'subcloud', 'region_name': u'subcloud7'}

2020-05-08 11:59:29.046 99806 INFO sysinv.conductor.kube_app [-] Starting Armada service...
2020-05-08 11:59:29.046 99806 INFO sysinv.conductor.kube_app [-] kube_config=/opt/platform/armada/20.04/admin.conf, manifests_dir=/opt/platform/armada/20.04, overrides_dir=/opt/platform/helm/20.04, logs_dir=/var/log/armada.

2020-05-08T11:59:28.622 controller-0 dockerd[92586]: info time="2020-05-08T11:59:28.621964717Z" level=error msg="4435db3ceb9f5e72676dfb60ca6285751a69f5306dad457a64c6c23a412eac64 cleanup: failed to delete container from containerd: no such container"
2020-05-08T11:59:29.045 controller-0 dockerd[92586]: info time="2020-05-08T11:59:29.045135698Z" level=error msg="Handler for POST /v1.35/containers/4435db3ceb9f5e72676dfb60ca6285751a69f5306dad457a64c6c23a412eac64/restart returned error: Cannot restart container 4435db3ceb9f5e72676dfb60ca6285751a69f5306dad457a64c6c23a412eac64: failed to listen to abstract unix socket \"/containerd-shim/589f4f63f03c1a4a7664785094eb95bbb49ec3d4fb950a96bb994057abe8e7fe.sock\": listen unix \x00/containerd-shim/589f4f63f03c1a4a7664785094eb95bbb49ec3d4fb950a96bb994057abe8e7fe.sock: bind: address already in use: unknown"

2020-05-08 11:59:29.061 99806 ERROR sysinv.conductor.kube_app [-] Upload of application platform-integ-apps (1.0-8) failed: Failed to validate application manifest.: KubeAppUploadFailure: Upload of application platform-integ-apps (1.0-8) failed: Failed to validate application manifest.
2020-05-08 11:59:29.061 99806 ERROR sysinv.conductor.kube_app Traceback (most recent call last):
2020-05-08 11:59:29.061 99806 ERROR sysinv.conductor.kube_app File "/usr/lib64/python2.7/site-packages/sysinv/conductor/kube_app.py", line 1928, in perform_app_upload
2020-05-08 11:59:29.061 99806 ERROR sysinv.conductor.kube_app reason="Failed to validate application manifest.")
2020-05-08 11:59:29.061 99806 ERROR sysinv.conductor.kube_app KubeAppUploadFailure: Upload of application platform-integ-apps (1.0-8) failed: Failed to validate application manifest.
2020-05-08 11:59:29.061 99806 ERROR sysinv.conductor.kube_app

2020-05-08T11:59:29.842 controller-0 dockerd[92586]: info time="2020-05-08T11:59:29.842122462Z" level=error msg="4435db3ceb9f5e72676dfb60ca6285751a69f5306dad457a64c6c23a412eac64 cleanup: failed to delete container from containerd: no such container"
2020-05-08T11:59:29.953 controller-0 dockerd[92586]: info time="2020-05-08T11:59:29.953118513Z" level=error msg="Handler for POST /v1.35/containers/4435db3ceb9f5e72676dfb60ca6285751a69f5306dad457a64c6c23a412eac64/restart returned error: Cannot restart container 4435db3ceb9f5e72676dfb60ca6285751a69f5306dad457a64c6c23a412eac64: failed to listen to abstract unix socket \"/containerd-shim/589f4f63f03c1a4a7664785094eb95bbb49ec3d4fb950a96bb994057abe8e7fe.sock\": listen unix \x00/containerd-shim/589f4f63f03c1a4a7664785094eb95bbb49ec3d4fb950a96bb994057abe8e7fe.sock: bind: address already in use: unknown"
2020-05-08 11:59:29.954 99806 INFO sysinv.conductor.kube_app [-] Starting Armada service...
2020-05-08 11:59:29.954 99806 INFO sysinv.conductor.kube_app [-] kube_config=/opt/platform/armada/20.04/admin.conf, manifests_dir=/opt/platform/armada/20.04, overrides_dir=/opt/platform/helm/20.04, logs_dir=/var/log/armada.
2020-05-08 11:59:29.975 99806 ERROR sysinv.conductor.kube_app [-] Upload of application oidc-auth-apps (1.0-0) failed: Failed to validate application manifest.: KubeAppUploadFailure: Upload of application oidc-auth-apps (1.0-0) failed: Failed to validate application manifest.
2020-05-08 11:59:29.975 99806 ERROR sysinv.conductor.kube_app Traceback (most recent call last):
2020-05-08 11:59:29.975 99806 ERROR sysinv.conductor.kube_app File "/usr/lib64/python2.7/site-packages/sysinv/conductor/kube_app.py", line 1928, in perform_app_upload
2020-05-08 11:59:29.975 99806 ERROR sysinv.conductor.kube_app reason="Failed to validate application manifest.")
2020-05-08 11:59:29.975 99806 ERROR sysinv.conductor.kube_app KubeAppUploadFailure: Upload of application oidc-auth-apps (1.0-0) failed: Failed to validate application manifest.
2020-05-08 11:59:29.975 99806 ERROR sysinv.conductor.kube_app
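
To check whether another system hit the same condition, the dockerd errors above can be matched on the shim socket bind failure. A rough sketch, assuming the log line format shown in this report; the function name `stuck_shim_containers` is hypothetical and the pattern may need adjusting for other Docker versions.

```python
import re

# Matches the dockerd error seen in this report: a container start or restart
# failing because the containerd-shim abstract unix socket is already bound.
_BIND_FAILURE = re.compile(
    r"(?:Failed to start|Cannot restart) container (?P<cid>[0-9a-f]{12,64}): "
    r"failed to listen to abstract unix socket .*bind: address already in use"
)


def stuck_shim_containers(lines):
    """Return the short (12-character) IDs of containers hit by the bind failure."""
    ids = set()
    for line in lines:
        match = _BIND_FAILURE.search(line)
        if match:
            ids.add(match.group("cid")[:12])
    return ids
```

The short IDs returned are in the same form that `docker ps -a` prints, so they can be compared directly against its CONTAINER ID column.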