Local docker registry access violation seen

Bug #1830436 reported by Nimalini Rasa
This bug affects 1 person
Affects     Status         Importance   Assigned to   Milestone
StarlingX   Fix Released   High         Tee Ngo

Bug Description

Brief Description
-----------------
While platform-integ-apps was being applied, it failed for the following reason:

 Image 192.168.204.2:9001/quay.io/external_storage/rbd-provisioner:v2.1.1-k8s1.11 download started from local registry
2019-05-24 19:01:10.151 105755 INFO sysinv.conductor.kube_app [-] Image 192.168.204.2:9001/docker.io/port/ceph-config-helper:v1.10.3 download started from local registry
2019-05-24 19:01:10.181 105755 ERROR sysinv.conductor.kube_app [-] Image 192.168.204.2:9001/quay.io/external_storage/rbd-provisioner:v2.1.1-k8s1.11 download failed from local registry: 500 Server Error: Internal Server Error ("Get https://192.168.204.2:9001/v2/: Access violation")
2019-05-24 19:01:10.212 105755 ERROR sysinv.conductor.kube_app [-] Image 192.168.204.2:9001/docker.io/port/ceph-config-helper:v1.10.3 download failed from local registry: 500 Server Error: Internal Server Error ("Get https://192.168.204.2:9001/v2/: Access violation")
2019-05-24 19:01:10.212 105755 ERROR sysinv.conductor.kube_app [-] Deployment of application platform-integ-apps (1.0-5) failed: failed to download one or more image(s).
2019-05-24 19:01:10.212 105755 TRACE sysinv.conductor.kube_app Traceback (most recent call last):
2019-05-24 19:01:10.212 105755 TRACE sysinv.conductor.kube_app File "/usr/lib64/python2.7/site-packages/sysinv/conductor/kube_app.py", line 1212, in perform_app_apply
2019-05-24 19:01:10.212 105755 TRACE sysinv.conductor.kube_app self._download_images(app)
2019-05-24 19:01:10.212 105755 TRACE sysinv.conductor.kube_app File "/usr/lib64/python2.7/site-packages/sysinv/conductor/kube_app.py", line 546, in _download_images
2019-05-24 19:01:10.212 105755 TRACE sysinv.conductor.kube_app reason="failed to download one or more image(s).")
2019-05-24 19:01:10.212 105755 TRACE sysinv.conductor.kube_app KubeAppApplyFailure: Deployment of application platform-integ-apps (1.0-5) failed: failed to download one or more image(s).
2019-05-24 19:01:10.212 105755 TRACE sysinv.conductor.kube_app

The system is configured with docker_proxy

Severity
--------
Major

Steps to Reproduce
------------------
Bring up a standard system with docker proxy

Expected Behavior
------------------
The system should come up without errors.

Actual Behavior
----------------
platform-integ-apps failed

Reproducibility
---------------
seen once

System Configuration
--------------------
2+10 with docker proxy

Branch/Pull Time/Commit
-----------------------
Private build from 2019-05-23 pull

Last Pass
---------
Not known

Timestamp/Logs
--------------
2019-05-24 19:01:10.181

Test Activity
-------------
Platform Testing

Revision history for this message
Nimalini Rasa (nrasa) wrote :
Jerry Sun (jerry-sun-u) wrote :

The Docker registry on the controller seems to return "access violation" when accessed through the docker client. Sysinv's commands for listing and deleting images/tags, which interact with the registry through its REST API, seem to work. Interacting with the registry REST API through curl also works.

I removed the proxy file from the proxy config (/etc/systemd/system/docker.service.d/http-proxy), which allowed me to interact with the registry through docker commands. I removed the "https" proxy line in the same config, and can still interact with the registry through docker commands.

I spoke with Bart, and he suggests the lab should be installed with only one of the http or https proxies set, which aligns with the results of my experiments with the config. Please try reinstalling the lab with only one of the http or https proxies.

Numan Waheed (nwaheed)
tags: added: stx.retestneeded
Ghada Khalil (gkhalil) wrote :

Nimalini, as per the above, please retest with either the http or the https proxy.

tags: added: stx.containers
Changed in starlingx:
status: New → Incomplete
importance: Undecided → Medium
assignee: nobody → Nimalini Rasa (nrasa)
importance: Medium → High
Ghada Khalil (gkhalil) wrote :

Recommend testing with the https proxy, since it is more secure and more likely to be used in real-life deployments.

Jerry Sun (jerry-sun-u) wrote :

A new lab with the HTTPS proxy still saw the same issue. I was able to work around it by restarting docker: systemctl restart docker.service
I did not have to change any config this time. After restarting Docker, I was able to use docker login on the local registry and push images to it.

Ghada Khalil (gkhalil)
Changed in starlingx:
assignee: Nimalini Rasa (nrasa) → Bart Wensley (bartwensley)
Changed in starlingx:
assignee: Bart Wensley (bartwensley) → Tee Ngo (teewrs)
Tee Ngo (teewrs) wrote :

Please retest with the http proxy using the latest load. For now, please provide a docker_no_proxy list in the override file. NO_PROXY is the list of IPs that docker should reach directly rather than through the proxy, one of which is the local registry on the controller, 192.168.204.2. This will allow images to be pushed to the local registry successfully.

Please hold the https proxy test. The original implementation is flawed (confirmed by the feature developer). I will update this LP when the fix for https proxy is available.
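
The effect of a NO_PROXY list on the local registry address can be illustrated roughly as follows. This is a simplified sketch in Python, not Docker's actual matching logic (the real client also handles domain suffixes, ports, and CIDR ranges). The function name bypasses_proxy and the exact-string comparison are assumptions made for the example.

```python
def bypasses_proxy(host, no_proxy):
    """Return True if host is listed verbatim in a comma-separated NO_PROXY string.

    Simplified on purpose: exact string match only, no suffix or CIDR handling.
    """
    entries = [entry.strip() for entry in no_proxy.split(",") if entry.strip()]
    return host in entries


# Example list containing the local registry address from this report.
no_proxy = "localhost,127.0.0.1,192.168.204.2"

print(bypasses_proxy("192.168.204.2", no_proxy))  # True: registry is reached directly
print(bypasses_proxy("quay.io", no_proxy))        # False: external pulls use the proxy
print(bypasses_proxy("192.168.204.2", ""))        # False: empty list exempts nothing
```

With an empty list, requests to 192.168.204.2:9001 would be sent to the proxy, which is consistent with the failure reported above.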

Nimalini Rasa (nrasa) wrote :

The http proxy worked fine with docker_no_proxy explicitly added to the ansible conf file.

Ghada Khalil (gkhalil)
tags: added: stx.2.0
Changed in starlingx:
status: Incomplete → In Progress
Tee Ngo (teewrs) wrote :

For duplex, add the following docker_no_proxy list to localhost.yml as a workaround:
docker_no_proxy:
   - localhost
   - 127.0.0.1
   - 192.168.204.2
   - 192.168.204.3
   - 192.168.204.4
   - 10.10.10.2
   - 10.10.10.3
   - 10.10.10.4

For simplex:
docker_no_proxy:
   - localhost
   - 127.0.0.1
   - 192.168.204.2
   - 192.168.204.3
   - 10.10.10.2
   - 10.10.10.3

OpenStack Infra (hudson-openstack) wrote : Fix merged to config (master)

Reviewed: https://review.opendev.org/661825
Committed: https://git.openstack.org/cgit/starlingx/config/commit/?id=8cdbd3048752eef152d02325139822afbf62b12c
Submitter: Zuul
Branch: master

commit 8cdbd3048752eef152d02325139822afbf62b12c
Author: Tee Ngo <email address hidden>
Date: Tue May 28 14:11:43 2019 -0400

    Use derived docker_no_proxy values if not provided

    Currently, if the user does not specify the docker_no_proxy list in
    the override file, NO_PROXY parameter is empty in http-proxy.conf
    file.

    In this commit, the NO_PROXY parameter is set using the combined
    docker_no_proxy list, which is a merged list of values derived
    from network config and user provided values if there are any.

    Closes-Bug: 1830436
    Change-Id: Ia72ce95bfbf5d06e14e8d2aa18c048f3c344b768
    Signed-off-by: Tee Ngo <email address hidden>
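
The merge described in the commit message could look roughly like the sketch below. It is an illustration only, not the code from the actual change: the function names merge_no_proxy and render_no_proxy, the ordering (derived values first), and the de-duplication are all assumptions, and the addresses are taken from the workaround lists earlier in this report.

```python
def merge_no_proxy(derived, user_provided=None):
    """Combine derived and user-supplied entries, keeping order and dropping duplicates."""
    merged = []
    for entry in list(derived) + list(user_provided or []):
        if entry not in merged:
            merged.append(entry)
    return merged


def render_no_proxy(entries):
    """Format the entries as a NO_PROXY assignment (comma-separated)."""
    return "NO_PROXY=" + ",".join(entries)


# Hypothetical inputs: values derived from network config plus a user override.
derived = ["localhost", "127.0.0.1", "192.168.204.2", "192.168.204.3"]
user = ["10.10.10.2", "192.168.204.2"]

print(render_no_proxy(merge_no_proxy(derived, user)))
# NO_PROXY=localhost,127.0.0.1,192.168.204.2,192.168.204.3,10.10.10.2

# With no user-provided list, the derived values alone are used.
print(render_no_proxy(merge_no_proxy(derived)))
# NO_PROXY=localhost,127.0.0.1,192.168.204.2,192.168.204.3
```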

Changed in starlingx:
status: In Progress → Fix Released
Peng Peng (ppeng) wrote :

Issue was not seen on load 2019-05-31_20-02-45

tags: removed: stx.retestneeded