Failed to pull calico/node:v3.6.1 image through proxy

Bug #1852739 reported by Nicolas Pochet
This bug affects 1 person
Affects                       Status        Importance  Assigned to  Milestone
Calico Charm                  Fix Released  Medium      Joseph Borg
Containerd Subordinate Charm  Fix Released  Medium      Joseph Borg

Bug Description

When trying to deploy CDK with Calico in an environment with proxy, the calico units are in error with the following status:
Unit                 Workload  Agent  Machine  Public address  Ports  Message
kubernetes-worker/0  waiting   idle   11       10.40.69.147           Waiting for cluster DNS.
  calico/0*          error     idle            10.40.69.147           hook failed: "install"

When running debug-log on that unit:
unit-calico-0: 12:23:42 ERROR unit.calico/0.juju-log Hook error:
Traceback (most recent call last):
  File "/var/lib/juju/agents/unit-calico-0/.venv/lib/python3.6/site-packages/charms/reactive/__init__.py", line 74, in main
    bus.dispatch(restricted=restricted_mode)
  File "/var/lib/juju/agents/unit-calico-0/.venv/lib/python3.6/site-packages/charms/reactive/bus.py", line 390, in dispatch
    _invoke(other_handlers)
  File "/var/lib/juju/agents/unit-calico-0/.venv/lib/python3.6/site-packages/charms/reactive/bus.py", line 359, in _invoke
    handler.invoke()
  File "/var/lib/juju/agents/unit-calico-0/.venv/lib/python3.6/site-packages/charms/reactive/bus.py", line 181, in invoke
    self._action(*args)
  File "/var/lib/juju/agents/unit-calico-0/charm/reactive/calico.py", line 595, in pull_calico_node_image
    CTL.pull(image)
  File "/var/lib/juju/agents/unit-calico-0/.venv/lib/python3.6/site-packages/conctl/containerd.py", line 118, in pull
    return self._exec(*args)
  File "/var/lib/juju/agents/unit-calico-0/.venv/lib/python3.6/site-packages/conctl/containerd.py", line 25, in _exec
    return super()._exec(*['ctr'] + list(args))
  File "/var/lib/juju/agents/unit-calico-0/.venv/lib/python3.6/site-packages/conctl/base.py", line 29, in _exec
    return sub_run(args, stdout=PIPE, stderr=PIPE, check=True)
  File "/usr/lib/python3.6/subprocess.py", line 438, in run
    output=stdout, stderr=stderr)
subprocess.CalledProcessError: Command '('ctr', 'image', 'pull', 'rocks.canonical.com:443/cdk/calico/node:v3.6.1')' returned non-zero exit status 1.

unit-calico-0: 12:23:42 DEBUG unit.calico/0.install Traceback (most recent call last):
unit-calico-0: 12:23:42 DEBUG unit.calico/0.install File "/var/lib/juju/agents/unit-calico-0/charm/hooks/install", line 22, in <module>
unit-calico-0: 12:23:42 DEBUG unit.calico/0.install main()
unit-calico-0: 12:23:42 DEBUG unit.calico/0.install File "/var/lib/juju/agents/unit-calico-0/.venv/lib/python3.6/site-packages/charms/reactive/__init__.py", line 74, in main
unit-calico-0: 12:23:42 DEBUG unit.calico/0.install bus.dispatch(restricted=restricted_mode)
unit-calico-0: 12:23:42 DEBUG unit.calico/0.install File "/var/lib/juju/agents/unit-calico-0/.venv/lib/python3.6/site-packages/charms/reactive/bus.py", line 390, in dispatch
unit-calico-0: 12:23:42 DEBUG unit.calico/0.install _invoke(other_handlers)
unit-calico-0: 12:23:42 DEBUG unit.calico/0.install File "/var/lib/juju/agents/unit-calico-0/.venv/lib/python3.6/site-packages/charms/reactive/bus.py", line 359, in _invoke
unit-calico-0: 12:23:42 DEBUG unit.calico/0.install handler.invoke()
unit-calico-0: 12:23:42 DEBUG unit.calico/0.install File "/var/lib/juju/agents/unit-calico-0/.venv/lib/python3.6/site-packages/charms/reactive/bus.py", line 181, in invoke
unit-calico-0: 12:23:42 DEBUG unit.calico/0.install self._action(*args)
unit-calico-0: 12:23:42 DEBUG unit.calico/0.install File "/var/lib/juju/agents/unit-calico-0/charm/reactive/calico.py", line 595, in pull_calico_node_image
unit-calico-0: 12:23:42 DEBUG unit.calico/0.install CTL.pull(image)
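
The traceback above shows only the exit status of ctr, because conctl captures stderr (stderr=PIPE) and the default CalledProcessError message does not include it. A minimal sketch of how the captured stderr could be surfaced, assuming a hypothetical helper named run_and_report (it is not part of conctl or the calico charm):

```python
import subprocess
import sys


def run_and_report(args):
    """Run a command with captured pipes and check=True, keeping stderr on failure.

    Hypothetical helper for illustration; it mirrors the subprocess.run call
    seen in the traceback but is not the conctl implementation.
    """
    try:
        result = subprocess.run(
            args, stdout=subprocess.PIPE, stderr=subprocess.PIPE, check=True
        )
        return result.stdout.decode()
    except subprocess.CalledProcessError as err:
        # err.stderr holds whatever the child wrote to stderr before exiting.
        detail = err.stderr.decode(errors="replace").strip()
        raise RuntimeError(
            "{} failed with exit status {}: {}".format(
                args[0], err.returncode, detail
            )
        ) from err


if __name__ == "__main__":
    # Stand-in for a failing 'ctr image pull'; uses the Python interpreter so
    # the sketch runs on machines without containerd installed.
    failing = [sys.executable, "-c",
               "import sys; sys.stderr.write('dial tcp: i/o timeout'); sys.exit(1)"]
    try:
        run_and_report(failing)
    except RuntimeError as exc:
        print(exc)
```

With something like this in place, the "i/o timeout" reported by ctr would appear in the hook error rather than only the non-zero exit status.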

The bundle contains, as a config for the containerd charm, the necessary proxy options:

containerd:
    charm: cs:~containers/containerd
    options:
      http_proxy: 'http://MAAS_VIP:8000'
      https_proxy: 'http://MAAS_VIP:8000'
      no_proxy: '.customerdomain'

When ssh'ing into the unit, I can see that the proxy.conf file for the containerd service is properly configured:

cat /etc/systemd/system/containerd.service.d/proxy.conf
[Service]
Environment="HTTP_PROXY=http://MAAS_VIP:8000/" "HTTPS_PROXY=http://MAAS_VIP:8000/" "NO_PROXY=.customerdomain"
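
For reference, the variables in this drop-in are set for the containerd service itself; a ctr process started separately gets its environment from whatever launches it. A rough sketch of reading the same values out of the drop-in, assuming shlex quoting is close enough to systemd's for this simple line (parse_dropin_environment is a hypothetical name):

```python
import shlex


def parse_dropin_environment(text):
    """Collect KEY=VALUE pairs from Environment= lines of a systemd drop-in.

    Assumption: the values use plain double quotes, so POSIX-style splitting
    via shlex is adequate. systemd's own quoting rules are not fully covered.
    """
    env = {}
    for line in text.splitlines():
        line = line.strip()
        if not line.startswith("Environment="):
            continue
        for assignment in shlex.split(line[len("Environment="):]):
            key, _, value = assignment.partition("=")
            env[key] = value
    return env


# Content copied from the proxy.conf shown in this report.
PROXY_CONF = '''[Service]
Environment="HTTP_PROXY=http://MAAS_VIP:8000/" "HTTPS_PROXY=http://MAAS_VIP:8000/" "NO_PROXY=.customerdomain"
'''

if __name__ == "__main__":
    for name, value in sorted(parse_dropin_environment(PROXY_CONF).items()):
        print(name, value)
```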

If I try to use ctr to pull the image, it fails with:

sudo ctr --debug image pull rocks.canonical.com:443/cdk/calico/node:v3.6.1
DEBU[0000] fetching image="rocks.canonical.com:443/cdk/calico/node:v3.6.1"
DEBU[0000] resolving
DEBU[0000] do request request.headers=map[Accept:[application/vnd.docker.distribution.manifest.v2+json, application/vnd.docker.distribution.manifest.list.v2+json, application/vnd.oci.image.manifest.v1+json, application/vnd.oci.image.index.v1+json, *]] request.method=HEAD url="https://rocks.canonical.com:443/v2/cdk/calico/node/manifests/v3.6.1"
ctr: failed to resolve reference "rocks.canonical.com:443/cdk/calico/node:v3.6.1": failed to do request: Head https://rocks.canonical.com:443/v2/cdk/calico/node/manifests/v3.6.1: dial tcp 162.213.33.224:443: i/o timeout

When I set the environment variables as root, pulling the images with ctr succeeds:

sudo -i
root@juju-9ed549-kubernetes-11:~# export HTTPS_PROXY=http://MAAS_VIP:8000
root@juju-9ed549-kubernetes-11:~# ctr --debug image pull rocks.canonical.com:443/cdk/calico/node:v3.6.1
DEBU[0000] fetching image="rocks.canonical.com:443/cdk/calico/node:v3.6.1"
DEBU[0000] resolving
DEBU[0000] do request request.headers=map[Accept:[application/vnd.docker.distribution.manifest.v2+json, application/vnd.docker.distribution.manifest.list.v2+json, application/vnd.oci.image.manifest.v1+json, application/vnd.oci.image.index.v1+json, *]] request.method=HEAD url="https://rocks.canonical.com:443/v2/cdk/calico/node/manifests/v3.6.1"
DEBU[0000] fetch response received response.headers=map[Etag:["sha256:8483357e5e8226f9bcf340106ec50294dfa162bf56e95ea3fba5bc21de8e114f"] X-Content-Type-Options:[nosniff] Date:[Fri, 15 Nov 2019 12:32:04 GMT] Set-Cookie:[SRVNAME=; Expires=Thu, 01-Jan-1970 00:00:01 GMT; path=/] Content-Length:[1054] Content-Type:[application/vnd.docker.distribution.manifest.list.v2+json] Docker-Content-Digest:[sha256:8483357e5e8226f9bcf340106ec50294dfa162bf56e95ea3fba5bc21de8e114f] Docker-Distribution-Api-Version:[registry/2.0]] status="200 OK" url="https://rocks.canonical.com:443/v2/cdk/calico/node/manifests/v3.6.1"
DEBU[0000] resolved desc.digest=sha256:8483357e5e8226f9bcf340106ec50294dfa162bf56e95ea3fba5bc21de8e114f
...
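
The visible difference between the failing and the successful pull is that HTTPS_PROXY is set in the environment of the process that launches ctr. A minimal sketch of doing the same from Python, assuming hypothetical helpers with_proxy and pull_image (this is an illustration, not the fix that was released for the charms):

```python
import os
import subprocess


def with_proxy(base_env, https_proxy, http_proxy=None, no_proxy=None):
    """Return a copy of base_env with the given proxy variables added.

    Only values that are set are written, so existing entries are kept when
    an argument is None or empty.
    """
    env = dict(base_env)
    settings = {
        "HTTPS_PROXY": https_proxy,
        "HTTP_PROXY": http_proxy,
        "NO_PROXY": no_proxy,
    }
    for key, value in settings.items():
        if value:
            env[key] = value
    return env


def pull_image(image, env):
    """Run 'ctr image pull' with an explicit environment (requires ctr and root)."""
    return subprocess.run(
        ["ctr", "image", "pull", image],
        env=env,
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        check=True,
    )


if __name__ == "__main__":
    # Proxy URL and image name are the ones quoted in this report.
    env = with_proxy(os.environ, "http://MAAS_VIP:8000")
    pull_image("rocks.canonical.com:443/cdk/calico/node:v3.6.1", env)
```

Passing env explicitly keeps the proxy settings scoped to the one child process instead of exporting them for the whole session.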

Nicolas Pochet (npochet) wrote :

Subscribed field-critical as it is blocking a customer deployment

Joseph Borg (joeborg)
Changed in charm-containerd:
assignee: nobody → Joseph Borg (joeborg)
importance: Undecided → Critical
status: New → In Progress
Joseph Borg (joeborg) wrote :

Spoke with Nicolas on IRC. The workaround, for now, is to attach a private Docker registry. As the longer-term fix, I will work on a patch that lets the calico charm read juju-http-proxy, juju-https-proxy and juju-no-proxy.
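
A rough sketch of what reading those model settings could look like inside a hook. It assumes Juju exposes juju-http-proxy, juju-https-proxy and juju-no-proxy to the hook environment as JUJU_CHARM_HTTP_PROXY, JUJU_CHARM_HTTPS_PROXY and JUJU_CHARM_NO_PROXY; that mapping is not stated in this report, and proxy_from_hook_env is a hypothetical name, not the patch that was eventually written:

```python
import os

# Assumed mapping from Juju hook variables to the conventional proxy names.
JUJU_TO_STANDARD = {
    "JUJU_CHARM_HTTP_PROXY": "HTTP_PROXY",
    "JUJU_CHARM_HTTPS_PROXY": "HTTPS_PROXY",
    "JUJU_CHARM_NO_PROXY": "NO_PROXY",
}


def proxy_from_hook_env(environ):
    """Translate Juju proxy variables into standard ones, skipping unset or empty values."""
    return {
        standard: environ[juju]
        for juju, standard in JUJU_TO_STANDARD.items()
        if environ.get(juju)
    }


if __name__ == "__main__":
    # Merge into a copy of the current environment before launching a child process.
    child_env = dict(os.environ)
    child_env.update(proxy_from_hook_env(os.environ))
    print(sorted(k for k in child_env if k.endswith("_PROXY")))
```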

Joseph Borg (joeborg) wrote :

On reflection, it would be easier to just attach the calico-node-image to the deployed charm.

Nicolas Pochet (npochet) wrote :

The charm suggested by joeborg is working.
It is necessary to pull and save the image:
docker pull rocks.canonical.com:443/cdk/calico/node:v3.6.1
docker save rocks.canonical.com:443/cdk/calico/node:v3.6.1 | gzip > calico-node.tar.gz

And attach it as a resource:
  calico:
    charm: cs:~containers/calico
    options:
      cidr: *calico-cidr
      ipip: 'Always'
    resources:
      calico-node-image: '../images/calico-node.tar.gz'

On the deployment, it was also necessary to remove the calico application and re-apply the bundle.
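
Before attaching the archive, it may be worth confirming that the file really is a gzip-compressed image tarball. A small sketch, assuming the archive produced by docker save contains a top-level manifest.json (true for the Docker versions I am aware of, but not something this report confirms); looks_like_saved_image is a hypothetical helper:

```python
import tarfile


def looks_like_saved_image(path):
    """Return True if path is a gzip-compressed tar that contains manifest.json.

    Returns False for missing files, non-gzip data, or truncated archives.
    """
    try:
        with tarfile.open(path, mode="r:gz") as archive:
            return "manifest.json" in archive.getnames()
    except (tarfile.TarError, OSError, EOFError):
        return False


if __name__ == "__main__":
    # Path taken from the bundle snippet in this comment.
    print(looks_like_saved_image("../images/calico-node.tar.gz"))
```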

Nicolas Pochet (npochet) wrote :

Reducing the severity to field-medium as the work-around given by joeborg is working.

Joseph Borg (joeborg) wrote :

Thanks for the update Nicolas, glad to hear it's working.

Changed in charm-containerd:
importance: Critical → Medium
Joseph Borg (joeborg)
Changed in charm-containerd:
status: In Progress → Fix Released
Changed in charm-calico:
status: New → Fix Released
Changed in charm-containerd:
status: Fix Released → Fix Committed
Changed in charm-calico:
status: Fix Released → Fix Committed
importance: Undecided → Medium
assignee: nobody → Joseph Borg (joeborg)
Changed in charm-calico:
milestone: none → 1.17
Changed in charm-containerd:
milestone: none → 1.17
Changed in charm-calico:
status: Fix Committed → Fix Released
Changed in charm-containerd:
status: Fix Committed → Fix Released