validation.py::test_dashboard - TimeoutError: Unable to reach dashboard

Bug #1863986 reported by John George
This bug affects 1 person
Affects: Charmed Kubernetes Testing
Status: Incomplete
Importance: Undecided
Assigned to: Joseph Borg
Milestone: (none)

Bug Description

The test_dashboard test from https://github.com/charmed-kubernetes/jenkins fails.
Solutions QA artifacts can be found here:
https://solutions.qa.canonical.com/#/qa/testRun/56e2ba4d-7f51-43d7-869f-150acfb293dd

The console log from a manual recreate is attached.
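
Roughly, the failing check amounts to fetching the dashboard login page through the API server proxy and expecting a 200 before the test times out. The curl below is only a sketch of that, not the actual validation.py code; the master IP and the admin password are placeholders:

$ curl -k -u admin:"$ADMIN_PASSWORD" --max-time 60 -o /dev/null -w '%{http_code}\n' \
    'https://<master-ip>:443/api/v1/namespaces/kubernetes-dashboard/services/https:kubernetes-dashboard:/proxy/'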

John George (jog) wrote:
description: updated
John George (jog) wrote:

The dashboard is running:

$ /snap/bin/kubectl get po --kubeconfig=./generated/kubernetes/kube.conf -n kubernetes-dashboard
NAME READY STATUS RESTARTS AGE
dashboard-metrics-scraper-cc4c766b4-5st8p 1/1 Running 0 21h
kubernetes-dashboard-7cf54d76b5-r8k8g 1/1 Running 0 21h

It's returning a 401:
wget --no-check-certificate 'https://10.0.1.15:443/api/v1/namespaces/kubernetes-dashboard/services/https:kubernetes-dashboard:/proxy/#!/login'
--2020-02-20 18:06:58-- https://10.0.1.15/api/v1/namespaces/kubernetes-dashboard/services/https:kubernetes-dashboard:/proxy/
Connecting to 10.0.1.15:443... connected.
WARNING: cannot verify 10.0.1.15's certificate, issued by ‘CN=Vault Root Certificate Authority (charm-pki-local)’:
  Unable to locally verify the issuer's authority.
HTTP request sent, awaiting response... 401 Unauthorized

Kevin W Monroe (kwmonroe) wrote:

Comment #2 is returning a 401 because there isn't any auth data in the wget invocation. Try it with --user admin --password $your_password, and I think you'll get a 200 response.
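
For example (the password placeholder stands in for the deployment's actual admin credential):

$ wget --no-check-certificate --user admin --password "$your_password" \
    'https://10.0.1.15:443/api/v1/namespaces/kubernetes-dashboard/services/https:kubernetes-dashboard:/proxy/#!/login'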

Based on the attachment in comment #1, I see 503 being returned:

DEBUG urllib3.connectionpool:connectionpool.py:396 https://10.0.1.15:443 "GET /api/v1/namespaces/kubernetes-dashboard/services/https:kubernetes-dashboard:/proxy/ HTTP/1.1" 503 72

Is there any way to see the load on the system when this failed? Perhaps the system was busy and we reached the test timeout before the dashboard was ready.
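
If the environment is still up (or for future runs), load around the failure window could be captured with something like the commands below; the application name assumes a standard CDK deployment, and kubectl top needs metrics to be available:

$ juju run --application kubernetes-worker uptime
$ kubectl top nodes --kubeconfig=./generated/kubernetes/kube.conf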

John George (jog) wrote:

I've manually tried to verify that the dashboard is available, following the steps described, with the kubernetes-calico bundle deployed. As seen below, a connection to the Calico CIDR IP assigned to the kubernetes-dashboard-7cf54d76b5-r8k8g pod times out.

ubuntu@anorith:~$ kubectl proxy
Starting to serve on 127.0.0.1:8001
^Z
[1]+ Stopped kubectl proxy
ubuntu@anorith:~$ bg
[1]+ kubectl proxy &

ubuntu@anorith:~$ nc -nz 127.0.0.1 8001
ubuntu@anorith:~$ echo $?
0

ubuntu@anorith:~$ curl 'http://localhost:8001/api/v1/namespaces/kubernetes-dashboard/services/https:kubernetes-dashboard:/proxy/#!/login'
Error trying to reach service: 'dial tcp 172.16.1.199:8443: i/o timeout'

ubuntu@anorith:~$ kubectl describe po -n kubernetes-dashboard
Name: dashboard-metrics-scraper-cc4c766b4-5st8p
Namespace: kubernetes-dashboard
Priority: 0
Node: juju-cd5864-kubernetes-9/172.16.0.185
Start Time: Wed, 19 Feb 2020 20:46:50 +0000
Labels: k8s-app=dashboard-metrics-scraper
              pod-template-hash=cc4c766b4
Annotations: <none>
Status: Running
IP: 172.16.1.201
IPs:
  IP: 172.16.1.201
Controlled By: ReplicaSet/dashboard-metrics-scraper-cc4c766b4
Containers:
  dashboard-metrics-scraper:
    Container ID: containerd://810e53b4beb6dc605bd9f52f3fb2e8281515db7cea1fb1400565cd8aecd44e7e
    Image: rocks.canonical.com:443/cdk/kubernetesui/metrics-scraper:v1.0.1
    Image ID: rocks.canonical.com:443/cdk/kubernetesui/metrics-scraper@sha256:3b1cb436dbc2c02aabd7d29e3d9b3f8b4dfc1eb50dbcc63640213ef1139235dd
    Port: 8000/TCP
    Host Port: 0/TCP
    State: Running
      Started: Wed, 19 Feb 2020 20:48:31 +0000
    Ready: True
    Restart Count: 0
    Liveness: http-get http://:8000/ delay=30s timeout=30s period=10s #success=1 #failure=3
    Environment: <none>
    Mounts:
      /tmp from tmp-volume (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from kubernetes-dashboard-token-qjphl (ro)
Conditions:
  Type Status
  Initialized True
  Ready True
  ContainersReady True
  PodScheduled True
Volumes:
  tmp-volume:
    Type: EmptyDir (a temporary directory that shares a pod's lifetime)
    Medium:
    SizeLimit: <unset>
  kubernetes-dashboard-token-qjphl:
    Type: Secret (a volume populated by a Secret)
    SecretName: kubernetes-dashboard-token-qjphl
    Optional: false
QoS Class: BestEffort
Node-Selectors: <none>
Tolerations: node-role.kubernetes.io/master:NoSchedule
                 node.kubernetes.io/not-ready:NoExecute for 300s
                 node.kubernetes.io/unreachable:NoExecute for 300s
Events: <none>

Name: kubernetes-dashboard-7cf54d76b5-r8k8g
Namespace: kubernetes-dashboard
Priority: 0
Node: juju-cd5864-kubernetes-9/172.16.0.185
Start Time: Wed, 19 Feb 2020 20:46:50 +0000
Labels: k8s-app=kubernetes-dashboard
              pod-template-hash=7cf54d76b5
Annotations: <none>
Status: Running
IP: 172.16.1.199
IPs:
  IP: 172.16.1.199
Controlled By: ReplicaSet/kubernetes-dashboard-7cf54d76b5
Containe...

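One way to narrow down where the path to 172.16.1.199:8443 breaks would be to try the pod port from the worker hosting the pod and from the control-plane machine the apiserver proxies from; if it connects from the worker but not from the master, the inter-machine Calico route is the likely culprit. These commands are a suggestion rather than taken from the log, and the unit names depend on the deployment:

$ juju ssh kubernetes-worker/0 'nc -vz -w 5 172.16.1.199 8443'
$ juju ssh kubernetes-master/0 'nc -vz -w 5 172.16.1.199 8443'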

Joseph Borg (joeborg)
Changed in charmed-kubernetes-testing:
assignee: nobody → Joseph Borg (joeborg)
Joseph Borg (joeborg) wrote:

I cannot reproduce this using the same parameters I see on the failed job. Would it be okay to re-run to check it wasn't a one-off?

George Kraft (cynerva)
Changed in charmed-kubernetes-testing:
status: New → Incomplete