apiserver can't access heapster

Bug #1757936 reported by PanFengyun
This bug affects 2 people
Affects   Status          Importance   Assigned to       Milestone
Magnum (status tracked in Rocky)
  Queens  Fix Committed   Undecided    Spyros Trigazis
  Rocky   Fix Released    Undecided    Spyros Trigazis

Bug Description

The Kubernetes HPA doesn't work:
---
  Warning FailedGetResourceMetric 3m (x226 over 16h) horizontal-pod-autoscaler unable to get metrics for resource cpu: failed to get pod resource metrics: an error on the server ("Error: 'dial tcp 10.100.93.2:8082: connect: connection timed out'\nTrying to reach: 'http://10.100.93.2:8082/apis/metrics/v1alpha1/namespaces/default/pods?labelSelector=run%3Dphp-apache'") has prevented the request from succeeding (get services http:heapster:)
---
The log means that kube-apiserver can't access heapster on the master node, but I can access heapster from a minion node:
---
[root@cl-ptct2aceh-0-lrtdb33taapq-kube-master-uxythaeaxs7e dashboard]# curl 10.100.93.6:8082
^C
[root@cl-ptct2aceh-0-lrtdb33taapq-kube-master-uxythaeaxs7e dashboard]# ping 10.100.93.9
PING 10.100.93.9 (10.100.93.9) 56(84) bytes of data.
^C
--- 10.100.93.9 ping statistics ---
7 packets transmitted, 0 received, 100% packet loss, time 6132ms

[root@cl-gj6xa5wxt-0-qjzkktc2bwdx-kube-minion-lnpu6xqe5qp2 ~]# curl http://10.100.93.6:8082/apis/metrics/v1alpha1/namespaces/default/pods?labelSelector=run%3Dphp-apache
{
  "metadata": {},
  "items": [
   {
    "metadata": {
     "name": "php-apache-757c76488d-cmrmj",
     "namespace": "default",
     "creationTimestamp": "2018-03-22T04:22:21Z"
    },
    "timestamp": "2018-03-22T04:22:00Z",
    "window": "1m0s",
    "containers": [
     {
      "name": "php-apache",
      "usage": {
       "cpu": "0",
       "memory": "14720Ki"
      }
     }
    ]
   }
  ]
 }
---
So I think the cluster network has an issue:
If there is no flannel or calico service on the master, how can kube-apiserver access heapster?
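
The curl comparison above can be repeated from any node with a plain TCP check. This is a minimal Python sketch, not part of Magnum or Kubernetes; the helper name `can_connect` and the 3-second timeout are assumptions made for illustration. It only reports whether a TCP connection to the given address opens in time, which is enough to tell "master has no route to the pod network" apart from "heapster is down".

```python
import socket


def can_connect(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port opens within `timeout` seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers timeouts, refused connections, and unreachable networks.
        return False


# Example call, using the pod IP and port from the curl commands above:
#   can_connect("10.100.93.6", 8082)
```

Running the same call on the master and on a minion should reproduce the difference seen with curl: a timeout on the master and a successful connection on the minion.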

cc @Spyros Trigazis

PanFengyun (pan-feng-yun) wrote :

I think we should specify hostNetwork: true for heapster.
I have tested it:
[root@cl-ptct2aceh-0-lrtdb33taapq-kube-master-uxythaeaxs7e dashboard]# kubectl get hpa
NAME          REFERENCE                TARGETS    MINPODS   MAXPODS   REPLICAS   AGE
php-apache    Deployment/php-apache    0% / 20%   1         10        1          18h
php-apache2   Deployment/php-apache2   0% / 20%   1         10        1          18h

Do you agree?

Changed in magnum:
assignee: nobody → PanFengyun (pan-feng-yun)
status: New → In Progress
Spyros Trigazis (strigazi) wrote :

Let's add flannel back on the master node so that access through "kubectl proxy" works as well. I have tested this option and it works well.

Enabling hostNetwork would give anyone in the cluster access to heapster.

PanFengyun (pan-feng-yun) wrote :

Ok, I agree.

PanFengyun (pan-feng-yun) wrote :

OTOH, I ran into another issue with heapster:
E0326 10:15:05.001773 1 kubelet.go:271] No nodes received from APIserver.
E0326 10:15:07.423956 1 reflector.go:190] k8s.io/heapster/metrics/util/util.go:51: Failed to list *v1.Node: Get https://kubernetes.default/api/v1/nodes?resourceVersion=0: dial tcp: lookup kubernetes.default on 8.8.8.8:53: no such host
E0326 10:15:07.424156 1 reflector.go:190] k8s.io/heapster/metrics/processors/namespace_based_enricher.go:84: Failed to list *v1.Namespace: Get https://kubernetes.default/api/v1/namespaces?resourceVersion=0: dial tcp: lookup kubernetes.default on 8.8.8.8:53: no such host
E0326 10:15:07.424216 1 reflector.go:190] k8s.io/heapster/metrics/util/util.go:51: Failed to list *v1.Node: Get https://kubernetes.default/api/v1/nodes?resourceVersion=0: dial tcp: lookup kubernetes.default on 8.8.8.8:53: no such host
E0326 10:15:07.424306 1 reflector.go:190] k8s.io/heapster/metrics/heapster.go:322: Failed to list *v1.Pod: Get https://kubernetes.default/api/v1/pods?resourceVersion=0: dial tcp: lookup kubernetes.default on 8.8.8.8:53: no such host
E0326 10:15:07.424373 1 reflector.go:190] k8s.io/heapster/metrics/util/util.go:51: Failed to list *v1.Node: Get https://kubernetes.default/api/v1/nodes?resourceVersion=0: dial tcp: lookup kubernetes.default on 8.8.8.8:53: no such host

Have you seen this?
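
Every failure in the log above is a DNS lookup of kubernetes.default sent to 8.8.8.8, a public resolver that cannot know a cluster-internal service name. A small Python sketch makes that visible by extracting the distinct (hostname, DNS server) pairs from such log text. The helper name `failed_lookups` and the regular expression are assumptions written for this log format, not part of heapster.

```python
import re
from typing import Set, Tuple

# Matches the tail of a Go net error such as
# "dial tcp: lookup kubernetes.default on 8.8.8.8:53: no such host".
LOOKUP_RE = re.compile(
    r"lookup (?P<name>\S+) on (?P<server>[\d.]+):(?P<port>\d+): no such host"
)


def failed_lookups(log_text: str) -> Set[Tuple[str, str]]:
    """Return the distinct (hostname, DNS server) pairs that failed to resolve."""
    return {(m["name"], m["server"]) for m in LOOKUP_RE.finditer(log_text)}
```

If the only pair returned is kubernetes.default on 8.8.8.8, the pod is not querying the cluster DNS service for that name.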

PanFengyun (pan-feng-yun) wrote :

Oh, I figured it out: we should use "--source=kubernetes.summary_api:''".
-----
- - --source=kubernetes:https://kubernetes.default
+ - --source=kubernetes.summary_api:''

OpenStack Infra (hudson-openstack) wrote : Change abandoned on magnum (master)

Change abandoned by PanFengyun (pan_feng_yun@163.com) on branch: master
Review: https://review.openstack.org/555154

OpenStack Infra (hudson-openstack) wrote : Fix proposed to magnum (master)

Fix proposed to branch: master
Review: https://review.openstack.org/558836

Changed in magnum:
assignee: nobody → Spyros Trigazis (strigazi)
OpenStack Infra (hudson-openstack) wrote : Fix merged to magnum (master)

Reviewed: https://review.openstack.org/558836
Committed: https://git.openstack.org/cgit/openstack/magnum/commit/?id=405b0c20288c170fd8e4dac8a3296602564db1c8
Submitter: Zuul
Branch: master

commit 405b0c20288c170fd8e4dac8a3296602564db1c8
Author: Spyros Trigazis <email address hidden>
Date: Wed Apr 4 14:13:36 2018 +0000

    k8s_fedora: Add flannel to master nodes

    To allow ther api server access pods, we need
    flannel to be running on the master node.
    * Run flannel on the master node in a system
      container.

    Change-Id: Ic0996ba36e335e970f3d2255840b24a8b4f738b8
    Closes-Bug: #1757936

Changed in magnum:
status: In Progress → Fix Released
OpenStack Infra (hudson-openstack) wrote : Fix proposed to magnum (stable/queens)

Fix proposed to branch: stable/queens
Review: https://review.openstack.org/564199

OpenStack Infra (hudson-openstack) wrote : Fix merged to magnum (stable/queens)

Reviewed: https://review.openstack.org/564199
Committed: https://git.openstack.org/cgit/openstack/magnum/commit/?id=1382e6f378eba6770588e331fb2b13ece4997e8d
Submitter: Zuul
Branch: stable/queens

commit 1382e6f378eba6770588e331fb2b13ece4997e8d
Author: Spyros Trigazis <email address hidden>
Date: Wed Apr 4 14:13:36 2018 +0000

    k8s_fedora: Add flannel to master nodes

    To allow ther api server access pods, we need
    flannel to be running on the master node.
    * Run flannel on the master node in a system
      container.

    Change-Id: Ic0996ba36e335e970f3d2255840b24a8b4f738b8
    Closes-Bug: #1757936
    (cherry picked from commit 405b0c20288c170fd8e4dac8a3296602564db1c8)

OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/magnum 6.1.1

This issue was fixed in the openstack/magnum 6.1.1 release.

OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/magnum 7.0.0

This issue was fixed in the openstack/magnum 7.0.0 release.
