unable to load server certificate: open /etc/kubernetes/certs/server.crt

Bug #1751409 reported by Yufen Kuo
This bug affects 3 people
Affects: Magnum
Status: New
Importance: Undecided
Assigned to: Sama Madhu Sudhan Reddy
Milestone: (none)

Bug Description

When creating a Kubernetes cluster, it times out with the following faults:
| faults | {'0': 'resources[0]: Stack CREATE cancelled', 'enable_prometheus_monitoring_deployment': 'CREATE aborted (Task create from SoftwareDeployment "enable_prometheus_monitoring_deployment" Stack "k8s-cluster-1-7ejyiqff5xqg-kube_masters-ie3yd4p4w5ep-0-5tcyaeemtt4w" [122f22c9-666c-4a03-9b1c-958ca45b834a] Timed out)', 'kube_masters': 'CREATE aborted (Task create from ResourceGroup "kube_masters" Stack "k8s-cluster-1-7ejyiqff5xqg" [dbe2e1e0-3881-4ddd-a2b1-f9e4d277f126] Timed out)', 'master_wait_condition': 'CREATE aborted (Task create from HeatWaitCondition "master_wait_condition" Stack "k8s-cluster-1-7ejyiqff5xqg-kube_masters-ie3yd4p4w5ep-0-5tcyaeemtt4w" [122f22c9-666c-4a03-9b1c-958ca45b834a] Timed out)'} |
| api_address | http://:8080

# magnum --version
2.7.0

When SSHing into the master node to debug, checking the cloud-init-output.log file shows it is hanging at the "Waiting for Kubernetes API..." line:

Created symlink /etc/systemd/system/multi-user.target.wants/kube-system-namespace.service → /etc/systemd/system/kube-system-namespace.service.
Writing File: /etc/kubernetes/manifests/kube-coredns.yaml
Waiting for Kubernetes API...
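The "Waiting for Kubernetes API..." message comes from a poll loop in the master's bootstrap scripts that retries until the API answers. A minimal sketch of that kind of loop (the function name, probe command, and `http://127.0.0.1:8080/healthz` endpoint are assumptions for illustration, not taken verbatim from the Magnum heat scripts):

```shell
#!/bin/sh
# Sketch of a "wait for the API" poll loop. $1 is the probe command,
# $2 the maximum number of attempts before giving up.
wait_for_api() {
    probe="$1"; retries="$2"
    while [ "$retries" -gt 0 ]; do
        # In this bug the probe never succeeds, because kube-apiserver
        # keeps crashing on the missing server certificate, so the loop
        # (which in the real script has no retry cap) hangs forever.
        if $probe; then
            echo "Kubernetes API is up"
            return 0
        fi
        retries=$((retries - 1))
        sleep 1
    done
    echo "gave up waiting"
    return 1
}

# On the master node the probe would be something like (endpoint assumed):
#   wait_for_api "curl -sf -o /dev/null http://127.0.0.1:8080/healthz" 60
```

Because cloud-init blocks here, Heat's `master_wait_condition` never receives its signal, which is why the stack eventually reports "Timed out".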

[fedora@k8s-cluster-3-nmkmu36k6i7a-master-0 etc]$ cat /etc/fedora-release
Fedora release 25 (Twenty Five)

Upon checking kube-apiserver, it failed to start due to a missing certificate.

[fedora@k8s-cluster-3-nmkmu36k6i7a-master-0 log]$ systemctl status kube-apiserver
● kube-apiserver.service - kubernetes-apiserver
   Loaded: loaded (/etc/systemd/system/kube-apiserver.service; enabled; vendor preset: disabled)
   Active: failed (Result: start-limit-hit) since Sat 2018-02-24 02:58:05 UTC; 14s ago
  Process: 3148 ExecStop=/bin/runc --systemd-cgroup kill kube-apiserver (code=exited, status=1/FAILURE)
  Process: 3121 ExecStart=/bin/runc --systemd-cgroup run kube-apiserver (code=exited, status=1/FAILURE)
 Main PID: 3121 (code=exited, status=1/FAILURE)

Feb 24 02:58:04 k8s-cluster-3-nmkmu36k6i7a-master-0.novalocal systemd[1]: kube-apiserver.service: Unit entered failed state.
Feb 24 02:58:04 k8s-cluster-3-nmkmu36k6i7a-master-0.novalocal systemd[1]: kube-apiserver.service: Failed with result 'exit-code'.
Feb 24 02:58:05 k8s-cluster-3-nmkmu36k6i7a-master-0.novalocal systemd[1]: kube-apiserver.service: Service hold-off time over, scheduling restart.
Feb 24 02:58:05 k8s-cluster-3-nmkmu36k6i7a-master-0.novalocal systemd[1]: Stopped kubernetes-apiserver.
Feb 24 02:58:05 k8s-cluster-3-nmkmu36k6i7a-master-0.novalocal systemd[1]: kube-apiserver.service: Start request repeated too quickly.
Feb 24 02:58:05 k8s-cluster-3-nmkmu36k6i7a-master-0.novalocal systemd[1]: Failed to start kubernetes-apiserver.
Feb 24 02:58:05 k8s-cluster-3-nmkmu36k6i7a-master-0.novalocal systemd[1]: kube-apiserver.service: Unit entered failed state.
Feb 24 02:58:05 k8s-cluster-3-nmkmu36k6i7a-master-0.novalocal systemd[1]: kube-apiserver.service: Failed with result 'start-limit-hit'.
[fedora@k8s-cluster-3-nmkmu36k6i7a-master-0 log]$ sudo journalctl -u kube-apiserver
-- Logs begin at Sat 2018-02-24 02:56:15 UTC, end at Sat 2018-02-24 02:58:47 UTC. --
Feb 24 02:58:01 k8s-cluster-3-nmkmu36k6i7a-master-0.novalocal systemd[1]: Started kubernetes-apiserver.
Feb 24 02:58:02 k8s-cluster-3-nmkmu36k6i7a-master-0.novalocal runc[2204]: I0224 02:58:02.052311 1 server.go:112] Version: v1.7.4
Feb 24 02:58:02 k8s-cluster-3-nmkmu36k6i7a-master-0.novalocal runc[2204]: W0224 02:58:02.053602 1 authentication.go:368] AnonymousAuth is not allowed with the AllowAll
Feb 24 02:58:02 k8s-cluster-3-nmkmu36k6i7a-master-0.novalocal runc[2204]: W0224 02:58:02.053841 1 server.go:610] No TLS key provided, service account token authenticati
Feb 24 02:58:02 k8s-cluster-3-nmkmu36k6i7a-master-0.novalocal runc[2204]: unable to load server certificate: open /etc/kubernetes/certs/server.crt: no such file or directory
Feb 24 02:58:02 k8s-cluster-3-nmkmu36k6i7a-master-0.novalocal systemd[1]: kube-apiserver.service: Main process exited, code=exited, status=1/FAILURE
Feb 24 02:58:02 k8s-cluster-3-nmkmu36k6i7a-master-0.novalocal runc[2704]: container "kube-apiserver" does not exist
Feb 24 02:58:02 k8s-cluster-3-nmkmu36k6i7a-master-0.novalocal systemd[1]: kube-apiserver.service: Control process exited, code=exited status=1
Feb 24 02:58:02 k8s-cluster-3-nmkmu36k6i7a-master-0.novalocal systemd[1]: kube-apiserver.service: Unit entered failed state.
Feb 24 02:58:02 k8s-cluster-3-nmkmu36k6i7a-master-0.novalocal systemd[1]: kube-apiserver.service: Failed with result 'exit-code'.
Feb 24 02:58:02 k8s-cluster-3-nmkmu36k6i7a-master-0.novalocal systemd[1]: kube-apiserver.service: Service hold-off time over, scheduling restart.
Feb 24 02:58:02 k8s-cluster-3-nmkmu36k6i7a-master-0.novalocal systemd[1]: Stopped kubernetes-apiserver.
Feb 24 02:58:02 k8s-cluster-3-nmkmu36k6i7a-master-0.novalocal systemd[1]: Started kubernetes-apiserver.
Feb 24 02:58:02 k8s-cluster-3-nmkmu36k6i7a-master-0.novalocal runc[2800]: I0224 02:58:02.932726 1 server.go:112] Version: v1.7.4
Feb 24 02:58:02 k8s-cluster-3-nmkmu36k6i7a-master-0.novalocal runc[2800]: W0224 02:58:02.933510 1 authentication.go:368] AnonymousAuth is not allowed with the AllowAll
Feb 24 02:58:02 k8s-cluster-3-nmkmu36k6i7a-master-0.novalocal runc[2800]: W0224 02:58:02.933767 1 server.go:610] No TLS key provided, service account token authenticati
Feb 24 02:58:02 k8s-cluster-3-nmkmu36k6i7a-master-0.novalocal runc[2800]: unable to load server certificate: open /etc/kubernetes/certs/server.crt: no such file or directory
Feb 24 02:58:02 k8s-cluster-3-nmkmu36k6i7a-master-0.novalocal systemd[1]: kube-apiserver.service: Main process exited, code=exited, status=1/FAILURE
Feb 24 02:58:03 k8s-cluster-3-nmkmu36k6i7a-master-0.novalocal runc[2956]: container "kube-apiserver" does not exist
Feb 24 02:58:03 k8s-cluster-3-nmkmu36k6i7a-master-0.novalocal systemd[1]: kube-apiserver.service: Control process exited, code=exited status=1
[fedora@k8s-cluster-3-nmkmu36k6i7a-master-0 log]$ ls -l /etc/kubernetes/certs/
total 0
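The `ls` output shows the certs directory exists but is empty. A quick way to confirm which of the expected files are absent before restarting the service (only `server.crt` appears in the error message; `server.key` and `ca.crt` are assumed here from the usual kube-apiserver TLS flags):

```shell
#!/bin/sh
# Report any expected certificate file that is missing or empty in the
# given directory; returns non-zero if anything is wrong.
check_certs() {
    dir="$1"
    missing=0
    for f in server.crt server.key ca.crt; do
        # -s: file exists and is non-empty
        if [ ! -s "$dir/$f" ]; then
            echo "missing or empty: $dir/$f"
            missing=1
        fi
    done
    return $missing
}

# On the master node: check_certs /etc/kubernetes/certs
```

In this report every file is missing, so the certificate-fetching step on the master must have failed or never run at all.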

Revision history for this message
Yufen Kuo (ykuo) wrote :

# magnum cluster-template-create --name k8s-1 --image fedora-atomic-latest --keypair mykey --external-network provider --dns-nameserver 10.40.0.5 --master-flavor m1.large --flavor m1.large --coe kubernetes --network-driver flannel

# magnum cluster-create --name k8s-cluster-3 --cluster-template k8s-1 --master-count 1 --node-count 1

Changed in magnum:
assignee: nobody → Sama Madhu Sudhan Reddy (smsreddy)
Revision history for this message
Spyros Trigazis (strigazi) wrote :

With which magnum release do you have this issue?
Which Fedora Atomic image are you using?

Revision history for this message
Yufen Kuo (ykuo) wrote :

I was using Pike when I hit this issue.

After upgrading to Queens and applying this patch https://github.com/openstack/magnum/commit/5a34d7d830ad4b6a714f079d4575e1705df434f3#diff-5149fbecc3eedea0e14cb30ed34b82b1,
the issue is no longer reproducible.

