Bare Metal Deployment Guide for kolla-kubernetes in kolla-kubernetes

Bug #1748804 reported by Sliverman69-8
This bug affects 1 person
Affects: kolla-kubernetes
Status: New
Importance: Undecided
Assigned to: Unassigned

Bug Description

This bug tracker is for errors with the documentation, use the following as a template and remove or add fields as you see fit. Convert [ ] into [x] to check boxes:

- [x] This doc is inaccurate in this way: When setting up Kubernetes, the canal pod stalls at 2/3 ready containers and logs the following errors:
# kubectl logs canal-jdpd9 --container calico-node --namespace=kube-system
Checking datastore connection
Datastore connection verified
ERROR: Unable to set node resource configuration: Failed to ensure ThirdPartyResources exist: resource does not exist: {{ } {global-config.projectcalico.org kube-system %!s(int64=0) 0001-01-01 00:00:00 +0000 UTC <nil> %!s(*int64=<nil>) map[] map[] [] [] } Calico Global Configuration [{v1}]}
Terminating
time="2018-02-12T00:26:05Z" level=error msg="Hit error initializing TPR" error="resource does not exist: {{ } {global-bgp-config.projectcalico.org kube-system %!s(int64=0) 0001-01-01 00:00:00 +0000 UTC <nil> %!s(*int64=<nil>) map[] map[] [] [] } Calico Global BGP Configuration [{v1}]}"
time="2018-02-12T00:26:05Z" level=error msg="Hit error initializing TPR" error="resource does not exist: {{ } {system-network-policy.alpha.projectcalico.org kube-system %!s(int64=0) 0001-01-01 00:00:00 +0000 UTC <nil> %!s(*int64=<nil>) map[] map[] [] [] } Calico System Network Policies [{v1}]}"
time="2018-02-12T00:26:05Z" level=error msg="Hit error initializing TPR" error="resource does not exist: {{ } {global-bgp-peer.projectcalico.org kube-system %!s(int64=0) 0001-01-01 00:00:00 +0000 UTC <nil> %!s(*int64=<nil>) map[] map[] [] [] } Calico Global BGP Peers [{v1}]}"
time="2018-02-12T00:26:05Z" level=error msg="Hit error initializing TPR" error="resource does not exist: {{ } {ip-pool.projectcalico.org kube-system %!s(int64=0) 0001-01-01 00:00:00 +0000 UTC <nil> %!s(*int64=<nil>) map[] map[] [] [] } Calico IP Pools [{v1}]}"
time="2018-02-12T00:26:05Z" level=error msg="Hit error initializing TPR" error="resource does not exist: {{ } {global-config.projectcalico.org kube-system %!s(int64=0) 0001-01-01 00:00:00 +0000 UTC <nil> %!s(*int64=<nil>) map[] map[] [] [] } Calico Global Configuration [{v1}]}"
Calico node failed to start

The failure appears to come from calico-node being unable to set its node resource configuration because global-config.projectcalico.org does not exist anywhere. The instructions here may also be out of date relative to the Calico docs:
https://docs.projectcalico.org/v2.6/getting-started/kubernetes/

It appears that these subdomains do not exist, which would mean that no config is being pulled because they are empty. I think this part of the canal/calico config needs to be updated to reflect this. Here is a dig of the domain:
$ dig global-config.projectcalico.org

; <<>> DiG 9.9.7-P3 <<>> global-config.projectcalico.org
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NXDOMAIN, id: 32515
;; flags: qr rd ra; QUERY: 1, ANSWER: 0, AUTHORITY: 1, ADDITIONAL: 1

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 4096
;; QUESTION SECTION:
;global-config.projectcalico.org. IN A

;; AUTHORITY SECTION:
projectcalico.org. 300 IN SOA ns-cloud-b1.googledomains.com. hostmaster.projectcalico.org. 1 21600 3600 1209600 300

;; Query time: 59 msec
;; SERVER: 2601:601:9580:1390:21b:21ff:febb:e7ac#53(2601:601:9580:1390:21b:21ff:febb:e7ac)
;; WHEN: Sun Feb 11 17:45:23 PST 2018
;; MSG SIZE rcvd: 136

- [ ] This is a doc addition request.
- [ ] I have a fix to the document that I can paste below including example: input and output.

If you have a troubleshooting or support issue, use the following resources:

 - Ask OpenStack: http://ask.openstack.org
 - The mailing list: http://lists.openstack.org
 - IRC: 'openstack' channel on Freenode

-----------------------------------
Release: on 2018-02-07 21:56
SHA: d434f4a41ee2806f4c19d2e8c7d6f4799e6d48ef
Source: https://git.openstack.org/cgit/openstack/kolla-kubernetes/tree/doc/source/deployment-guide.rst
URL: https://docs.openstack.org/kolla-kubernetes/latest/deployment-guide.html

Sliverman69-8 (sliverman69-8) wrote :

I found a way to fix (or at least work around) this issue. The following steps got me past it:
1.) Remove docker version 17.05 or 1.12 and install 17.03:
# yum remove docker*
# yum install -y docker-engine-17.03*
2.) Install the kubernetes packages:
# yum install -y ebtables kubeadm kubectl kubelet kubernetes-cni git gcc
3.) Verify the kubelet configuration:
The kubelet must be started with --network-plugin=cni, and --cni-conf-dir and --cni-bin-dir must be set correctly in your 10-kubeadm.conf file:
/etc/systemd/system/kubelet.service.d/10-kubeadm.conf
*I am not sure whether this differs from what yum installs by default, because I had been changing many of these settings. Removing --network-plugin for a while caused problems, so I restored the settings from the link below.
4.) Follow the canal installation steps for Kubernetes 1.7+ on GitHub, with the modifications from the kolla-kubernetes install guide:
https://github.com/projectcalico/canal/tree/master/k8s-install#for-kubernetes-17
$ kubectl apply -f https://raw.githubusercontent.com/projectcalico/canal/master/k8s-install/1.7/rbac.yaml
$ curl -L -s https://raw.githubusercontent.com/projectcalico/canal/master/k8s-install/1.7/canal.yaml -o canal.yaml
$ sed -i "s@10.244.0.0/16@10.1.0.0/16@" canal.yaml
$ kubectl apply -f canal.yaml
5.) Once you have completed the steps above, the canal pod should reach 3/3 and start passing the DNS checks as well (though DNS would pass before this).
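
The checks in steps 3 and 5 can be scripted. The following is only a rough sketch, not something taken from the kolla-kubernetes guide: the function names check_kubelet_cni_flags and canal_all_ready are made up for illustration, and the second helper assumes the usual column layout of `kubectl get pods --namespace=kube-system` (NAME first, READY second).

```shell
# Hypothetical helpers for illustration; neither name comes from
# kolla-kubernetes, canal, or kubeadm.

# Step 3: succeed only if all three kubelet CNI flags appear in the
# drop-in file. The default path is the one quoted in step 3.
check_kubelet_cni_flags() {
    conf="${1:-/etc/systemd/system/kubelet.service.d/10-kubeadm.conf}"
    missing=0
    for flag in --network-plugin=cni --cni-conf-dir --cni-bin-dir; do
        # -F: match the flag as a fixed string; -e: pattern starts with a dash.
        if ! grep -qF -e "$flag" -- "$conf"; then
            echo "missing: $flag" >&2
            missing=1
        fi
    done
    return "$missing"
}

# Step 5: read `kubectl get pods` output on stdin and succeed only if at
# least one canal pod is listed and every canal pod reports 3/3 ready.
# Assumes column 1 is NAME and column 2 is READY.
canal_all_ready() {
    awk '$1 ~ /^canal-/ { seen = 1; if ($2 != "3/3") bad = 1 }
         END { if (seen && !bad) exit 0; exit 1 }'
}

# Example usage (needs a working cluster, so left commented out):
#   check_kubelet_cni_flags && echo "kubelet CNI flags present"
#   kubectl get pods --namespace=kube-system | canal_all_ready && echo "canal is 3/3"
```

Both helpers only report a pass or fail through their exit status; they do not change any configuration, so they should be safe to run before and after applying canal.yaml.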
