time out on kube_masters creation

Bug #1770349 reported by masha atakova
This bug affects 6 people

Affects: Magnum
Status: New
Importance: Undecided
Assigned to: Unassigned

Bug Description

OpenStack Queens on CentOS: Magnum consistently fails cluster creation with the following errors:

{u'enable_prometheus_monitoring_deployment': u'CREATE aborted (Task create from SoftwareDeployment "enable_prometheus_monitoring_deployment" Stack "kubernetes-cluster-gsllydg6tqcl-kube_masters-exq6am3fngts-0-xeu4xrscjyqu" [77ef867c-e356-4091-a544-9761c114cfcd] Timed out)',
 u'kube_masters': u'CREATE aborted (Task create from ResourceGroup "kube_masters" Stack "kubernetes-cluster-gsllydg6tqcl" [5315420c-58b5-4278-a0cf-e76762fa6ed1] Timed out)',
 u'calico_service_deployment': u'CREATE aborted (Task create from SoftwareDeployment "calico_service_deployment" Stack "kubernetes-cluster-gsllydg6tqcl-kube_masters-exq6am3fngts-0-xeu4xrscjyqu" [77ef867c-e356-4091-a544-9761c114cfcd] Timed out)',
 u'0': u'resources[0]: Stack CREATE cancelled',
 u'enable_cert_manager_api_deployment': u'CREATE aborted (Task create from SoftwareDeployment "enable_cert_manager_api_deployment" Stack "kubernetes-cluster-gsllydg6tqcl-kube_masters-exq6am3fngts-0-xeu4xrscjyqu" [77ef867c-e356-4091-a544-9761c114cfcd] Timed out)',
 u'kubernetes_dashboard_deployment': u'CREATE aborted (Task create from SoftwareDeployment "kubernetes_dashboard_deployment" Stack "kubernetes-cluster-gsllydg6tqcl-kube_masters-exq6am3fngts-0-xeu4xrscjyqu" [77ef867c-e356-4091-a544-9761c114cfcd] Timed out)',
 u'enable_ingress_controller_deployment': u'CREATE aborted (Task create from SoftwareDeployment "enable_ingress_controller_deployment" Stack "kubernetes-cluster-gsllydg6tqcl-kube_masters-exq6am3fngts-0-xeu4xrscjyqu" [77ef867c-e356-4091-a544-9761c114cfcd] Timed out)'}
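The map above is Heat's per-resource failure report for the stack; everything except the cancelled resources[0] entry failed on a timeout. As a purely hypothetical illustration (the helper below is not part of Magnum or Heat), such a map can be filtered down to the resources that actually hit the timeout:

```python
# Hypothetical helper (not part of Magnum/Heat): filter a Heat per-resource
# failure map, like the one quoted above, down to the timed-out entries.
def timed_out_resources(reasons):
    """Return the names of resources whose failure message mentions a timeout."""
    return sorted(name for name, msg in reasons.items() if "Timed out" in msg)

# Abbreviated version of the map from the bug report.
reasons = {
    "kube_masters": 'CREATE aborted (Task create from ResourceGroup '
                    '"kube_masters" ... Timed out)',
    "0": "resources[0]: Stack CREATE cancelled",
}
print(timed_out_resources(reasons))  # ['kube_masters']
```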

I looked inside the master VM and saw this in the logs:

/var/log/cloud-init.log:cloudinit.util.ProcessExecutionError: Unexpected error while running command.
/var/log/cloud-init.log-Command: ['/var/lib/cloud/instance/scripts/part-009']
/var/log/cloud-init-output.log- New size given (1280 extents) not larger than existing size (9983 extents)
/var/log/cloud-init-output.log:ERROR: There is not enough free space in volume group atomicos to create data volume of size MIN_DATA_SIZE=2

# lvs
  LV VG Attr LSize Pool Origin Data% Meta% Move Log Cpy%Sync Convert
  root atomicos -wi-ao---- <39.00g

In my Pike setup it's different, but I see that a lot of things changed since then:

# lvs
  LV VG Attr LSize Pool Origin Data% Meta% Move Log Cpy%Sync Convert
  docker-pool atomicos twi-a-t--- 13.95g 0.00 0.16
  root atomicos -wi-ao---- 5.00g
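The Queens listing shows why the setup script gives up: the root LV already occupies the entire atomicos volume group, so there are no free extents left for the 2 GB data volume, whereas on Pike the small root LV left headroom for docker-pool. A rough sketch of that check (MIN_DATA_SIZE comes from the cloud-init output above; the real docker-storage-setup logic may differ):

```python
# Rough sketch of the free-space check implied by the error above; the real
# docker-storage-setup script may compute this differently.
MIN_DATA_SIZE_GB = 2  # from "MIN_DATA_SIZE=2" in the cloud-init output

def can_create_data_volume(vg_size_gb, allocated_gb):
    """True if the volume group still has room for the minimum data LV."""
    return vg_size_gb - allocated_gb >= MIN_DATA_SIZE_GB

# Queens master: the root LV (<39.00g) fills the volume group completely.
print(can_create_data_volume(39.0, 39.0))  # False -> "not enough free space"
```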

The cluster was created exactly as described in the manual, with the only difference being that I tried a medium flavor for the master node, in case it was a free-space problem.

# yum list | grep magnum
openstack-magnum-api.noarch 6.1.0-1.el7 @centos-openstack-queens
openstack-magnum-common.noarch 6.1.0-1.el7 @centos-openstack-queens
openstack-magnum-conductor.noarch 6.1.0-1.el7 @centos-openstack-queens
openstack-magnum-ui.noarch 4.0.0-1.el7 @centos-openstack-queens
python-magnum.noarch 6.1.0-1.el7 @centos-openstack-queens

Revision history for this message
Gildas Cherruel (gildas) wrote :

Noticed this happens for both Kubernetes and Swarm COEs.

Revision history for this message
Christian Zunker (christian-zunker) wrote :

I had the same problem with a k8s cluster. In my case, it turned out that this was the underlying problem:
https://bugs.launchpad.net/magnum/+bug/1744362

These two helped me fix the problem:
https://ask.openstack.org/en/question/102214/software-deployment-in-heat-problem-with-os-collect-config/
https://bugs.launchpad.net/kolla-ansible/+bug/1762754

The problem was actually in heat config, not magnum.
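For readers hitting the same symptom: the linked reports describe os-collect-config on the guest VMs failing to reach Keystone, because Heat hands the instances an auth URL that is only reachable on the internal network, so the SoftwareDeployments never report back and eventually time out. The usual remedy (option names should be verified against the linked reports and your Heat version) is to point Heat's Keystone client at an endpoint the VMs can reach in heat.conf:

```ini
# heat.conf on the controller: a Keystone endpoint that guest VMs can reach.
# (Hypothetical host; substitute your public Keystone URL.)
[clients_keystone]
auth_uri = http://public-keystone.example.com:5000
```

then restart the Heat services so newly created clusters pick up the setting.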
