Magnum cluster create stays in "create_in_progress" and changes to "create_failed" after timeout
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
Magnum | New | Undecided | Unassigned |
Bug Description
The issue is observed with all COEs: Kubernetes, Swarm, and Mesos.
The same issue is observed with both the fedora-atomic and ubuntu-mesos images.
magnum --version
2.5.0
Deployed with OpenStack-Ansible
DISTRIB_ID="OSA"
DISTRIB_
DISTRIB_
root@utility-
+------
| uuid | name | keypair | node_count | master_count | status |
+------
| b1170284-
| c67e4610-
| 5bb46ecd-
| a70d819f-
| c8e23ae5-
| 314e50be-
+------
root@utility-
| Property | Value -------
| status | CREATE_FAILED |
| cluster_template_id | 9ee0ff65-
| node_addresses | [] |
| uuid | 5bb46ecd-
| stack_id | 75345ad3-
| status_reason | Timed out |
| created_at | 2017-09-
| updated_at | 2017-09-
| coe_version | v1.5.3 |
| faults | {'0': 'resources[0]: Stack CREATE cancelled', 'kube_masters': 'CREATE aborted (Task create from ResourceGroup "kube_masters" Stack "test03-
| keypair | mykey |
| api_address | https:/
| master_addresses | ['10.XX.XX.XX'] |
| create_timeout | 60 |
| node_count | 1 |
| discovery_url | https:/
| master_count | 1 |
| container_version | 1.12.6 |
| name | test03 |
+------
+------
+------
| cb2501bb-
| 6ec23a16-
| 75345ad3-
| 61b1c935-
| b90874f4-
| 9ae3ab0a-
+------
root@utility-
-------
| Property | Value -------
| capabilities | [] |
| creation_time | 2017-09-
| deletion_time | None |
| description | This template will boot a Kubernetes cluster with one |
| | or more minions (as specified by the number_of_minions |
| | parameter, which defaults to 1). |
| disable_rollback | True |
| id | 75345ad3-
| links | https:/
| notification_topics | [] |
| outputs | [ |
| | { |
| | "output_value": [ |
| | "10.20.30.22" |
| | ], |
| | "output_key": "kube_masters_
| | "description": "This is a list of the \"private\" IP addresses of all the Kubernetes masters.\n" |
| | }, |
| | { |
| | "output_value": [ |
| | "10.XX.XX.XX" |
| | ], |
| | "output_key": "kube_masters", |
| | "description": "This is a list of the \"public\" IP addresses of all the Kubernetes masters. Use these IP addresses to log in to the Kubernetes masters via ssh.\n" |
| | }, |
| | { |
| | "output_value": "", |
| | "output_key": "api_address", |
| | "description": "This is the API endpoint of the Kubernetes cluster. Use this to access the Kubernetes API.\n" |
| | }, |
| | { |
| | "output_value": null, |
| | "output_key": "kube_minions_
| | "description": "This is a list of the \"private\" IP addresses of all the Kubernetes minions.\n" |
| | }, |
| | { |
| | "output_value": null, |
| | "output_key": "kube_minions", |
| | "description": "This is a list of the \"public\" IP addresses of all the Kubernetes minions. Use these IP addresses to log in to the Kubernetes minions via ssh." |
| | }, |
| | { |
| | "output_value": "localhost:5000", |
| | "output_key": "registry_address", |
| | "description": "This is the url of docker registry server where you can store docker images." |
| | } |
| | ] |
| parameters | { |
| | "OS::project_id": "0df01d5576b545
| | "fixed_
| | "magnum_url": "https:/
| | "number_
| | "tenant_name": "0df01d5576b545
| | "wait_condition
| | "minion_flavor": "m1.small", |
| | "portal_
| | "auth_url": "https:/
| | "admission_
| | "registry_
| | "cluster_uuid": "5bb46ecd-
| | "kubernetes_port": "6443", |
| | "external_network": "4717b1be-
| | "trustee_
| | "flannel_backend": "udp", |
| | "fixed_subnet": "mysubnet", |
| | "region_name": "RegionOne", |
| | "kube_dashboard
| | "kube_dashboard
| | "no_proxy": "", |
| | "registry_port": "5000", |
| | "kube_version": "v1.5.3", |
| | "minions_
| | "https_proxy": "", |
| | "tls_disabled": "False", |
| | "trust_id": "******", |
| | "volume_driver": "cinder", |
| | "number_
| | "swift_region": "", |
| | "username": "admin", |
| | "http_proxy": "", |
| | "docker_
| | "OS::stack_name": "test03-
| | "system_
| | "insecure_
| | "system_
| | "registry_enabled": "False", |
| | "kube_allow_priv": "true", |
| | "password": "******", |
| | "loadbalancing_
| | "trustee_password": "******", |
| | "docker_
| | "registry_
| | "OS::stack_id": "75345ad3-
| | "registry_
| | "trustee_user_id": "c563b732e1fd41
| | "network_driver": "flannel", |
| | "fixed_network": "mynet", |
| | "master_flavor": "m1.small", |
| | "trustee_username": "5bb46ecd-
| | "ssh_key_name": "mykey", |
| | "flannel_
| | "flannel_
| | "discovery_url": "https:/
| | "dns_nameserver": "10.XX.XX.XX", |
| | "server_image": "fedora-
| | } |
| parent | None |
| stack_name | test03-ikz26ajpwlu3 |
| stack_owner | None |
| stack_status | CREATE_FAILED |
| stack_status_reason | Timed out |
| stack_user_
| tags | null |
| template_
| | or more minions (as specified by the number_of_minions |
| | parameter, which defaults to 1). |
| timeout_mins | 60 |
| updated_time | None |
-------
root@utility-
+------
+------
| api_address_
| api_lb | 6bd8bd1e-
| etcd_address_
| etcd_lb | 1a5d6e23-
| kube_masters | ef36144f-
| kube_minions | | OS::Heat:
| network | a500022c-
| secgroup_
| secgroup_
+------
+------
+------
| | "attributes": null, |
| | "refs": null, |
| | "refs_map": null, |
| | "removed_
| | } |
| creation_time | 2017-09-
| description | |
| links | https:/
| | https:/
| | https:/
| logical_resource_id | kube_masters |
| physical_
| required_by | api_address_
| | etcd_address_
| resource_name | kube_masters |
| resource_status | CREATE_FAILED |
| resource_
| resource_type | OS::Heat:
| updated_time | 2017-09-
+------
If anyone has hit this issue and resolved it, or knows how it can be resolved, please share your thoughts.
If you need more data, let me know.
Thank you.
I am able to log in to the instances that were created. Some services on the master node, such as flanneld, etcd, and docker, are not running. I tried starting them manually, hoping that would help before the timeout, but the cluster creation still failed.
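
For anyone trying to reproduce the service check on the master node, here is a minimal sketch. It assumes the node runs systemd and that the units are named `etcd`, `flanneld`, and `docker`; the unit names are an assumption and may differ between images. `check_services` is a hypothetical helper name, not part of Magnum.

```shell
#!/bin/sh
# Hypothetical helper: print the systemd state of each service given as an argument.
# Unit names (etcd, flanneld, docker) are assumed; adjust them for the image in use.
check_services() {
    for svc in "$@"; do
        # `systemctl is-active` prints the unit state (active, inactive, failed, ...)
        # and exits non-zero when the unit is not active, hence the `|| true`.
        state=$(systemctl is-active "$svc" 2>/dev/null) || true
        # Fall back to "unknown" if systemctl printed nothing (e.g. not installed).
        [ -n "$state" ] || state="unknown"
        printf '%s: %s\n' "$svc" "$state"
    done
}

check_services etcd flanneld docker
```

For any unit reported as not active, `journalctl -u <unit>` on the same node should show why it did not start.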