ocata deploy fails randomly with ceilo upgrade failing with keystone
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
tripleo | Fix Released | High | Emilien Macchi |
Bug Description
Description of problem:
The overcloud deployment fails randomly.
Error: ceilometer-upgrade --skip-
If you re-run the deployment command, it resumes from where it failed and finishes successfully. It appears that Keystone is not available when ceilometer needs it: ceilometer-upgrade tries to authenticate with Keystone to create resource types in Gnocchi, but Keystone returns a 503:
10.35.191.20 - - [02/Jul/
/v1/resource_
"ceilometer-upgrade keystoneauth1/
CPython/2.7.5"
2017-07-02 14:48:11.800 116449 ERROR keystonemiddlew
Bad response code while validating token: 503
2017-07-02 14:48:11.800 116449 WARNING keystonemiddlew
[-] Identity response: <html><body><h1>503 Service Unavailable</h1>
No server is available to handle this request.
and ceilometer-upgrade fails with:
2017-07-02 14:48:11.803 123807 CRITICAL ceilometer [-]
ClientException: {"message": "The server is currently unavailable.
Please try again at a later time.<br /><br />\n\n\n", "code": "503
Service Unavailable", "title": "Service Unavailable"} (HTTP 503)
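To illustrate the race described above, here is a minimal Python sketch of a "wait until the endpoint stops answering 503" guard that could run before a step such as ceilometer-upgrade. It is not the fix that was applied in tripleo. The helper names `http_status` and `wait_until_available`, the retry count, and the delay are all hypothetical, and the endpoint URL to probe is left to the reader.

```python
import time
import urllib.error
import urllib.request


def http_status(url, timeout=5.0):
    """Return the HTTP status code of a GET on url (hypothetical probe helper)."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as exc:
        # urllib raises on 4xx/5xx responses; the status code is still available.
        return exc.code
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, DNS failure: treat as unavailable.
        return 503


def wait_until_available(probe, attempts=10, delay=3.0, sleep=time.sleep):
    """Call probe() until it stops returning 503 or the attempts run out.

    probe is any zero-argument callable returning an HTTP status code.
    Returns True if a non-503 status was seen, False otherwise.
    """
    for attempt in range(1, attempts + 1):
        if probe() != 503:
            return True
        if attempt < attempts:
            # Back off before the next probe; no sleep after the last one.
            sleep(delay)
    return False
```

A caller would pass something like `lambda: http_status(keystone_url)`, where `keystone_url` is whatever identity endpoint the deployment exposes, and only start the dependent step once the function returns True. Passing the probe as a callable keeps the retry loop testable without a network.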
How reproducible:
randomly
Steps to Reproduce:
1. Deploy the default plan. I deployed a topology of 3 controllers + 1 compute + 3 ceph with this command: openstack overcloud deploy --templates --ntp-server clock.redhat.com -e /usr/share/
2. If you hit this error, re-run the command; it resumes from where it failed and passes the second time.
Expected results:
Deployment should pass the first time. You shouldn't hit errors due to keystone not being up and running on time.
Changed in tripleo: | |
importance: | Undecided → High |
status: | New → Triaged |
Changed in tripleo: | |
milestone: | none → pike-3 |
tags: | added: ocata-backport-potential |
Changed in tripleo: | |
assignee: | Alex Schultz (alex-schultz) → Emilien Macchi (emilienm) |
Pradeep, I think we have the same issue here
https://thirdparty-logs.rdoproject.org/jenkins-tripleo-quickstart-ocata-rdo_trunk-baremetal-dell_pe_r630-bond_with_vlans-221/undercloud/home/stack/failed_deployments.log.gz