meta-data service fails/unreliable in the production cluster

Bug #1410947 reported by Ananth Suryanarayana
Affects: Juniper Openstack
Status: Incomplete
Importance: High
Assigned to: Unassigned
Milestone: (none)

Bug Description

Some CI job failures are caused by the meta-data service failing. We found that keystone (which provides the meta-data service) is stuck at or near 100% CPU on the openstack node in the production cluster.

e.g.

ubuntu-build03 /users/anantha> sshpass -p c0ntrail123 ssh -q root@10.84.26.14 top -n1 -b | \grep keystone
13497 keystone 20 0 211m 55m 5820 R 96 0.0 10:55.14 keystone-all
ubuntu-build03 /users/anantha> sshpass -p c0ntrail123 ssh -q root@10.84.26.14 top -n1 -b | \grep keystone
13497 keystone 20 0 211m 55m 5820 R 97 0.0 10:56.72 keystone-all
ubuntu-build03 /users/anantha> sshpass -p c0ntrail123 ssh -q root@10.84.26.14 top -n1 -b | \grep keystone
13497 keystone 20 0 211m 55m 5820 R 100 0.0 10:57.87 keystone-all
ubuntu-build03 /users/anantha>
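
The manual sampling above could be scripted. The following is a minimal sketch, not something taken from the cluster: it assumes the default `top` batch layout seen in the output above (PID in the first column, %CPU in the ninth, command name last), and the function name `cpu_of` is hypothetical.

```shell
#!/bin/sh
# cpu_of NAME: read `top -b -n1` output on stdin and print "PID %CPU"
# for every process whose command column equals NAME.
# Assumption: default top layout, i.e. PID is field 1, %CPU is field 9,
# and the command name is the last field on the line.
cpu_of() {
    awk -v name="$1" '$NF == name { print $1, $9 }'
}

# Example using one line captured in this report:
printf '%s\n' '13497 keystone 20 0 211m 55m 5820 R 96 0.0 10:55.14 keystone-all' \
    | cpu_of keystone-all
```

On the build host the same function could be fed from the ssh command shown above, for example `ssh -q root@HOST top -n1 -b | cpu_of keystone-all`, repeated a few times to confirm that the CPU usage stays high (HOST is a placeholder).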

The cluster is running 1.20-63.

Affected jobs usually end up with the default host name ci-oc-slave, which is incorrect. Ideally each job should get the correct host name as retrieved from the meta-data service. I have seen meta-data queries still failing after a minute or so.
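
One way a CI slave could guard against this is to retry the meta-data query and fail loudly instead of silently keeping the default host name. This is only a sketch under assumptions: it uses the EC2-compatible endpoint `http://169.254.169.254/latest/meta-data/hostname`, the helper names `fetch_hostname` and `resolve_hostname` are hypothetical, and nothing here reflects how the CI jobs actually query the service.

```shell
#!/bin/sh
# Assumed EC2-compatible metadata endpoint; override METADATA_URL if needed.
METADATA_URL="${METADATA_URL:-http://169.254.169.254/latest/meta-data/hostname}"

# fetch_hostname: hypothetical helper that performs one metadata query.
# -s silences progress output, -f makes HTTP errors return non-zero,
# --max-time bounds each attempt to 5 seconds.
fetch_hostname() {
    curl -sf --max-time 5 "$METADATA_URL"
}

# resolve_hostname ATTEMPTS DELAY: call fetch_hostname up to ATTEMPTS times,
# sleeping DELAY seconds between attempts. Prints the host name and returns 0
# on success; prints an error to stderr and returns 1 if every attempt fails.
resolve_hostname() {
    attempts=$1
    delay=$2
    i=1
    while [ "$i" -le "$attempts" ]; do
        if name=$(fetch_hostname) && [ -n "$name" ]; then
            printf '%s\n' "$name"
            return 0
        fi
        if [ "$i" -lt "$attempts" ]; then
            sleep "$delay"
        fi
        i=$((i + 1))
    done
    echo "meta-data query failed after $attempts attempts" >&2
    return 1
}
```

A job could then call `resolve_hostname 12 10` and abort on a non-zero exit status rather than continue as ci-oc-slave. The retry count and delay are illustrative values, not tuned ones.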

tags: added: ci
information type: Proprietary → Public
tags: added: openstack
tags: added: keystone
Nischal Sheth (nsheth)
Changed in juniperopenstack:
status: New → Incomplete