CI failing on keystone connection error, keepalived issue

Bug #1464719 reported by Ben Nemec
This bug affects 1 person
Affects: tripleo
Status: Fix Released
Importance: Critical
Assigned to: Dan Prince
Milestone: none listed

Bug Description

All of our puppet jobs recently seem to have failed on a Keystone connection error during init-keystone. I haven't been able to find anything in the logs to indicate a problem with Keystone (/var/log/keystone looks clean, haproxy thinks it was up the whole time), but it appears to be happening consistently.

2015-06-12 14:20:10.899 | Traceback (most recent call last):
2015-06-12 14:20:10.901 | File "/opt/stack/new//tripleo-incubator/scripts/init-keystone", line 11, in <module>
2015-06-12 14:20:10.902 | sys.exit(main())
2015-06-12 14:20:10.903 | File "/opt/stack/new/tripleo-incubator/openstack-tools/lib/python2.7/site-packages/os_cloud_config/cmd/init_keystone.py", line 88, in main
2015-06-12 14:20:10.940 | args.user, args.timeout, args.pollinterval, args.pkisetup)
2015-06-12 14:20:10.941 | File "/opt/stack/new/tripleo-incubator/openstack-tools/lib/python2.7/site-packages/os_cloud_config/keystone.py", line 135, in initialize
2015-06-12 14:20:10.967 | _create_tenants(keystone_v2)
2015-06-12 14:20:10.967 | File "/opt/stack/new/tripleo-incubator/openstack-tools/lib/python2.7/site-packages/os_cloud_config/keystone.py", line 440, in _create_tenants
2015-06-12 14:20:10.968 | _create_tenant(keystone, 'admin')
2015-06-12 14:20:10.969 | File "/opt/stack/new/tripleo-incubator/openstack-tools/lib/python2.7/site-packages/os_cloud_config/keystone.py", line 215, in _create_tenant
2015-06-12 14:20:10.970 | tenants = keystone.tenants.findall(name=name)
2015-06-12 14:20:10.971 | File "/opt/stack/new/tripleo-incubator/openstack-tools/lib/python2.7/site-packages/keystoneclient/base.py", line 263, in findall
2015-06-12 14:20:11.007 | for obj in self.list():
2015-06-12 14:20:11.008 | File "/opt/stack/new/tripleo-incubator/openstack-tools/lib/python2.7/site-packages/keystoneclient/v2_0/tenants.py", line 123, in list
2015-06-12 14:20:11.010 | tenant_list = self._list('/tenants%s' % query, 'tenants')
2015-06-12 14:20:11.011 | File "/opt/stack/new/tripleo-incubator/openstack-tools/lib/python2.7/site-packages/keystoneclient/base.py", line 113, in _list
2015-06-12 14:20:11.012 | resp, body = self.client.get(url, **kwargs)
2015-06-12 14:20:11.013 | File "/opt/stack/new/tripleo-incubator/openstack-tools/lib/python2.7/site-packages/keystoneclient/adapter.py", line 170, in get
2015-06-12 14:20:11.015 | return self.request(url, 'GET', **kwargs)
2015-06-12 14:20:11.015 | File "/opt/stack/new/tripleo-incubator/openstack-tools/lib/python2.7/site-packages/keystoneclient/adapter.py", line 206, in request
2015-06-12 14:20:11.016 | resp = super(LegacyJsonAdapter, self).request(*args, **kwargs)
2015-06-12 14:20:11.017 | File "/opt/stack/new/tripleo-incubator/openstack-tools/lib/python2.7/site-packages/keystoneclient/adapter.py", line 95, in request
2015-06-12 14:20:11.018 | return self.session.request(url, method, **kwargs)
2015-06-12 14:20:11.018 | File "/opt/stack/new/tripleo-incubator/openstack-tools/lib/python2.7/site-packages/keystoneclient/utils.py", line 336, in inner
2015-06-12 14:20:11.019 | return func(*args, **kwargs)
2015-06-12 14:20:11.020 | File "/opt/stack/new/tripleo-incubator/openstack-tools/lib/python2.7/site-packages/keystoneclient/session.py", line 382, in request
2015-06-12 14:20:11.021 | resp = send(**kwargs)
2015-06-12 14:20:11.021 | File "/opt/stack/new/tripleo-incubator/openstack-tools/lib/python2.7/site-packages/keystoneclient/session.py", line 426, in _send_request
2015-06-12 14:20:11.032 | raise exceptions.ConnectionRefused(msg)
2015-06-12 14:20:11.034 | keystoneclient.openstack.common.apiclient.exceptions.ConnectionRefused: Unable to establish connection to http://192.0.2.3:35357/v2.0/tenants
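
Since the Keystone and haproxy logs look clean, one way to narrow this down is to probe the endpoint at the TCP level, independent of keystoneclient. The following is a minimal sketch, not part of init-keystone or os-cloud-config: the helper name `probe_tcp` is hypothetical, and it is written for Python 3 (the CI environment in the traceback runs Python 2.7). The host and port are taken from the traceback above.

```python
import socket


def probe_tcp(host, port, timeout=3.0):
    """Classify a TCP endpoint as 'open', 'refused', or 'unreachable'.

    Hypothetical helper for illustration only.
    """
    try:
        # create_connection performs a full TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return "open"
    except ConnectionRefusedError:
        # Something answered with a reset: the address is reachable,
        # but nothing is listening on that port.
        return "refused"
    except OSError:
        # Timeouts, no route to host, and similar network-level failures.
        return "unreachable"


if __name__ == "__main__":
    # Address and port of the endpoint that failed in the traceback.
    print(probe_tcp("192.0.2.3", 35357))
```

A result other than "open" while the service logs stay clean would suggest looking at the layer in front of the service (the VIP or load balancer) rather than at the service itself.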

Tags: ci
Dan Prince (dan-prince) wrote :

Going to try reverting a few puppet-keystone commits and see if that helps...

Changed in tripleo:
assignee: nobody → Dan Prince (dan-prince)
Dan Prince (dan-prince) wrote :

There were a couple of suspicious puppet-keystone and puppet-openstacklib commits around keystone v3 functionality, but nothing that would cause this.

Looking into keepalived now, actually. It looks like the Keystone failure is just a red herring here. What might actually be wrong is that Fedora has a bad keepalived update.

summary: - CI failing on keystone connection error
+ CI failing on keystone connection error, bad keepalived update!
Dan Prince (dan-prince) wrote : Re: CI failing on keystone connection error, bad keepalived update!

The symptom I was seeing was that keepalived was only bringing up one VIP on the br-ex device, so the "public" interface was just plain down (unpingable, etc.).

Tried using keepalived-1.2.16-1.fc21.x86_64.rpm locally and it fixed all the issues.

Also, both the puppet and non-puppet versions of CI are failing here, so an issue with keepalived makes sense.
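
The "only one VIP on br-ex" symptom can be checked mechanically by comparing the addresses actually configured on the device against the VIPs keepalived is expected to hold. This is a rough sketch under stated assumptions: the function names are hypothetical, the sample text only imitates the general shape of `ip -o -4 addr show dev br-ex` output and is not taken from the failing CI nodes, and 198.51.100.10 is a made-up stand-in for the public VIP (only 192.0.2.3 appears in this report).

```python
import re

# Illustrative text only, NOT captured from the failing CI environment.
SAMPLE_IP_OUTPUT = (
    "6: br-ex    inet 192.0.2.3/32 scope global br-ex"
    "\\       valid_lft forever preferred_lft forever\n"
)


def addresses_on_device(ip_addr_output):
    """Return the set of IPv4 addresses found in `ip addr` style output."""
    return set(re.findall(r"\binet (\d+\.\d+\.\d+\.\d+)/", ip_addr_output))


def missing_vips(ip_addr_output, expected_vips):
    """Return the expected VIPs that are not configured on the device."""
    return sorted(set(expected_vips) - addresses_on_device(ip_addr_output))


if __name__ == "__main__":
    # 198.51.100.10 is a hypothetical public VIP used for illustration.
    expected = ["192.0.2.3", "198.51.100.10"]
    print(missing_vips(SAMPLE_IP_OUTPUT, expected))  # ['198.51.100.10']
```

On a real node the input would come from running the `ip` command on the controller; a non-empty result would match the symptom described above.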

Two patches in play:

DIB fix: https://review.openstack.org/#/c/191536/

TripleO CI cherry pick: https://review.openstack.org/#/c/191537/

Dan Prince (dan-prince)
summary: - CI failing on keystone connection error, bad keepalived update!
+ CI failing on keystone connection error, keepalived issue
Dan Prince (dan-prince) wrote :

The keepalived issue has been fixed in the distro. Closing.

Changed in tripleo:
status: Triaged → Fix Released