keystone error during controller install

Bug #1190043 reported by Chris Ricker
This bug affects 1 person
Affects          Status        Importance  Assigned to    Milestone
Cisco Openstack  Fix Released  High        Chris Ricker
Grizzly          Fix Released  High        Chris Ricker

Bug Description

I get this on the first Puppet run on my controller when doing a fresh Grizzly install:

notice: /Service[keystone]: Triggered 'refresh' from 17 events
err: /Stage[main]/Keystone::Roles::Admin/Keystone_role[_member_]: Could not evaluate: Execution of '/usr/bin/keystone --endpoint http://127.0.0.1:35357/v2.0/ role-list' returned 1: <attribute 'message' of 'exceptions.BaseException' objects> (HTTP Unable to establish connection to http://127.0.0.1:35357/v2.0/OS-KSADM/roles)

After that run finishes, if I source openrc and manually run the command, it works:

 keystone --endpoint http://127.0.0.1:35357/v2.0/ role-list
WARNING: Bypassing authentication using a token & endpoint (authentication credentials are being ignored).
+----------------------------------+----------+
| id | name |
+----------------------------------+----------+
| f7d5f3e82ba34b139feffd62bde018d3 | Member |
| 9fe2ff9ee4384b1894a90878d3e92bab | _member_ |
| c59022c87d59469080d7c825acf88dbf | admin |
+----------------------------------+----------+

If I run Puppet a second time, I get more keystone errors:

info: /Stage[main]/Keystone/Exec[keystone-manage db_sync]: Scheduling refresh of Service[keystone]
notice: /Service[keystone]: Triggered 'refresh' from 1 events
err: /Stage[main]/Keystone::Roles::Admin/Keystone_role[_member_]: Could not evaluate: Execution of '/usr/bin/keystone --endpoint http://127.0.0.1:35357/v2.0/ role-list' returned 1: <attribute 'message' of 'exceptions.BaseException' objects> (HTTP Unable to establish connection to http://127.0.0.1:35357/v2.0/OS-KSADM/roles)
WARNING: Bypassing authentication using a token & endpoint (authentication credentials are being ignored).

err: /Stage[main]/Nova::Keystone::Auth/Keystone_user[nova]: Could not evaluate: Execution of '/usr/bin/keystone --os-auth-url http://127.0.0.1:35357/v2.0/ token-get' returned 1: Configuration error: Client configured to run without a service catalog. Run the client using --os-auth-url or OS_AUTH_URL, instead of --os-endpoint or OS_SERVICE_ENDPOINT, for example.
WARNING: Bypassing authentication using a token & endpoint (authentication credentials are being ignored).

err: /Stage[main]/Glance::Keystone::Auth/Keystone_user[glance]: Could not evaluate: Execution of '/usr/bin/keystone --os-auth-url http://127.0.0.1:35357/v2.0/ token-get' returned 1: Configuration error: Client configured to run without a service catalog. Run the client using --os-auth-url or OS_AUTH_URL, instead of --os-endpoint or OS_SERVICE_ENDPOINT, for example.
WARNING: Bypassing authentication using a token & endpoint (authentication credentials are being ignored).

(repeated for every service)

Revision history for this message
Chris Ricker (chris-ricker) wrote :

After multiple runs, this goes away.

It appears to be a race between the keystone restart and the keystone *-list commands that run immediately after the restart is issued.
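
One generic way to guard against that race would be to poll the admin endpoint until it accepts connections before running any *-list commands. A minimal Ruby sketch of the idea (a hypothetical helper, not part of the module; the endpoint URL and retry counts are assumptions):

  require 'net/http'
  require 'uri'

  # Poll the keystone admin endpoint until it responds, or give up
  # after `attempts` tries. Returns true once any HTTP response arrives.
  def wait_for_keystone(url = 'http://127.0.0.1:35357/v2.0/', attempts = 10)
    uri = URI.parse(url)
    attempts.times do
      begin
        Net::HTTP.get_response(uri)
        return true
      rescue Errno::ECONNREFUSED, SocketError, Timeout::Error
        sleep 2  # service not accepting connections yet; wait and retry
      end
    end
    false
  end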

Revision history for this message
Chris Ricker (chris-ricker) wrote :

What I'm seeing currently is:

first run:

err: /Stage[main]/Keystone::Roles::Admin/Keystone_role[_member_]: Could not evaluate: Execution of '/usr/bin/keystone --endpoint http://127.0.0.1:35357/v2.0/ role-list' returned 1: <attribute 'message' of 'exceptions.BaseException' objects> (HTTP Unable to establish connection to http://127.0.0.1:35357/v2.0/OS-KSADM/roles)

second run:

err: /Stage[main]/Keystone::Roles::Admin/Keystone_role[_member_]: Could not evaluate: Execution of '/usr/bin/keystone --endpoint http://127.0.0.1:35357/v2.0/ role-list' returned 1: <attribute 'message' of 'exceptions.BaseException' objects> (HTTP Unable to establish connection to http://127.0.0.1:35357/v2.0/OS-KSADM/roles)

third run:

all good

Revision history for this message
Dan Bode (bodepd) wrote :

What version of the keystone package is this (is it the latest update)?

Can you recreate it with the --debug and --trace options in Puppet?

As for conditions where the keystone service is not ready, the following code attempts to catch known exceptions, sleep for 10 seconds, and retry:

  https://github.com/stackforge/puppet-keystone/blob/master/lib/puppet/provider/keystone.rb#L77
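
For reference, the retry wrapper there looks roughly like the following (a sketch reconstructed from the behavior described, not the verbatim provider source; the keystone and admin_endpoint helpers are assumptions):

  # Sketch: run a keystone CLI command, and if the error message matches a
  # known "service not ready" pattern, sleep 10 seconds and retry once.
  # `keystone` wraps /usr/bin/keystone; `admin_endpoint` returns the
  # http://127.0.0.1:35357/v2.0/ admin URL.
  def self.auth_keystone(*args)
    keystone('--endpoint', admin_endpoint, *args)
  rescue Exception => e
    if (e.message =~ /\[Errno 111\] Connection refused/) or
       (e.message =~ /\(HTTP 400\)/)
      sleep 10
      keystone('--endpoint', admin_endpoint, *args)
    else
      raise e
    end
  end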

I am really surprised that this is so reproducible, because tons of people use this code upstream and no patch or ticket has been submitted for it.

Revision history for this message
Dan Bode (bodepd) wrote :

On the first run, do other keystone resources get created successfully before that failure (indicating that the keystone server is up, but just not functioning correctly)?

Also, can you show me the full output for the second run?

Revision history for this message
Chris Ricker (chris-ricker) wrote :

The keystone package is 2013.1.3.a1.g4b22c02-25-cisco1.

Looking at them, the logs are apparently not so useful: what was logged to syslog differs from what was on the screen... I'll try to recapture.

Revision history for this message
Chris Ricker (chris-ricker) wrote :

Here's the relevant output from the first run. The first keystone query after the keystone server is restarted fails to connect, but subsequent queries work:

notice: /Stage[main]/Keystone/Keystone_config[DEFAULT/verbose]/ensure: created
info: /Stage[main]/Keystone/Keystone_config[DEFAULT/verbose]: Scheduling refresh of Service[keystone]
debug: /Stage[main]/Keystone/Keystone_config[DEFAULT/verbose]: The container Class[Keystone] will propagate my refresh event
info: /Stage[main]/Keystone/Keystone_config[DEFAULT/verbose]: Scheduling refresh of Exec[keystone-manage db_sync]
debug: Exec[keystone-manage db_sync](provider=posix): Executing 'keystone-manage db_sync'
debug: Executing 'keystone-manage db_sync'
notice: /Stage[main]/Keystone/Exec[keystone-manage db_sync]: Triggered 'refresh' from 12 events
info: /Stage[main]/Keystone/Exec[keystone-manage db_sync]: Scheduling refresh of Service[keystone]
debug: /Stage[main]/Keystone/Exec[keystone-manage db_sync]: The container Class[Keystone] will propagate my refresh event
debug: Puppet::Type::Service::ProviderUpstart: Executing '/sbin/status keystone'
debug: Puppet::Type::Service::ProviderUpstart: Executing '/sbin/status keystone'
debug: Service[keystone](provider=upstart): Executing '/sbin/restart keystone'
notice: /Service[keystone]: Triggered 'refresh' from 17 events
debug: /Service[keystone]: The container Class[Keystone] will propagate my refresh event
debug: Prefetching keystone resources for keystone_role
debug: Puppet::Type::Keystone_role::ProviderKeystone: Executing '/usr/bin/keystone --endpoint http://127.0.0.1:35357/v2.0/ role-list'
err: /Stage[main]/Keystone::Roles::Admin/Keystone_role[_member_]: Could not evaluate: Execution of '/usr/bin/keystone --endpoint http://127.0.0.1:35357/v2.0/ role-list' returned 1: <attribute 'message' of 'exceptions.BaseException' objects> (HTTP Unable to establish connection to http://127.0.0.1:35357/v2.0/OS-KSADM/roles)

debug: Class[Keystone]: The container Stage[main] will propagate my refresh event
debug: Prefetching keystone resources for keystone_tenant
debug: #<Class:0x7f6466665680>: Executing '/usr/bin/keystone --endpoint http://127.0.0.1:35357/v2.0/ tenant-list'
debug: #<Class:0x7f6466665680>: Executing '/usr/bin/keystone --endpoint http://127.0.0.1:35357/v2.0/ tenant-create --name admin --enabled True --description admin tenant'
notice: /Stage[main]/Keystone::Roles::Admin/Keystone_tenant[admin]/ensure: created
debug: /Stage[main]/Keystone::Roles::Admin/Keystone_tenant[admin]: The container Class[Keystone::Roles::Admin] will propagate my refresh event
debug: Puppet::Type::Keystone_role::ProviderKeystone: Executing '/usr/bin/keystone --endpoint http://127.0.0.1:35357/v2.0/ role-list'
debug: Puppet::Type::Keystone_role::ProviderKeystone: Executing '/usr/bin/keystone --endpoint http://127.0.0.1:35357/v2.0/ role-create --name Member'
notice: /Stage[main]/Keystone::Roles::Admin/Keystone_role[Member]/ensure: created
debug: /Stage[main]/Keystone::Roles::Admin/Keystone_role[Member]: The container Class[Keystone::Roles::Admin] will propagate my refresh event
debug: Prefetching keystone resources for keystone_service
debug: Puppet::Type::Key...


Revision history for this message
Chris Ricker (chris-ricker) wrote :

The console logs (just attached) show the error the first two times after the service restarts, but the third run completes without error.

Changed in openstack-cisco:
importance: Undecided → High
status: New → Triaged
assignee: nobody → Chris Ricker (chris-ricker)
Revision history for this message
Dan Bode (bodepd) wrote :

I had a quick look at the logs:

1. The failure on the second run is somewhat misleading: it failed because the keystone service was restarted again on that run. Ideally, Puppet should do nothing on the second run, but for some reason it is still configuring things, which leads to the keystone service restart.

2. The command seems to fail because it is the first command that runs after the keystone service restart.

3. A fix would be easy enough:

The following code is supposed to handle the case where keystone is not ready:

https://github.com/stackforge/puppet-keystone/blob/master/lib/puppet/provider/keystone.rb#L77

We have identified two cases which indicate the server is not ready:

   (e.message =~ /\[Errno 111\] Connection refused/) or (e.message =~ /\(HTTP 400\)/)

It seems we are now running into a third case, with this new message:

    HTTP Unable to establish connection to http://127.0.0.1:35357/v2.0/OS-KSADM/roles

so you can fix it by extending the condition to:

  (e.message =~ /\[Errno 111\] Connection refused/) or (e.message =~ /\(HTTP 400\)/) or
  (e.message =~ /HTTP Unable to establish connection/)
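
In context, the amended rescue clause would read roughly like this (extending the sketch earlier in the thread; the surrounding helper names are still assumptions):

  rescue Exception => e
    # Third "service not ready" signature, seen with newer keystone clients:
    if (e.message =~ /\[Errno 111\] Connection refused/) or
       (e.message =~ /\(HTTP 400\)/) or
       (e.message =~ /HTTP Unable to establish connection/)
      sleep 10
      keystone('--endpoint', admin_endpoint, *args)
    else
      raise e
    end
  end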

Even if this resolves the issue, it raises a bigger question: why are other folks not seeing the same issue with the upstream modules and package repos? Is there an easy way to see the diffs between our packages and the upstream packages? (Sorry if it's an easy question; I still have not wrapped my head around our packages.)

Revision history for this message
Chris Ricker (chris-ricker) wrote :

I've pushed https://review.openstack.org/33268 for review upstream

In testing here, it resolves the issue.

I suspect the reason others haven't reported this is that we're using a newish keystone client (0.2.5, about two weeks old). The 'Connection refused' message appears to be what older clients returned.

Revision history for this message
Chris Ricker (chris-ricker) wrote :

Commit dca30fbb936dfaca7f5f10ab34a67e6460b3b235 (the fix for Bug #1163678) appears to be the relevant client-side change.

Revision history for this message
Dan Bode (bodepd) wrote :

Why are we using a newer version of the keystone client than upstream? I did some digging, and this new error message shows up in version 0.2.4.

The current upstream packages for both Grizzly
(http://reqorts.qa.ubuntu.com/reports/ubuntu-server/cloud-archive/grizzly_versions.html)
and Havana
(http://reqorts.qa.ubuntu.com/reports/ubuntu-server/cloud-archive/havana_versions.html)
are using 0.2.3. Is there a specific reason we are using 0.2.4?

Revision history for this message
Chris Ricker (chris-ricker) wrote :

It was updated to pick up fixes for security advisories.

Changed in openstack-cisco:
status: Triaged → Fix Committed
Revision history for this message
Mark T. Voelker (mvoelker) wrote :

@bodepd: Namely OSSA 2013-014, if my memory is correct.
