Unable to restart c-vol services when using multi-backend due to set_voldb_empty_at_startup_indicator

Bug #1424104 reported by John Griffith
20
This bug affects 4 people
Affects Status Importance Assigned to Milestone
Cinder
New
Undecided
Michal Dulko

Bug Description

When attempting to restart c-vol services in setups that have multi-backend enabled we encounter an unhandled exception from the DB causing the service to die:

2015-02-20 23:10:10.924 10176 CRITICAL cinder [req-c867de6c-f15f-4fc4-beb8-80e0c6e14be0 - - - - -] DBError: This result object does not return rows. It has been closed automatically.
2015-02-20 23:10:10.924 10176 TRACE cinder Traceback (most recent call last):
2015-02-20 23:10:10.924 10176 TRACE cinder File "/usr/local/bin/cinder-volume", line 9, in <module>
2015-02-20 23:10:10.924 10176 TRACE cinder load_entry_point('cinder==2015.1.dev103', 'console_scripts', 'cinder-volume')()
2015-02-20 23:10:10.924 10176 TRACE cinder File "/opt/stack/cinder/cinder/cmd/volume.py", line 69, in main
2015-02-20 23:10:10.924 10176 TRACE cinder binary='cinder-volume')
2015-02-20 23:10:10.924 10176 TRACE cinder File "/opt/stack/cinder/cinder/service.py", line 243, in create
2015-02-20 23:10:10.924 10176 TRACE cinder service_name=service_name)
2015-02-20 23:10:10.924 10176 TRACE cinder File "/opt/stack/cinder/cinder/service.py", line 127, in __init__
2015-02-20 23:10:10.924 10176 TRACE cinder *args, **kwargs)
2015-02-20 23:10:10.924 10176 TRACE cinder File "/opt/stack/cinder/cinder/volume/manager.py", line 188, in __init__
2015-02-20 23:10:10.924 10176 TRACE cinder context.get_admin_context())
2015-02-20 23:10:10.924 10176 TRACE cinder File "/opt/stack/cinder/cinder/volume/manager.py", line 262, in _set_voldb_empty_at_startup_indicator
2015-02-20 23:10:10.924 10176 TRACE cinder None, filters=None)
2015-02-20 23:10:10.924 10176 TRACE cinder File "/opt/stack/cinder/cinder/db/api.py", line 186, in volume_get_all
2015-02-20 23:10:10.924 10176 TRACE cinder filters=filters)
2015-02-20 23:10:10.924 10176 TRACE cinder File "/opt/stack/cinder/cinder/db/sqlalchemy/api.py", line 160, in wrapper
2015-02-20 23:10:10.924 10176 TRACE cinder return f(*args, **kwargs)
2015-02-20 23:10:10.924 10176 TRACE cinder File "/opt/stack/cinder/cinder/db/sqlalchemy/api.py", line 1211, in volume_get_all
2015-02-20 23:10:10.924 10176 TRACE cinder return query.all()
2015-02-20 23:10:10.924 10176 TRACE cinder File "/usr/local/lib/python2.7/dist-packages/sqlalchemy/orm/query.py", line 2320, in all
2015-02-20 23:10:10.924 10176 TRACE cinder return list(self)
2015-02-20 23:10:10.924 10176 TRACE cinder File "/usr/local/lib/python2.7/dist-packages/sqlalchemy/orm/loading.py", line 66, in instances
2015-02-20 23:10:10.924 10176 TRACE cinder fetch = cursor.fetchall()
2015-02-20 23:10:10.924 10176 TRACE cinder File "/usr/local/lib/python2.7/dist-packages/sqlalchemy/engine/result.py", line 787, in fetchall
2015-02-20 23:10:10.924 10176 TRACE cinder self.cursor, self.context)
2015-02-20 23:10:10.924 10176 TRACE cinder File "/usr/local/lib/python2.7/dist-packages/sqlalchemy/engine/base.py", line 1155, in _handle_dbapi_exception
2015-02-20 23:10:10.924 10176 TRACE cinder util.raise_from_cause(newraise, exc_info)
2015-02-20 23:10:10.924 10176 TRACE cinder File "/usr/local/lib/python2.7/dist-packages/sqlalchemy/util/compat.py", line 199, in raise_from_cause
2015-02-20 23:10:10.924 10176 TRACE cinder reraise(type(exception), exception, tb=exc_tb)
2015-02-20 23:10:10.924 10176 TRACE cinder File "/usr/local/lib/python2.7/dist-packages/sqlalchemy/engine/result.py", line 781, in fetchall
2015-02-20 23:10:10.924 10176 TRACE cinder l = self.process_rows(self._fetchall_impl())
2015-02-20 23:10:10.924 10176 TRACE cinder File "/usr/local/lib/python2.7/dist-packages/sqlalchemy/engine/result.py", line 750, in _fetchall_impl
2015-02-20 23:10:10.924 10176 TRACE cinder self._non_result()
2015-02-20 23:10:10.924 10176 TRACE cinder File "/usr/local/lib/python2.7/dist-packages/sqlalchemy/engine/result.py", line 755, in _non_result
2015-02-20 23:10:10.924 10176 TRACE cinder "This result object does not return rows. "
2015-02-20 23:10:10.924 10176 TRACE cinder DBError: This result object does not return rows. It has been closed automatically.
2015-02-20 23:10:10.924 10176 TRACE cinder

It appears that this was introduced by the NFS empty startup check made here:
    Commit: 6879bd0720b2c4c5ef4d2f2c42fe0e4e436ba998

It looks like there's a timing issue where the startup of the second backend makes the call out to the DB? Not sure, but it also seems like some systems I hit it, others I don't and I'm not clear on what the actual cause is or how to reliably reproduce it.

We should see if we can figure out what's going on here, if nothing else maybe wrap the call in a try/catch and I guess return False on exception (or wait and retry seems to work well too).

Revision history for this message
John Griffith (john-griffith) wrote :

To clarify, I suspect this is related to how we do multi-backend, we have two threads and I believe they are sharing the same DB session, so this becomes a contention issue; I'm not sure but I think that it might be something along those lines.

Changed in cinder:
assignee: nobody → Michal Dulko (michal-dulko-f)
Revision history for this message
Michal Dulko (michal-dulko-f) wrote :

I was able to reproduce this before https://review.openstack.org/#/c/163551 got merged. Errors like that were mixed with those like in https://bugs.launchpad.net/cinder/+bug/1417018. I assume these bugs were related and this is fixed now.

Revision history for this message
Tom Swanson (tom-swanson) wrote :

For me this went away after https://review.openstack.org/#/c/163551 so I'm assuming this is fixed.

To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.