oslo.db

CLI will fail one time after restarting DB

Bug #1389985 reported by Song Li on 2014-11-06

This bug report is a duplicate of: Bug #1374497: change in oslo.db "ping" handling is causing issues in projects that are not using transactions.

This bug affects 1 person

Affects	Status	Importance	Assigned to	Milestone
Ceilometer	Fix Committed	Undecided	Lan Qi song
Juno	Fix Committed	Undecided	Unassigned	Ceilometer 2014.2.4
Glance	In Progress	Undecided	Louis Taylor
OpenStack Compute (nova)	Incomplete	Undecided	Unassigned
OpenStack Identity (keystone)	Incomplete	Undecided	Unassigned
oslo.db	New	Undecided	Unassigned

Bug Description

After the database is restarted, the first command fails. For example:
restart the database and wait a few minutes.
Then run heat stack-list; the result looks like this:

ERROR: Remote error: DBConnectionError (OperationalError) ibm_db_dbi::OperationalError: SQLNumResultCols failed: [IBM][CLI Driver] SQL30081N A communication error has been detected. Communication protocol being used: "TCP/IP". Communication API being used: "SOCKETS". Location where the error was detected: "10.11.1.14". Communication function detecting the error: "send". Protocol specific error code(s): "2", "*", "*". SQLSTATE=08001 SQLCODE=-30081 'SELECT stack.status_reason AS stack_status_reason, stack.created_at AS stack_created_at, stack.deleted_at AS stack_deleted_at, stack.action AS stack_action, stack.status AS stack_status, stack.id AS stack_id, stack.name AS stack_name, stack.raw_template_id AS stack_raw_template_id, stack.username AS stack_username, stack.tenant AS stack_tenant, stack.parameters AS stack_parameters, stack.user_creds_id AS stack_user_creds_id, stack.owner_id AS stack_owner_id, stack.timeout AS stack_timeout, stack.disable_rollback AS stack_disable_rollback, stack.stack_user_project_id AS stack_stack_user_project_id, stack.backup AS stack_backup, stack.updated_at AS stack_updated_at \nFROM stack \nWHERE stack.deleted_at IS NULL AND stack.owner_id IS NULL AND stack.tenant = ? ORDER BY stack.created_at DESC, stack.id DESC' ('a3a14c6f82bd4ce88273822407a0829b',)
Traceback (most recent call last):
  File "/usr/lib/python2.6/site-packages/oslo/messaging/rpc/dispatcher.py", line 134, in _dispatch_and_reply
    incoming.message))
  File "/usr/lib/python2.6/site-packages/oslo/messaging/rpc/dispatcher.py", line 177, in _dispatch
    return self._do_dispatch(endpoint, method, ctxt, args)
  File "/usr/lib/python2.6/site-packages/oslo/messaging/rpc/dispatcher.py", line 123, in _do_dispatch
    result = getattr(endpoint, method)(ctxt, **new_args)
  File "/usr/lib/python2.6/site-packages/heat/engine/service.py", line 69, in wrapped
    return func(self, ctx, *args, **kwargs)
  File "/usr/lib/python2.6/site-packages/heat/engine/service.py", line 490, in list_stacks
    return [api.format_stack(stack) for stack in stacks]
  File "/usr/lib/python2.6/site-packages/heat/engine/stack.py", line 264, in load_all
    show_deleted, show_nested) or []
  File "/usr/lib/python2.6/site-packages/heat/db/api.py", line 130, in stack_get_all
    show_deleted, show_nested)
  File "/usr/lib/python2.6/site-packages/heat/db/sqlalchemy/api.py", line 368, in stack_get_all
    marker, sort_dir, filters).all()
  File "/usr/lib64/python2.6/site-packages/sqlalchemy/orm/query.py", line 2241, in all
    return list(self)
  File "/usr/lib64/python2.6/site-packages/sqlalchemy/orm/query.py", line 2353, in __iter__
    return self._execute_and_instances(context)
  File "/usr/lib64/python2.6/site-packages/sqlalchemy/orm/query.py", line 2368, in _execute_and_instances
    result = conn.execute(querycontext.statement, self._params)
  File "/usr/lib64/python2.6/site-packages/sqlalchemy/engine/base.py", line 662, in execute
    params)
  File "/usr/lib64/python2.6/site-packages/sqlalchemy/engine/base.py", line 761, in _execute_clauseelement
    compiled_sql, distilled_params
  File "/usr/lib64/python2.6/site-packages/sqlalchemy/engine/base.py", line 874, in _execute_context
    context)
  File "/usr/lib/python2.6/site-packages/oslo/db/sqlalchemy/compat/handle_error.py", line 125, in _handle_dbapi_exception
    six.reraise(type(newraise), newraise, sys.exc_info()[2])
  File "/usr/lib/python2.6/site-packages/oslo/db/sqlalchemy/compat/handle_error.py", line 102, in _handle_dbapi_exception
    per_fn = fn(ctx)
  File "/usr/lib/python2.6/site-packages/oslo/db/sqlalchemy/exc_filters.py", line 323, in handler
    context.is_disconnect)
  File "/usr/lib/python2.6/site-packages/oslo/db/sqlalchemy/exc_filters.py", line 263, in _is_db_connection_error
    raise exception.DBConnectionError(operational_error)
DBConnectionError: (OperationalError) ibm_db_dbi::OperationalError: SQLNumResultCols failed: [IBM][CLI Driver] SQL30081N A communication error has been detected. Communication protocol being used: "TCP/IP". Communication API being used: "SOCKETS". Location where the error was detected: "10.11.1.14". Communication function detecting the error: "send". Protocol specific error code(s): "2", "*", "*". SQLSTATE=08001 SQLCODE=-30081 'SELECT stack.status_reason AS stack_status_reason, stack.created_at AS stack_created_at, stack.deleted_at AS stack_deleted_at, stack.action AS stack_action, stack.status AS stack_status, stack.id AS stack_id, stack.name AS stack_name, stack.raw_template_id AS stack_raw_template_id, stack.username AS stack_username, stack.tenant AS stack_tenant, stack.parameters AS stack_parameters, stack.user_creds_id AS stack_user_creds_id, stack.owner_id AS stack_owner_id, stack.timeout AS stack_timeout, stack.disable_rollback AS stack_disable_rollback, stack.stack_user_project_id AS stack_stack_user_project_id, stack.backup AS stack_backup, stack.updated_at AS stack_updated_at \nFROM stack \nWHERE stack.deleted_at IS NULL AND stack.owner_id IS NULL AND stack.tenant = ? ORDER BY stack.created_at DESC, stack.id DESC' ('a3a14c6f82bd4ce88273822407a0829b',)

Running heat stack-list (or any other command) again then succeeds.

Tags:

Clint Byrum (clint-fewbar) wrote on 2014-11-12:

Seems like this is an oslo.db issue, not specific to any of its consumers.

Song Li (lisong-cruise) wrote on 2014-11-12:

Yes, thanks Clint. I will check as soon as possible whether this is an oslo.db issue and, if so, move the bug to oslo. Thanks.

Ai Jie Niu (niuaj) wrote on 2014-11-12:

Hi Clint, yes. I think that when oslo.db loses its connection to DB2 it cannot reconnect automatically; instead it reports an error the first time it finds that the connection is lost.
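
One way to avoid surfacing that first error is to "ping" a connection before handing it to the caller and to replace it if the ping fails. The following is only a minimal sketch of that idea, not oslo.db's implementation: PingingConnectionHolder is a hypothetical helper, and the standard-library sqlite3 module stands in for DB2, with a closed connection simulating the dropped one.

```python
import sqlite3


class PingingConnectionHolder:
    """Hypothetical helper: hands out a connection only after a liveness check."""

    def __init__(self, connect):
        self._connect = connect  # zero-argument factory returning a connection
        self._conn = connect()
        self.reconnects = 0

    def checkout(self):
        try:
            # "Ping": a trivial statement that fails fast on a dead connection.
            self._conn.execute("SELECT 1").fetchone()
        except sqlite3.Error:
            # Stale connection: replace it before the caller's real query runs.
            self._conn = self._connect()
            self.reconnects += 1
        return self._conn


holder = PingingConnectionHolder(lambda: sqlite3.connect(":memory:"))
holder.checkout().close()  # simulate the database dropping the connection
row = holder.checkout().execute("SELECT 42").fetchone()
```

The caller's query runs on a fresh connection and never sees the failure; the cost is one extra round trip per checkout.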

Louis Taylor (kragniz) wrote on 2014-11-12:

There is a patch under review to fix this in glance: https://review.openstack.org/#/c/122114/

This could be the wrong approach to fixing the problem, but it just extends the current method of handling deadlocks to also deal with connection errors.
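
As a rough illustration of that approach (not the code in the glance review), a retry decorator written for deadlocks can be widened to also catch connection errors. Everything here is an assumption made for the sketch: the exception classes are local stand-ins, and retry_on and stack_list are hypothetical names.

```python
import functools
import time


class DBDeadlock(Exception):
    """Stand-in for a deadlock error (defined here for illustration only)."""


class DBConnectionError(Exception):
    """Stand-in for the connection error shown in the bug description."""


def retry_on(exceptions, max_retries=3, delay=0.0):
    """Re-run the wrapped call when one of `exceptions` is raised."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            for attempt in range(max_retries + 1):
                try:
                    return func(*args, **kwargs)
                except exceptions:
                    if attempt == max_retries:
                        raise  # out of retries: let the caller see the error
                    time.sleep(delay)
        return wrapper
    return decorator


calls = {"n": 0}


# Deadlock-only handling would list just DBDeadlock in the tuple.
@retry_on((DBDeadlock, DBConnectionError))
def stack_list():
    calls["n"] += 1
    if calls["n"] == 1:
        raise DBConnectionError("communication error")  # first call after restart
    return ["stack-a"]


result = stack_list()
```

Retrying like this is only safe when the wrapped call can be repeated as a whole, for example a read or a complete transaction.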

Changed in glance:
status:	New → Confirmed
assignee:	nobody → Louis Taylor (kragniz)
status:	Confirmed → In Progress

Song Li (lisong-cruise) on 2014-11-14

Changed in heat:
assignee:	nobody → Song Li (lisong-cruise)

Morgan Fainberg (mdrnstm) wrote on 2014-11-14:

Wasn't this issue already addressed in oslo.db? This looks an awful lot like https://bugs.launchpad.net/keystone/+bug/1374497

Changed in keystone:
status:	New → Incomplete

Joe Gordon (jogo) on 2014-11-14

Changed in nova:
status:	New → Incomplete

Song Li (lisong-cruise) wrote on 2014-11-17:

@Morgan Fainberg

Thank you very much for the reminder. I have looked into the issue:
https://bugs.launchpad.net/keystone/+bug/1374497

They are indeed very similar. I will confirm that with the owner and then mark our issue as a duplicate of 1374497.

Thanks again :)

Song Li (lisong-cruise) wrote on 2014-11-17:

I have tried the patch from https://bugs.launchpad.net/keystone/+bug/1374497
It resolves our issue. Thanks :)

Steve Baker (steve-stevebaker) on 2014-11-17

no longer affects:

heat

OpenStack Infra (hudson-openstack) wrote on 2014-11-18: Fix proposed to ceilometer (master)

Fix proposed to branch: master
Review: https://review.openstack.org/135186

Changed in ceilometer:
assignee:	nobody → Lan Qi song (lqslan)
status:	New → In Progress

OpenStack Infra (hudson-openstack) wrote on 2014-12-08: Fix merged to ceilometer (master)

Reviewed: https://review.openstack.org/135186
Committed: https://git.openstack.org/cgit/openstack/ceilometer/commit/?id=8c6841d3c00931204eaba0e9058707629120c1da
Submitter: Jenkins
Branch: master

commit 8c6841d3c00931204eaba0e9058707629120c1da
Author: lqslan <email address hidden>
Date: Tue Nov 18 15:38:51 2014 +0800

Retry to connect database when DB2 or mongodb is restarted

    The patch https://review.openstack.org/#/c/122387 works fine
    with operations with get, record and update functions.
    But an exception would still occur with the operation of
    db.collection.find() function.

    This patch can give some benefit to tolerate DB restart
    with find() function.
    This patch also removes "test_mongo_find" test case since
    it doesn't raise AutoReconnect exception at all.

Change-Id: Ia0474726960ce2b4b611fda0a1c304bb8ad96922
Closes-Bug: #1389985
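
A likely reason find() behaves differently from get, record and update is that a query API of this kind returns a lazy cursor, so the connection error surfaces during iteration rather than at the call itself. The sketch below illustrates that point with stand-ins only; it is not the ceilometer patch. AutoReconnect is a locally defined exception, and FlakyCursor, find and find_all_with_retry are hypothetical names.

```python
class AutoReconnect(Exception):
    """Stand-in for the driver's reconnect exception (defined locally)."""


class FlakyCursor:
    """Fake lazy cursor: creating it never fails, the first fetch may."""

    def __init__(self, rows, state):
        self._rows = iter(rows)
        self._state = state

    def __iter__(self):
        return self

    def __next__(self):
        if self._state["fail_next"]:
            self._state["fail_next"] = False
            raise AutoReconnect("connection lost")
        return next(self._rows)


def find(state):
    # Returns immediately without fetching anything, like a lazy query API.
    return FlakyCursor([1, 2, 3], state)


def find_all_with_retry(state, max_retries=2):
    for attempt in range(max_retries + 1):
        try:
            # The retry must cover iteration, where the error actually surfaces.
            return list(find(state))
        except AutoReconnect:
            if attempt == max_retries:
                raise


state = {"fail_next": True}
cursor = find(state)  # no exception here: nothing has been fetched yet
rows = find_all_with_retry(state)
```

Wrapping only the find() call in a retry would not help in this model, because the call itself succeeds; the failure appears on the first fetch.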

Changed in ceilometer:
status:	In Progress → Fix Committed

OpenStack Infra (hudson-openstack) wrote on 2014-12-09: Fix proposed to ceilometer (stable/juno)

Fix proposed to branch: stable/juno
Review: https://review.openstack.org/140223

OpenStack Infra (hudson-openstack) wrote on 2015-06-12: Fix merged to ceilometer (stable/juno)

Reviewed: https://review.openstack.org/140223
Committed: https://git.openstack.org/cgit/openstack/ceilometer/commit/?id=529b4aaf50d34881ce0869b74688200adf462ea0
Submitter: Jenkins
Branch: stable/juno

commit 529b4aaf50d34881ce0869b74688200adf462ea0
Author: lqslan <email address hidden>
Date: Tue Nov 18 15:38:51 2014 +0800

Retry to connect database when DB2 or mongodb is restarted

    This patch can give some benefit to tolerate DB restart
    with find() function.
    This patch also removes "test_mongo_find" test case since
    it doesn't raise AutoReconnect exception at all.

    Closes-Bug: #1389985
    Change-Id: Ia0474726960ce2b4b611fda0a1c304bb8ad96922
    (cherry-picked from commit 8c6841d3c00931204eaba0e9058707629120c1da)

tags:

added: in-stable-juno
