DBDeadlock exception in sql backend

Bug #1257908 reported by gordon chung
This bug affects 3 people
Affects     Status        Importance  Assigned to   Milestone
Ceilometer  Fix Released  Medium      gordon chung

Bug Description

i see this occasionally.

with a new database, starting devstack (pipeline polling at 20 sec):

/opt/stack/ceilometer/ceilometer/openstack/common/db/sqlalchemy/session.py:521: DeprecationWarning: BaseException.message has been deprecated as of Python 2.6
  m = re.match(operational_error.message)
2013-12-04 15:51:44.522 1198 ERROR ceilometer.dispatcher.database [req-6c0f023d-abde-4c31-9ae9-d951a24930f9 admin None] Failed to record metering data: (OperationalError) (1213, 'Deadlock found when trying to get lock; try restarting transaction') 'UPDATE resource SET resource_metadata=%s, user_id=%s, project_id=%s WHERE resource.id = %s' (None, None, '903b138b3ae54e6794e51568ef7ae1ad', '903b138b3ae54e6794e51568ef7ae1ad')
2013-12-04 15:51:44.522 1198 TRACE ceilometer.dispatcher.database Traceback (most recent call last):
2013-12-04 15:51:44.522 1198 TRACE ceilometer.dispatcher.database File "/opt/stack/ceilometer/ceilometer/dispatcher/database.py", line 67, in record_metering_data
2013-12-04 15:51:44.522 1198 TRACE ceilometer.dispatcher.database self.storage_conn.record_metering_data(meter)
2013-12-04 15:51:44.522 1198 TRACE ceilometer.dispatcher.database File "/opt/stack/ceilometer/ceilometer/storage/impl_sqlalchemy.py", line 259, in record_metering_data
2013-12-04 15:51:44.522 1198 TRACE ceilometer.dispatcher.database session.flush()
2013-12-04 15:51:44.522 1198 TRACE ceilometer.dispatcher.database File "/opt/stack/ceilometer/ceilometer/openstack/common/db/sqlalchemy/session.py", line 538, in _wrap
2013-12-04 15:51:44.522 1198 TRACE ceilometer.dispatcher.database _raise_if_deadlock_error(e, get_engine().name)
2013-12-04 15:51:44.522 1198 TRACE ceilometer.dispatcher.database File "/opt/stack/ceilometer/ceilometer/openstack/common/db/sqlalchemy/session.py", line 524, in _raise_if_deadlock_error
2013-12-04 15:51:44.522 1198 TRACE ceilometer.dispatcher.database raise exception.DBDeadlock(operational_error)
2013-12-04 15:51:44.522 1198 TRACE ceilometer.dispatcher.database DBDeadlock: (OperationalError) (1213, 'Deadlock found when trying to get lock; try restarting transaction') 'UPDATE resource SET resource_metadata=%s, user_id=%s, project_id=%s WHERE resource.id = %s' (None, None, '903b138b3ae54e6794e51568ef7ae1ad', '903b138b3ae54e6794e51568ef7ae1ad')
2013-12-04 15:51:44.522 1198 TRACE ceilometer.dispatcher.database
/opt/stack/ceilometer/ceilometer/openstack/common/db/sqlalchemy/session.py:484: DeprecationWarning: BaseException.message has been deprecated as of Python 2.6
  m = _DUP_KEY_RE_DB[engine_name].match(integrity_error.message)
/opt/stack/ceilometer/ceilometer/openstack/common/db/sqlalchemy/session.py:484: DeprecationWarning: BaseException.message has been deprecated as of Python 2.6
  m = _DUP_KEY_RE_DB[engine_name].match(integrity_error.message)

gordon chung (chungg)
description: updated
gordon chung (chungg) wrote :

related to the _create_or_update code. while the creation of a value is concurrency-safe, the update part of the code allows multiple concurrent updates against source and the other kwargs fields. this raises the question: should we really even have the source value in the user/project tables, and the source/user/project/metadata values in the resource table, when they are redundant info already stored in the Meter record?
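
A common mitigation for this kind of error, separate from the data model change discussed in this report, is to re-run the whole transaction when the database reports a deadlock. The following is a minimal sketch under stated assumptions: `DBDeadlock` is a local stand-in class, and `retry_on_deadlock` and `update_resource` are hypothetical names, not Ceilometer code.

```python
import functools
import time


class DBDeadlock(Exception):
    """Local stand-in for the DB layer's deadlock exception (assumption)."""


def retry_on_deadlock(max_retries=3, delay=0.0):
    """Re-run the wrapped function when it raises DBDeadlock.

    The wrapped function must redo its whole transaction on each call,
    since the database rolls back the transaction chosen as the victim.
    """
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            for attempt in range(max_retries + 1):
                try:
                    return func(*args, **kwargs)
                except DBDeadlock:
                    if attempt == max_retries:
                        raise  # out of retries: surface the error
                    time.sleep(delay * (attempt + 1))  # linear backoff
        return wrapper
    return decorator


# Hypothetical writer that loses the deadlock twice before succeeding.
attempts = []


@retry_on_deadlock(max_retries=3)
def update_resource(resource_id):
    attempts.append(resource_id)
    if len(attempts) < 3:
        raise DBDeadlock("1213, 'Deadlock found when trying to get lock'")
    return "updated %s" % resource_id


result = update_resource("903b138b")
```

Retrying only hides the contention; it does not remove the concurrent updates to the same row.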

gordon chung (chungg)
Changed in ceilometer:
assignee: nobody → gordon chung (chungg)
OpenStack Infra (hudson-openstack) wrote : Fix proposed to ceilometer (master)

Fix proposed to branch: master
Review: https://review.openstack.org/72414

Changed in ceilometer:
status: New → In Progress
gordon chung (chungg)
Changed in ceilometer:
importance: Undecided → Medium
Mitsuru Kanabuchi (kanabuchi) wrote :

For reference, we found a bug that seems to have the same cause.

[ Issue ]

When we ran ceilometer-expirer, the following DB error occurred.

2014-03-03 18:06:36.734 31990 CRITICAL ceilometer [-] (IntegrityError) (1451, 'Cannot delete or update a parent row: a foreign key constraint fails (`ceilometer`.`meter`, CONSTRAINT `fk_meter_project_id` FOREIGN KEY (`project_id`) REFERENCES `project` (`id`))') 'DELETE FROM project WHERE project.id = %s' ('e6e689cd30e049c1a0f66b21d6c183e8',)
2014-03-03 18:06:36.734 31990 TRACE ceilometer Traceback (most recent call last):
2014-03-03 18:06:36.734 31990 TRACE ceilometer File "/usr/local/bin/ceilometer-expirer", line 10, in <module>
2014-03-03 18:06:36.734 31990 TRACE ceilometer sys.exit(expirer())
2014-03-03 18:06:36.734 31990 TRACE ceilometer File "/usr/local/lib/python2.7/dist-packages/ceilometer/storage/__init__.py", line 167, in expirer
2014-03-03 18:06:36.734 31990 TRACE ceilometer cfg.CONF.database.time_to_live)
2014-03-03 18:06:36.734 31990 TRACE ceilometer File "/usr/local/lib/python2.7/dist-packages/ceilometer/storage/impl_sqlalchemy.py", line 344, in clear_expired_metering_data
2014-03-03 18:06:36.734 31990 TRACE ceilometer for res_obj in query.all():
2014-03-03 18:06:36.734 31990 TRACE ceilometer File "/usr/local/lib/python2.7/dist-packages/sqlalchemy/orm/query.py", line 2115, in all
2014-03-03 18:06:36.734 31990 TRACE ceilometer return list(self)
2014-03-03 18:06:36.734 31990 TRACE ceilometer File "/usr/local/lib/python2.7/dist-packages/sqlalchemy/orm/query.py", line 2226, in __iter__
2014-03-03 18:06:36.734 31990 TRACE ceilometer self.session._autoflush()
2014-03-03 18:06:36.734 31990 TRACE ceilometer File "/usr/local/lib/python2.7/dist-packages/sqlalchemy/orm/session.py", line 1127, in _autoflush
2014-03-03 18:06:36.734 31990 TRACE ceilometer self.flush()
2014-03-03 18:06:36.734 31990 TRACE ceilometer File "/usr/local/lib/python2.7/dist-packages/ceilometer/openstack/common/db/sqlalchemy/session.py", line 594, in _wrap
2014-03-03 18:06:36.734 31990 TRACE ceilometer raise exception.DBError(e)
2014-03-03 18:06:36.734 31990 TRACE ceilometer DBError: (IntegrityError) (1451, 'Cannot delete or update a parent row: a foreign key constraint fails (`ceilometer`.`meter`, CONSTRAINT `fk_meter_project_id` FOREIGN KEY (`project_id`) REFERENCES `project` (`id`))') 'DELETE FROM project WHERE project.id = %s' ('e6e689cd30e049c1a0f66b21d6c183e8',)
2014-03-03 18:06:36.734 31990 TRACE ceilometer

[ Reproduce ]

1. Set the ceilometer-expirer TTL to 10 sec (a short TTL makes the error easier to trigger).
2. Register 300 images in Glance.
3. The CentralAgent is running and 600 entries have been registered in meter (result meter).
4. When the CentralAgent's next polling cycle starts, run ceilometer-expirer.

[ Note ]

I think Ceilometer isn't handling conflicting operations appropriately.

I saw the review https://review.openstack.org/#/c/72414/ .
So I understand that you plan to change the data model.
I think that's really important.
Could you share the progress on changing the data model?
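
The IntegrityError (1451) above is what a database raises when a parent row is deleted while child rows still reference it. The following is a minimal sketch that reproduces the same class of failure with Python's built-in sqlite3 rather than MySQL; the simplified `project` and `meter` schemas and the row ids are assumptions for illustration only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when enabled
conn.execute("CREATE TABLE project (id TEXT PRIMARY KEY)")
conn.execute(
    "CREATE TABLE meter ("
    " id INTEGER PRIMARY KEY,"
    " project_id TEXT REFERENCES project(id))"
)
conn.execute("INSERT INTO project VALUES ('p-old'), ('p-live')")
# A new sample arrives for p-live after the cleanup chose what to delete.
conn.execute("INSERT INTO meter (project_id) VALUES ('p-live')")

# Naive cleanup: delete a project that still has meter rows.
try:
    conn.execute("DELETE FROM project WHERE id = 'p-live'")
    naive_failed = False
except sqlite3.IntegrityError:
    naive_failed = True  # the foreign key constraint rejected the delete

# Alternative: let the database pick unreferenced projects in one statement.
conn.execute(
    "DELETE FROM project WHERE id NOT IN"
    " (SELECT project_id FROM meter WHERE project_id IS NOT NULL)"
)
remaining = [row[0] for row in conn.execute("SELECT id FROM project ORDER BY id")]
```

Selecting and deleting in a single statement narrows the window in which new samples can arrive between the two steps. Whether it closes that window entirely on MySQL depends on the isolation level and locking behavior, which this sketch does not exercise.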

OpenStack Infra (hudson-openstack) wrote :

Fix proposed to branch: master
Review: https://review.openstack.org/80461

gordon chung (chungg) wrote :

Mitsuru, it appears the bug you mentioned is different from this one, but it is very real (i verified it today).

here's the bp i opened regarding improving the sql backend. https://blueprints.launchpad.net/ceilometer/+spec/big-data-sql

basically the grand scheme is to get rid of any places where we update the model, because those updates will obviously be a bottleneck under high load.

please comment on the bp if you have any suggestions. i'd be interested in hearing differing or agreeing opinions. :)
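
The idea of avoiding updates can be illustrated with an insert-only table from which the current view of a resource is derived at read time. The following is a minimal sketch using sqlite3; the `sample` schema and the helpers `record_sample` and `get_resource` are hypothetical and are not the schema adopted by the blueprint.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sample ("
    " id INTEGER PRIMARY KEY AUTOINCREMENT,"
    " resource_id TEXT, project_id TEXT, metadata TEXT)"
)


def record_sample(resource_id, project_id, metadata):
    # Insert only: no existing row is updated on the write path.
    conn.execute(
        "INSERT INTO sample (resource_id, project_id, metadata) VALUES (?, ?, ?)",
        (resource_id, project_id, metadata),
    )


def get_resource(resource_id):
    # Derive the current resource view from its most recent sample.
    return conn.execute(
        "SELECT resource_id, project_id, metadata FROM sample"
        " WHERE resource_id = ? ORDER BY id DESC LIMIT 1",
        (resource_id,),
    ).fetchone()


record_sample("r1", "p1", "state=building")
record_sample("r1", "p1", "state=active")
record_sample("r2", "p1", "state=active")
latest = get_resource("r1")
```

The trade-off is that reads do more work, since the latest state has to be looked up among the samples instead of being read from a single maintained row.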

Mitsuru Kanabuchi (kanabuchi) wrote :

Hi Gordon, thank you for registering the new blueprint.

We want to use the MySQL backend with reasonable performance.
Your blueprint matches our purpose well.

We'll comment on the blueprint when we come up with ideas for performance improvement.

Julien Danjou (jdanjou)
Changed in ceilometer:
milestone: none → icehouse-rc1
gordon chung (chungg)
Changed in ceilometer:
milestone: icehouse-rc1 → next
gordon chung (chungg)
Changed in ceilometer:
milestone: next → none
gordon chung (chungg)
Changed in ceilometer:
milestone: none → juno-1
OpenStack Infra (hudson-openstack) wrote :

Fix proposed to branch: master
Review: https://review.openstack.org/94483

OpenStack Infra (hudson-openstack) wrote : Fix merged to ceilometer (master)

Reviewed: https://review.openstack.org/94483
Committed: https://git.openstack.org/cgit/openstack/ceilometer/commit/?id=ac26db7cfa42f751be63a406056f4ed3e828f56d
Submitter: Jenkins
Branch: master

commit ac26db7cfa42f751be63a406056f4ed3e828f56d
Author: Gordon Chung <email address hidden>
Date: Tue May 20 17:48:46 2014 -0400

    refactor sql backend to improve write speed

    - drop sourceassoc table as its data is not accessible via api
    - drop resource table since data can be retrieved from sample table

    Change-Id: I2d4a5175734cafce6a439ad736c47691e6e7e847
    Implements: blueprint big-data-sql
    Closes-Bug: #1305332
    Closes-Bug: #1257908

Changed in ceilometer:
status: In Progress → Fix Committed
Thierry Carrez (ttx)
Changed in ceilometer:
status: Fix Committed → Fix Released
Thierry Carrez (ttx)
Changed in ceilometer:
milestone: juno-1 → 2014.2