OpenStack Heat

Undeletable stack due to DB event integrity error

Bug #1681772 reported by Steven Hardy on 2017-04-11

This bug affects 1 person

Affects         Status        Importance  Assigned to  Milestone
OpenStack Heat  Fix Released  Medium      Crag Wolfe   OpenStack Heat pike-2

Bug Description

I'm seeing this error:

-AllNodesDeploySteps-v4mqt2qfkhfa): Resource DELETE failed: DBReferenceError: resources.CephStorageGenerateConfigDeployment: (pymysql.err.IntegrityError) (1451, u'Cannot delete or update a parent row: a foreign key constraint fails (`heat`.`event`, CONSTRAINT `ev_rsrc_prop_data_ref` FOREIGN KEY (`rsrc_prop_data_id`) REFERENCES `resource_properties_data` (`id`))') [SQL: u'DELETE FROM resource_properties_data WHERE resource_properties_data.id IN (%(id_1)s, %(id_2)s, %(id_3)s, %(id_4)s, %(id_5)s, %(id_6)s, %(id_7)s, %(id_8)s, %(id_9)s, %(id_10)s, %(id_11)s)'] [parameters: {u'id_2': 1956, u'id_11': 1977, u'id_10': 1913, u'id_3': 1957, u'id_1': 1955, u'id_6': 1970, u'id_7': 1972, u'id_4': 1959, u'id_5': 1969, u'id_8': 1973, u'id_9': 1974}]

I did create a stack and then update to a newer version, but the DB schema is up to date AFAICT (db_sync does not change the version):

(undercloud) [stack@undercloud ~]$ sudo heat-manage db_version
2017-04-11 10:33:03.066 7587 WARNING oslo_config.cfg [-] Option "db_backend" from group "DEFAULT" is deprecated. Use option "backend" from group "database".
80
(undercloud) [stack@undercloud ~]$ sudo heat-manage db_sync
2017-04-11 10:33:10.496 7595 WARNING oslo_config.cfg [-] Option "db_backend" from group "DEFAULT" is deprecated. Use option "backend" from group "database".
(undercloud) [stack@undercloud ~]$ sudo heat-manage db_version
2017-04-11 10:33:12.960 7607 WARNING oslo_config.cfg [-] Option "db_backend" from group "DEFAULT" is deprecated. Use option "backend" from group "database".
80

I don't see any recent related migrations or changes to the DB model, but I wanted to raise this to see if other folks are seeing something similar. I don't yet have a minimal reproducer.

Steven Hardy (shardy) wrote on 2017-04-11:

Hmm, it seems that running sudo heat-manage migrate_properties_data resolves this, so perhaps I missed some recent change. I don't see a Pike release note saying that step is mandatory, though.

Crag Wolfe (cwolfe) wrote on 2017-04-11:

It shouldn't be mandatory. It isn't yet clear to me why this error is occurring.

Crag Wolfe (cwolfe) wrote on 2017-04-12:

I suspect that not seeing the issue after running heat-manage migrate_properties_data is just luck. The reported error relates to the new-style properties data, which exist in the resource_properties_data table. Since Ocata, all newly created properties data exist only in the resource_properties_data table, though we can still read older pre-Ocata properties data from the legacy resource.properties_data column. Migrating just moves the old properties data into the new table.

There are only two places where we issue DELETE ... IN statements against resource_properties_data. One is when a stack is purged, which I assume is not the scenario here, as this just seems to be a stack delete. The other is when event pruning is triggered. Assuming the latter is the issue here (entirely possible, since every DELETE_IN_PROGRESS and DELETE_COMPLETE results in a new event that can trigger event pruning), the only thing I can think of is s/synchronize_session=False/synchronize_session=True/ here:
https://git.openstack.org/cgit/openstack/heat/tree/heat/db/sqlalchemy/api.py?id=157ede194#n921
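The failure mode in the reported error can be reproduced in miniature. The sketch below is not Heat code: it uses Python's built-in sqlite3 module rather than MySQL, and a cut-down schema whose table and column names are borrowed from the error message. The function names are hypothetical. It only shows that deleting resource_properties_data rows while event rows still reference them violates the foreign key, whereas deleting the events first succeeds.

```python
import sqlite3


def build_db():
    """Create an in-memory DB with a minimal event -> properties data FK."""
    conn = sqlite3.connect(":memory:")
    # SQLite only enforces foreign keys when this pragma is enabled.
    conn.execute("PRAGMA foreign_keys = ON")
    conn.execute(
        "CREATE TABLE resource_properties_data (id INTEGER PRIMARY KEY)")
    conn.execute(
        "CREATE TABLE event ("
        " id INTEGER PRIMARY KEY,"
        " rsrc_prop_data_id INTEGER"
        " REFERENCES resource_properties_data (id))")
    conn.executemany(
        "INSERT INTO resource_properties_data (id) VALUES (?)",
        [(1,), (2,)])
    conn.executemany(
        "INSERT INTO event (id, rsrc_prop_data_id) VALUES (?, ?)",
        [(10, 1), (11, 2)])
    conn.commit()
    return conn


def delete_properties_first(conn):
    """Delete parent rows while events still reference them.

    Returns the integrity error message, or None if no error occurred.
    """
    try:
        conn.execute(
            "DELETE FROM resource_properties_data WHERE id IN (1, 2)")
    except sqlite3.IntegrityError as exc:
        conn.rollback()
        return str(exc)
    return None


def delete_events_first(conn):
    """Delete the referencing events, then the now-unreferenced parents.

    Returns the number of resource_properties_data rows deleted.
    """
    conn.execute("DELETE FROM event WHERE id IN (10, 11)")
    cur = conn.execute(
        "DELETE FROM resource_properties_data WHERE id IN (1, 2)")
    conn.commit()
    return cur.rowcount


if __name__ == "__main__":
    conn = build_db()
    print("properties first:", delete_properties_first(conn))
    print("events first, rows deleted:", delete_events_first(conn))
```

In the real bug the events were meant to be deleted first; the problem was that some of them could apparently still exist when the second delete ran.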

On a related note, there are two parameters, max_events_per_stack and event_purge_batch_size, that affect how often this purge takes place and how many events are purged. The behaviour of these parameters changed a bit with: https://review.openstack.org/#/c/400388/
Note that the default for event_purge_batch_size changed from 10 to 200 in that commit.

In general, I'd push for higher values of max_events_per_stack and event_purge_batch_size, to reduce both the number of event purge operations and the event row counting we do during resource actions. That said, this error shouldn't occur regardless of the two config values.
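As a rough illustration of why a larger batch size means fewer purge operations, here is a toy model in Python. It is not Heat's actual pruning logic: it assumes a purge runs whenever the stored event count exceeds max_events_per_stack and removes event_purge_batch_size of the oldest events each time. The function name count_purges and the value 1000 used for max_events_per_stack are hypothetical and chosen only for the example.

```python
def count_purges(total_events, max_events_per_stack, event_purge_batch_size):
    """Count purge operations in a simplified pruning model.

    Assumption (not Heat's real algorithm): after each new event, if the
    number of stored events exceeds max_events_per_stack, one purge removes
    up to event_purge_batch_size of the oldest events.
    """
    stored = 0
    purges = 0
    for _ in range(total_events):
        stored += 1
        if stored > max_events_per_stack:
            stored -= min(event_purge_batch_size, stored)
            purges += 1
    return purges


if __name__ == "__main__":
    # Same workload, old default batch size (10) versus new default (200).
    for batch in (10, 200):
        print(batch, count_purges(2000, 1000, batch))
```

Under these assumptions, 2000 events against a limit of 1000 trigger 100 purges with a batch size of 10 but only 5 with a batch size of 200.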

OpenStack Infra (hudson-openstack) wrote on 2017-04-25: Fix proposed to heat (master)

Fix proposed to branch: master
Review: https://review.openstack.org/459780

Changed in heat:
assignee:	nobody → Crag Wolfe (cwolfe)
status:	New → In Progress

Thomas Herve (therve) on 2017-05-04

Changed in heat:
milestone:	none → pike-2
importance:	Undecided → Medium

OpenStack Infra (hudson-openstack) wrote on 2017-05-04: Fix merged to heat (master)

Reviewed: https://review.openstack.org/459780
Committed: https://git.openstack.org/cgit/openstack/heat/commit/?id=a6f4c6d2c62d1a545c5f6e7c75e3af1b45724f51
Submitter: Jenkins
Branch: master

commit a6f4c6d2c62d1a545c5f6e7c75e3af1b45724f51
Author: Crag Wolfe <email address hidden>
Date: Tue Apr 25 09:18:25 2017 -0700

Low-level db delete of events should be synchronous

    Previously, synchronized_session=False was used in the call to prune
    events. This was overly aggressive since it was possible (if rare)
    that, during next db operation to delete resource_properties_data
    rows, some of the referenced events could still have existed resulting
    in a db referential integrity error.

Change-Id: I5c4cf6a162ff853f84d68e7b203ffa1aae684359
Closes-Bug: #1681772

Changed in heat:
status:	In Progress → Fix Released

OpenStack Infra (hudson-openstack) wrote on 2017-06-07: Fix included in openstack/heat 9.0.0.0b2

This issue was fixed in the openstack/heat 9.0.0.0b2 development milestone.
