cinder-manage db purge fails with foreign key constraint violation

Bug #1671354 reported by Christoph Fiehe
10
This bug affects 2 people
Affects Status Importance Assigned to Milestone
Cinder
Fix Released
High
TommyLike

Bug Description

The command "cinder-manage db purge age_in_days 1" fails in the current Ocata release with foreign key constraint violation:
"Purge command failed, check cinder-manage logs for more details."

cinder-manage.log:
2017-03-09 06:55:24.200 18611 ERROR cinder.db.sqlalchemy.api [req-a3075d7f-0fc8-4b33-9868-9cd018198fed - - - - -] DBError detected when purging from volumes: (pymysql.err.IntegrityError) (1451, u'Cannot delete or update a parent row: a foreign key constraint fails (`cinder`.`volume_admin_metadata`, CONSTRAINT `volume_admin_metadata_ibfk_1` FOREIGN KEY (`volume_id`) REFERENCES `volumes` (`id`))') [SQL: u'DELETE FROM volumes WHERE volumes.deleted_at < %(deleted_at_1)s'] [parameters: {u'deleted_at_1': datetime.datetime(2017, 3, 8, 5, 55, 24, 132310)}].

May someone please have a look.

Changed in cinder:
assignee: nobody → Karthik Prabhu Vinod (karthik-prabhu-vinod)
Revision history for this message
Karthik Prabhu Vinod (karthik-prabhu-vinod) wrote :

Tested this on Master..

I see from the logs of db Purge command that volume_metadata & volume_admin_metadata are purged before the volumes itself.

So all the rows in volume_admin_metadata marked as (deleted, deleted_at) should be purged before the volumes themselves.

I believe there must have been a delay between a.) & b.):
a.) marking volume as deleted
b.) marking volume_admin_metadata as deleted

Because of this purge command purged the volume before volume_admin_metadata. I dont think this would happen quite often. Want to wait and hear from others!

Revision history for this message
Christoph Fiehe (fiehe) wrote :

That is really strange... The command "cinder-manage db purge 1" fails deterministically in my environment.

Revision history for this message
Karthik Prabhu Vinod (karthik-prabhu-vinod) wrote :

does it fail with the same error as above this time? because the command succeeds for me

Revision history for this message
Christoph Fiehe (fiehe) wrote :

Yes, it fails with the same error. The script always outputs:

$ cinder-manage db purge 1
Purge command failed, check cinder-manage logs for more details.
$ cinder-manage db purge 1
Purge command failed, check cinder-manage logs for more details.
$ cinder-manage db purge 1
Purge command failed, check cinder-manage logs for more details.

cinder-manage.log:
2017-03-27 09:34:42.849 19936 ERROR cinder.db.sqlalchemy.api [req-e2add18d-a771-4ad8-a94d-e1fad2d7f65c - - - - -] DBError detected when purging from volumes: (pymysql.err.IntegrityError) (1451, u'Cannot delete or update a parent row: a foreign key constraint fails (`cinder`.`volume_admin_metadata`, CONSTRAINT `volume_admin_metadata_ibfk_1` FOREIGN KEY (`volume_id`) REFERENCES `volumes` (`id`))') [SQL: u'DELETE FROM volumes WHERE volumes.deleted_at < %(deleted_at_1)s'] [parameters: {u'deleted_at_1': datetime.datetime(2017, 3, 26, 7, 34, 42, 760396)}].
2017-03-27 09:36:32.461 20388 ERROR cinder.db.sqlalchemy.api [req-b75547eb-42cc-4d86-9e22-3d071e236318 - - - - -] DBError detected when purging from volumes: (pymysql.err.IntegrityError) (1451, u'Cannot delete or update a parent row: a foreign key constraint fails (`cinder`.`volume_admin_metadata`, CONSTRAINT `volume_admin_metadata_ibfk_1` FOREIGN KEY (`volume_id`) REFERENCES `volumes` (`id`))') [SQL: u'DELETE FROM volumes WHERE volumes.deleted_at < %(deleted_at_1)s'] [parameters: {u'deleted_at_1': datetime.datetime(2017, 3, 26, 7, 36, 32, 430066)}].
2017-03-27 09:36:37.376 20413 ERROR cinder.db.sqlalchemy.api [req-75dfee8e-48f7-44a2-b660-0b9e3137dd59 - - - - -] DBError detected when purging from volumes: (pymysql.err.IntegrityError) (1451, u'Cannot delete or update a parent row: a foreign key constraint fails (`cinder`.`volume_admin_metadata`, CONSTRAINT `volume_admin_metadata_ibfk_1` FOREIGN KEY (`volume_id`) REFERENCES `volumes` (`id`))') [SQL: u'DELETE FROM volumes WHERE volumes.deleted_at < %(deleted_at_1)s'] [parameters: {u'deleted_at_1': datetime.datetime(2017, 3, 26, 7, 36, 37, 305832)}].

Revision history for this message
TommyLike (hu-husheng) wrote :

The possible issues are maybe that you have incorrect db records,(volume_metadata records are not marked soft deleted when you soft-deleting the corresponding volume records. that could lead to this exception), or maybe worse, this line [1] doesn't work at your environment :), if you'd like, you can contact me around (UTC+8) 9:00~21:00, tommylikehu

[1] https://github.com/openstack/cinder/blob/master/cinder/db/sqlalchemy/api.py#L6070

Revision history for this message
Christoph Fiehe (fiehe) wrote :

Yes, you are right ;-) The "deleted" flag is not set in the corresponding entry in "volume_admin_metadata":

INSERT INTO `volume_admin_metadata` (`created_at`, `updated_at`, `deleted_at`, `deleted`, `id`, `volume_id`, `key`, `value`) VALUES ('2017-03-16 07:31:24', NULL, NULL, 0, 3, '241a868f-a6f2-4a45-beb8-6fcd800c2ceb', 'readonly', 'False');

How can I solve the issue. I am using a Ceph RDB as Cinder backend. Do I miss something?

cinder.conf
[DEFAULT]
api_paste_confg = /etc/cinder/api-paste.ini
auth_strategy = keystone
backup_ceph_chunk_size = 134217728
backup_ceph_conf = /etc/ceph/ceph.conf
backup_ceph_pool = backups
backup_ceph_stripe_count = 0
backup_ceph_stripe_unit = 0
backup_ceph_user = cinder-backup
backup_driver = cinder.backup.drivers.ceph
debug = false
default_log_levels=amqp=WARN,amqplib=WARN,boto=WARN,qpid=WARN,sqlalchemy=WARN,suds=WARN,oslo.messaging=WARN,iso8601=WARN,requests.packages.urllib3.connectionpool=WARN,urllib3.connectionpool=WARN,websocket=WARN,requests.packages.urllib3.util.retry=WARN,urllib3.util.retry=WARN,keystonemiddleware=WARN,routes.middleware=WARN,stevedore=WARN,taskflow=WARN,keystoneauth=WARN,oslo.cache=WARN,dogpile.core.dogpile=WARN,oslo_service=WARN,cinder=WARN
enabled_backends = rbd
glance_api_servers = http://mos-image:9292
glance_api_version = 2
host = mos-volume01
log_dir = /var/log/cinder
memcached_servers = mos-memcache:11211
my_ip = 10.30.202.101
osapi_volume_listen = mos-volume01
osapi_volume_listen_port = 8776
restore_discard_excess_bytes = true
rootwrap_config = /etc/cinder/rootwrap.conf
state_path = /var/lib/cinder
transport_url = rabbit://cinder:secret@mos-rabbit01:5672,cinder:secret@mos-rabbit02:5672/openstack

[database]
connection = mysql+pymysql://cinder:secret@mos-controller/cinder
max_retries = -1

[keystone_authtoken]
auth_type = password
auth_uri = http://mos-identity:5000
auth_url = http://mos-identity:35357
memcached_servers = mos-memcache:11211
password = secret
project_domain_name = default
project_name = service
user_domain_name = default
username = cinder

[oslo_concurrency]
lock_path = /var/lock/cinder

[oslo_messaging_notifications]
driver = messagingv2

[oslo_messaging_rabbit]
amqp_durable_queues = true
rabbit_ha_queues = true
rabbit_retry_backoff = 2
rabbit_retry_interval = 1

[oslo_middleware]
enable_proxy_headers_parsing = true

[rbd]
# use VIP for backend_host
backend_host = mos-volume
rados_connect_timeout = -1
rbd_ceph_conf = /etc/ceph/ceph.conf
rbd_flatten_volume_from_snapshot = false
rbd_max_clone_depth = 5
rbd_pool = volumes
rbd_secret_uuid = 39915974-396b-4fbc-ad19-b158726f81b5
rbd_store_chunk_size = 4
rbd_user = cinder
report_discard_supported = true
volume_driver = cinder.volume.drivers.rbd.RBDDriver

Revision history for this message
Christoph Fiehe (fiehe) wrote :

I am not sure, maybe the problem is caused by this:
https://github.com/openstack/cinder/blob/master/cinder/db/sqlalchemy/api.py#L6082

In my environment the table "quality_of_service_specs" does not contain any entries.

Revision history for this message
TommyLike (hu-husheng) wrote :

Thanks, Jens Offenbach, it's not related to the backend. The best way to confirm that is clear all your volume_admin_metadata database records(if you could), and then try to reproduce it. If this happens again, so there are some bugs around your preceding operations.

Revision history for this message
Christoph Fiehe (fiehe) wrote :

I think, there is still an issue... When I create a volume, delete it, change the corresponding timestamps in the database and run "cinder-manage db purge" -> OK!

When I create a volume and attach it to a VM, detach it, remove it, change the corresponding timestamps in the database and run "cinder-manage db purge" -> Purge command failed, check cinder-manage logs for more details.

The table "volume_admin_metadata" contains an entry where "deleted" is set, but the column "deleted_at" is empty:

INSERT INTO `volume_admin_metadata` (`created_at`, `updated_at`, `deleted_at`, `deleted`, `id`, `volume_id`, `key`, `value`) VALUES ('2017-03-27 11:23:50', '2017-03-27 11:23:50', NULL, 1, 19, '6582f0e6-e703-4c39-bb01-769fb21a4c90', 'readonly', 'False');

I think the detach action does not add the "deleted_at" timestamp to the entry. It gets only marked as deleted. This causes "cinder-manage db purge" to fail.

That does not look right. What do you think?

Revision history for this message
Christoph Fiehe (fiehe) wrote :

I think there is a bug in:
https://github.com/openstack/cinder/blob/stable/ocata/cinder/db/sqlalchemy/api.py#L2623

Changing line 2623 from
"conditional_update(context, model, {'deleted': True}, expected_values)" to
"conditional_update(context, model, {'deleted': True, 'deleted_at': timeutils.utcnow()}, expected_values)"
solves the issue and the "deleted_at" timestamp gets set correctly. With this fix "cinder-manage db purge 1" works like a charm ;-)

Revision history for this message
Christoph Fiehe (fiehe) wrote :

Can someone confirm this?

Changed in cinder:
assignee: Karthik Prabhu Vinod (karthik-prabhu-vinod) → nobody
Revision history for this message
TommyLike (hu-husheng) wrote :

Thanks, Jens Offenbach, I will handle this.

Changed in cinder:
assignee: nobody → TommyLike (hu-husheng)
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to cinder (master)

Fix proposed to branch: master
Review: https://review.openstack.org/450735

Christoph Fiehe (fiehe)
Changed in cinder:
status: New → In Progress
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to cinder (master)

Reviewed: https://review.openstack.org/450735
Committed: https://git.openstack.org/cgit/openstack/cinder/commit/?id=f9012466db95a4e693291a7b5af21842a85843c2
Submitter: Jenkins
Branch: master

commit f9012466db95a4e693291a7b5af21842a85843c2
Author: TommyLike <email address hidden>
Date: Tue Mar 28 16:50:35 2017 +0800

    [BugFix]Cinder forgets to update 'deleted_at' when deleting

    Usually we would update the 'deleted_at' column when
    soft-deleting records, otherwise the 'db purge' command
    would fail because it depends on the 'deleted_at' column.

    Close-Bug: #1671354

    Change-Id: Ib302488d3a007df09a2e7ece40488c00bf732119

TommyLike (hu-husheng)
Changed in cinder:
status: In Progress → Fix Committed
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to cinder (stable/ocata)

Fix proposed to branch: stable/ocata
Review: https://review.openstack.org/452884

Eric Harney (eharney)
Changed in cinder:
importance: Undecided → High
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to cinder (stable/ocata)

Reviewed: https://review.openstack.org/452884
Committed: https://git.openstack.org/cgit/openstack/cinder/commit/?id=1695e7c0cb49008032762c7116484b60a9cc9e82
Submitter: Jenkins
Branch: stable/ocata

commit 1695e7c0cb49008032762c7116484b60a9cc9e82
Author: TommyLike <email address hidden>
Date: Tue Mar 28 16:50:35 2017 +0800

    [BugFix]Cinder forgets to update 'deleted_at' when deleting

    Usually we would update the 'deleted_at' column when
    soft-deleting records, otherwise the 'db purge' command
    would fail because it depends on the 'deleted_at' column.

    Closes-Bug: #1671354

    Change-Id: Ib302488d3a007df09a2e7ece40488c00bf732119
    (cherry picked from commit f9012466db95a4e693291a7b5af21842a85843c2)

tags: added: in-stable-ocata
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to cinder (stable/newton)

Fix proposed to branch: stable/newton
Review: https://review.openstack.org/453697

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to cinder (stable/newton)

Reviewed: https://review.openstack.org/453697
Committed: https://git.openstack.org/cgit/openstack/cinder/commit/?id=8771277b0fd0dbbbc2331fce6fa4b4c45734924f
Submitter: Jenkins
Branch: stable/newton

commit 8771277b0fd0dbbbc2331fce6fa4b4c45734924f
Author: TommyLike <email address hidden>
Date: Tue Mar 28 16:50:35 2017 +0800

    [BugFix]Cinder forgets to update 'deleted_at' when deleting

    Usually we would update the 'deleted_at' column when
    soft-deleting records, otherwise the 'db purge' command
    would fail because it depends on the 'deleted_at' column.

    Closes-Bug: #1671354

    Change-Id: Ib302488d3a007df09a2e7ece40488c00bf732119
    (cherry picked from commit f9012466db95a4e693291a7b5af21842a85843c2)
    (cherry picked from commit 1695e7c0cb49008032762c7116484b60a9cc9e82)

tags: added: in-stable-newton
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/cinder 9.1.4

This issue was fixed in the openstack/cinder 9.1.4 release.

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/cinder 10.0.2

This issue was fixed in the openstack/cinder 10.0.2 release.

TommyLike (hu-husheng)
Changed in cinder:
status: Fix Committed → Fix Released
To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.