Inconsistency for resource column between quota_usages and reservations tables

Bug #1948916 reported by Jan Wasilewski
Affects   Status   Importance   Assigned to   Milestone
Cinder    New      Medium       Unassigned

Bug Description

Hi,

We hit an issue caused by a size inconsistency of the "resource" column between the reservations and quota_usages tables. It causes one Galera cluster node to drop out of the cluster with this error:

2021-10-21T10:39:52.564192Z 17 [Warning] [MY-000000] [WSREP] BF applier failed to open_and_lock_tables: 1709, fatal: false wsrep = (exec_mode: high priority conflict_state: aborted seqno: 1695730692)
2021-10-21T10:39:52.564362Z 17 [ERROR] [MY-010584] [Repl] Slave SQL: Error executing row event: 'Index column size too large. The maximum column size is 767 bytes.', Error_code: MY-001709
2021-10-21T10:39:52.564420Z 17 [Warning] [MY-000000] [WSREP] Event 3 Write_rows apply failed: 1, seqno 1695730692
2021-10-21T10:39:52.566026Z 0 [Note] [MY-000000] [Galera] Member 1(mysql-03) initiates vote on ab091498-b8a1-11e7-9427-aafaf67fded0:1695730692,f12f343044680bae: Index column size too large. The maximum column size is 767 bytes., Error_code: 1709; Table 'cinder.quota_usages' doesn't exist, Error_code: 1146;
2021-10-21T10:39:52.566169Z 0 [Note] [MY-000000] [Galera] Votes over ab091498-b8a1-11e7-9427-aafaf67fded0:1695730692:

or

2021-10-20T22:01:23.077847Z 16 [ERROR] [MY-010584] [Repl] Slave SQL: Could not execute Write_rows event on table cinder.reservations; Index column size too large. The maximum column size is 767 bytes., Error_code: 1709; Index column size too large. The maximum column size is 767 bytes., Error_code: 1709; Cannot add or update a child row: a foreign key constraint fails (`cinder`.`reservations`, CONSTRAINT `reservations_ibfk_1` FOREIGN KEY (`usage_id`) REFERENCES `quota_usages` (`id`) ON DELETE RESTRICT ON UPDATE RESTRICT), Error_code: 1452; handler error HA_ERR_NO_REFERENCED_ROW; the event's master log FIRST, end_log_pos 0, Error_code: MY-001709
2021-10-20T22:01:23.078047Z 16 [Warning] [MY-000000] [WSREP] Event 3 Write_rows apply failed: 151, seqno 1694993383
2021-10-20T22:01:23.081582Z 0 [Note] [MY-000000] [Galera] Member 1(mysql-03) initiates vote on ab091498-b8a1-11e7-9427-aafaf67fded0:1694993383,9fcebea142a93ccc: Index column size too large. The maximum column size is 767 bytes., Error_code: 1709; Index column size too large. The maximum column size is 767 bytes., Error_code: 1709; Cannot add or update a child row: a foreign key constraint fails (`cinder`.`reservations`, CONSTRAINT `reservations_ibfk_1` FOREIGN KEY (`usage_id`) REFERENCES `quota_usages` (`id`) ON DELETE RESTRICT ON UPDATE RESTRICT), Error_code: 1452;

This problem was not visible before the Stein release and appeared after the upgrade to Stein. We found that the cinder DB schema changed: the size of the resource column in quota_usages increased from 255 to 300, while the same column remained unchanged in the reservations table. That was due to two cinder bugs:
https://bugs.launchpad.net/cinder/+bug/1608849
https://bugs.launchpad.net/cinder/+bug/1798327

and the change was made in this review:
https://review.opendev.org/c/openstack/cinder/+/611530/

I also checked upstream, and the inconsistency is still present there. Since volume creation and deletion operate on both tables and can lead to this misbehaviour, it would be good to keep the column sizes consistent.

The problem started to occur with these packages:

ii cinder-api 2:14.3.1-0ubuntu1~cloud1 all Cinder storage service - API server
ii cinder-backup 2:14.3.1-0ubuntu1~cloud1 all Cinder storage service - Scheduler server
ii cinder-common 2:14.3.1-0ubuntu1~cloud1 all Cinder storage service - common files
ii cinder-scheduler 2:14.3.1-0ubuntu1~cloud1 all Cinder storage service - Scheduler server
ii cinder-volume 2:14.3.1-0ubuntu1~cloud1 all Cinder storage service - Volume server
ii python-cinderclient 1:4.1.0-0ubuntu1~cloud0 all Python bindings to the OpenStack Volume API - Python 2.x
ii python3-cinder 2:14.3.1-0ubuntu1~cloud1 all Cinder Python 3 libraries
ii python3-cinderclient 1:4.1.0-0ubuntu1~cloud0 all Python bindings to the OpenStack Volume API - Python 3.x

Tags: quotas
summary: - [cinder] Inconsistency for resource collumn between quota_usages and
- reservations table
+ Inconsistency for resource collumn between quota_usages and reservations
+ table
Changed in cinder:
importance: Undecided → Medium
tags: added: quotas
Gorka Eguileor (gorka) wrote :

I believe the issue you are facing is related to the new size of the `resource` field in the `cinder.quota_usages` table, and not to the discrepancy between the tables (which would be a different bug showing different errors).

This should have failed during the DB sync phase; as you can see, the table doesn't seem to exist: "Table 'cinder.quota_usages' doesn't exist"

Either your DB is old or some global settings don't let it create indexes for VARCHAR(300). Global settings to look at are `innodb_large_prefix`, `innodb_default_row_format`, and `innodb_file_format`.
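
To make the 767-byte limit from the log concrete, here is a rough sketch of the arithmetic. It assumes a 3-byte utf8 character set and a COMPACT or REDUNDANT row format on the failing node, and that `resource` is part of an index; none of this is confirmed in the report. The 3072-byte figure assumes the default 16 KB InnoDB page size. The names `BYTES_PER_CHAR`, `PREFIX_LIMIT` and `fits_in_index` are made up for illustration.

```python
# Maximum bytes per character for a few MySQL character sets.
BYTES_PER_CHAR = {"latin1": 1, "utf8mb3": 3, "utf8mb4": 4}

# InnoDB index key prefix limits in bytes by row format
# (assumes the default 16 KB page size; on older servers the larger
# limit also depends on settings such as innodb_large_prefix).
PREFIX_LIMIT = {
    "REDUNDANT": 767,
    "COMPACT": 767,
    "DYNAMIC": 3072,
    "COMPRESSED": 3072,
}


def max_index_bytes(length, charset):
    """Worst-case byte size of an indexed VARCHAR(length) column."""
    return length * BYTES_PER_CHAR[charset]


def fits_in_index(length, charset, row_format):
    """True if a full-column index on VARCHAR(length) stays within the limit."""
    return max_index_bytes(length, charset) <= PREFIX_LIMIT[row_format]


# Assumed charset and row formats, for illustration only.
for length in (255, 300):
    for row_format in ("COMPACT", "DYNAMIC"):
        print(length, row_format,
              max_index_bytes(length, "utf8mb3"),
              fits_in_index(length, "utf8mb3", row_format))
```

Under these assumptions VARCHAR(255) needs at most 765 bytes and stays under the limit, while VARCHAR(300) needs up to 900 bytes and only fits with a DYNAMIC or COMPRESSED row format.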

Gorka Eguileor (gorka) wrote :

I have created another bug to fix the discrepancy in the column size: https://bugs.launchpad.net/cinder/+bug/1948962

Jan Wasilewski (janwasilewski) wrote :

Fortunately or unfortunately, this problem was not observed during the DB sync phase. We ran our upgrade procedure on three environments, and the problem was visible in only one of them, where we noticed that one of the three MySQL Galera nodes had dropped out of the cluster. We then dug deeper and found this issue.

I also collected the table schemas from the working nodes. Everything seems to work correctly there, but the difference in size is visible:

mysql> describe quota_usages;
+---------------+--------------+------+-----+---------+----------------+
| Field         | Type         | Null | Key | Default | Extra          |
+---------------+--------------+------+-----+---------+----------------+
| created_at    | datetime     | YES  |     | NULL    |                |
| updated_at    | datetime     | YES  |     | NULL    |                |
| deleted_at    | datetime     | YES  |     | NULL    |                |
| deleted       | tinyint(1)   | YES  |     | NULL    |                |
| id            | int          | NO   | PRI | NULL    | auto_increment |
| project_id    | varchar(255) | YES  | MUL | NULL    |                |
| resource      | varchar(300) | YES  |     | NULL    |                |
| in_use        | int          | NO   |     | NULL    |                |
| reserved      | int          | NO   |     | NULL    |                |
| until_refresh | int          | YES  |     | NULL    |                |
+---------------+--------------+------+-----+---------+----------------+
10 rows in set (0.00 sec)

mysql> describe reservations;
+--------------+--------------+------+-----+---------+----------------+
| Field        | Type         | Null | Key | Default | Extra          |
+--------------+--------------+------+-----+---------+----------------+
| created_at   | datetime     | YES  |     | NULL    |                |
| updated_at   | datetime     | YES  |     | NULL    |                |
| deleted_at   | datetime     | YES  |     | NULL    |                |
| deleted      | tinyint(1)   | YES  | MUL | NULL    |                |
| id           | int          | NO   | PRI | NULL    | auto_increment |
| uuid         | varchar(36)  | NO   |     | NULL    |                |
| usage_id     | int          | YES  | MUL | NULL    |                |
| project_id   | varchar(255) | YES  | MUL | NULL    |                |
| resource     | varchar(255) | YES  |     | NULL    |                |
| delta        | int          | NO   |     | NULL    |                |
| expire       | datetime     | YES  |     | NULL    |                |
| allocated_id | int          | YES  | MUL | NULL    |                |
+--------------+--------------+------+-----+---------+----------------+
12 rows in set (0.00 sec)

I should mention that the MySQL version is the same on all of the Galera nodes.
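
As a quick cross-check of the two `describe` outputs above, the small sketch below lists the columns that both tables share but declare with different types. The column types are copied by hand from those outputs; the helper name `mismatched_columns` is made up for illustration and is not part of cinder.

```python
# Column types copied from the `describe quota_usages` output.
QUOTA_USAGES = {
    "created_at": "datetime",
    "updated_at": "datetime",
    "deleted_at": "datetime",
    "deleted": "tinyint(1)",
    "id": "int",
    "project_id": "varchar(255)",
    "resource": "varchar(300)",
    "in_use": "int",
    "reserved": "int",
    "until_refresh": "int",
}

# Column types copied from the `describe reservations` output.
RESERVATIONS = {
    "created_at": "datetime",
    "updated_at": "datetime",
    "deleted_at": "datetime",
    "deleted": "tinyint(1)",
    "id": "int",
    "uuid": "varchar(36)",
    "usage_id": "int",
    "project_id": "varchar(255)",
    "resource": "varchar(255)",
    "delta": "int",
    "expire": "datetime",
    "allocated_id": "int",
}


def mismatched_columns(left, right):
    """Return {column: (left_type, right_type)} for shared columns whose types differ."""
    return {
        name: (left[name], right[name])
        for name in left
        if name in right and left[name] != right[name]
    }


print(mismatched_columns(QUOTA_USAGES, RESERVATIONS))
# prints {'resource': ('varchar(300)', 'varchar(255)')}
```

Only `resource` differs, which matches the discrepancy described in this report.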
