2015-09-17 08:32:56 |
Szymon Datko |
description |
I was experimenting with Cinder when I spotted a bug in the mechanism used to grant projects (aka 'tenants') access to volume types. More specifically, the problem occurs when removing project access.
How to reproduce the bug:
1. In a loop, about 150 times:
a) create a non-public volume type ( is_public=False )
b) grant a project access to the newly created type
2. Try to remove project access from one of the recently created types
3. Expect an error 500 from python-cinderclient (~ Server is incapable of performing operation)
4. Cinder's log contains the error: "Out of range value for column 'deleted' at row 1"
Package versions on my host:
cinder-api 1:2015.1.1-0ubuntu2~cloud2
cinder-common 1:2015.1.1-0ubuntu2~cloud2
cinder-scheduler 1:2015.1.1-0ubuntu2~cloud2
cinder-volume 1:2015.1.1-0ubuntu2~cloud2
python-cinder 1:2015.1.1-0ubuntu2~cloud2
python-cinderclient 1:1.3.1-2
python-oslo-db 1.7.1-0ubuntu2~cloud0
Description:
The problem is related to the database: the 'deleted' column of the 'volume_type_projects' table.
Here you can see that it is created as a *Boolean* type:
https://github.com/openstack/cinder/blob/master/cinder/db/sqlalchemy/migrate_repo/versions/032_add_volume_type_projects.py#L36
On db-sync this type is converted, and the resulting column type in the database is *tinyint(1)*:
mysql> desc volume_type_projects;
+----------------+--------------+------+-----+---------+----------------+
| Field          | Type         | Null | Key | Default | Extra          |
+----------------+--------------+------+-----+---------+----------------+
| id             | int(11)      | NO   | PRI | NULL    | auto_increment |
| created_at     | datetime     | YES  |     | NULL    |                |
| updated_at     | datetime     | YES  |     | NULL    |                |
| deleted_at     | datetime     | YES  |     | NULL    |                |
| volume_type_id | varchar(36)  | YES  | MUL | NULL    |                |
| project_id     | varchar(255) | YES  |     | NULL    |                |
| deleted        | tinyint(1)   | YES  |     | NULL    |                |
+----------------+--------------+------+-----+---------+----------------+
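The *tinyint(1)* type is, in fact, the reason why the bug does not occur before 128 access rules exist: a signed tinyint holds values from -128 to 127, so row ids up to 127 still fit in the column.
What actually causes the bug is the use of the soft_delete() function when removing access. See it here: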
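To make the 128 threshold concrete, here is a minimal Python sketch. It assumes the column is a signed MySQL TINYINT (one byte, -128 to 127) and that row ids are assigned sequentially starting at 1; the helper name fits_in_tinyint is made up for illustration.

```python
# Range of a signed one-byte integer, which is what MySQL's TINYINT stores.
# The "(1)" in tinyint(1) is only a display width and does not change the range.
TINYINT_MIN, TINYINT_MAX = -128, 127


def fits_in_tinyint(value: int) -> bool:
    """Return True if value can be stored in a signed TINYINT column."""
    return TINYINT_MIN <= value <= TINYINT_MAX


# Find the first sequential row id that no longer fits in the column.
first_failing_id = next(i for i in range(1, 1000) if not fits_in_tinyint(i))
print(first_failing_id)
```

With roughly 150 access rules, as in the reproduction steps, the most recently created rows have ids above 127, which matches the point at which the error starts to appear.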
https://github.com/openstack/cinder/blob/master/cinder/db/sqlalchemy/api.py#L2776
The function is defined in *oslo.db* here:
https://github.com/openstack/oslo.db/blob/master/oslo_db/sqlalchemy/orm.py#L28
So, what happens when project access is removed? The *row id*, rather than a *Boolean* value, is assigned to the 'deleted' column of the 'volume_type_projects' table. Because *Boolean* is converted to *tinyint(1)* on db-sync, small row ids are accepted, so the problem goes unnoticed on a fresh installation.
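The failure can be imitated without MySQL. The sketch below is only an illustration under stated assumptions: it uses Python's sqlite3 module, a CHECK constraint stands in for the signed tinyint range, and a hand-written "UPDATE ... SET deleted = id" stands in for what soft_delete() does according to the description above. It is not the actual Cinder or oslo.db code, and the helper name and sample values are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Assumption: the CHECK constraint emulates MySQL rejecting values
# outside the signed TINYINT range (-128..127).
conn.execute(
    "CREATE TABLE volume_type_projects ("
    " id INTEGER PRIMARY KEY AUTOINCREMENT,"
    " volume_type_id TEXT,"
    " project_id TEXT,"
    " deleted INTEGER DEFAULT 0 CHECK (deleted BETWEEN -128 AND 127))"
)
# Create 150 access rules, as in the reproduction steps.
conn.executemany(
    "INSERT INTO volume_type_projects (volume_type_id, project_id) VALUES (?, ?)",
    [(f"type-{n}", "demo-project") for n in range(1, 151)],
)


def soft_delete_like(row_id: int) -> bool:
    """Write the row id into 'deleted'; return False if the value is rejected."""
    try:
        conn.execute(
            "UPDATE volume_type_projects SET deleted = id WHERE id = ?",
            (row_id,),
        )
        return True
    except sqlite3.IntegrityError:
        return False


results = {row_id: soft_delete_like(row_id) for row_id in (1, 127, 128, 150)}
print(results)
```

Rows with ids 1 and 127 are updated, while 128 and 150 are rejected, which mirrors the "Out of range value" error seen once enough access rules exist.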
Possible solutions:
a) replace the soft_delete() call with a plain update(), as is done in the rest of the code; see an example here:
https://github.com/openstack/cinder/blob/master/cinder/db/sqlalchemy/api.py#L2710
b) upgrade the db schema |
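Option (a) can be sketched in the same spirit. This is a hedged illustration, not the proposed patch: it again uses sqlite3 with a CHECK constraint in place of the tinyint(1) column, and the remove_access helper is hypothetical. The point is only that writing a Boolean-like 1 into 'deleted' works for any row id.

```python
import sqlite3
from datetime import datetime, timezone

conn = sqlite3.connect(":memory:")
# Assumption: the CHECK constraint emulates the signed TINYINT range.
conn.execute(
    "CREATE TABLE volume_type_projects ("
    " id INTEGER PRIMARY KEY AUTOINCREMENT,"
    " deleted_at TEXT,"
    " volume_type_id TEXT,"
    " project_id TEXT,"
    " deleted INTEGER DEFAULT 0 CHECK (deleted BETWEEN -128 AND 127))"
)
conn.executemany(
    "INSERT INTO volume_type_projects (volume_type_id, project_id) VALUES (?, ?)",
    [(f"type-{n}", "demo-project") for n in range(1, 151)],
)


def remove_access(row_id: int) -> int:
    """Mark the row as deleted with a Boolean-like value; return rows updated."""
    now = datetime.now(timezone.utc).isoformat()
    cur = conn.execute(
        "UPDATE volume_type_projects SET deleted = 1, deleted_at = ? WHERE id = ?",
        (now, row_id),
    )
    return cur.rowcount


updated = remove_access(150)
deleted_flag = conn.execute(
    "SELECT deleted FROM volume_type_projects WHERE id = 150"
).fetchone()[0]
print(updated, deleted_flag)
```

Option (b) would instead widen the column so that it can hold row ids; which of the two is preferable depends on whether anything relies on 'deleted' carrying the id.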
|