Activity log for bug #916386

Date Who What changed Old value New value Message
2012-01-14 05:09:24 Ewan Mellor bug added bug
2012-01-14 05:20:43 Ewan Mellor description I am using the SQLalchemy backend, MySQL 5.0.77, SQLAlchemy 0.6.5, Keystone latest. I am making create-tenant requests (POST to /v2.0/tenants) as a global admin (i.e. no tenant, admin rights). The first one succeeds as expected. The second one fails with 401 Forbidden and the log message 'User <UUID> failed check - did not have role 1'. What's happening is this: 1. the first create-tenant request is causing a new tenants row to be created in the database, with an autoincrementing integer in the tenants.id column. 2. When the second request comes in, the access check is calling keystone.backends.sqlalchemy.api.list_tenant_roles_for_user (via keystone.managers.grant.list_tenant_roles_for_user). That in turn is resolving user_roles.tenant_id (legitimately NULL) by calling api.TENANT.id_to_uid(None). 3. TENANT.id_to_uid has no shortcut for id=None. This leads to it making a database query equivalent to "SELECT uid FROM tenant WHERE id IS NULL". This is where things go wrong. This query _should_ return zero rows, because id IS NULL should always be false on an autoincremented integer. However, the query response includes the new tenants row created in step 1. In the database, id has been populated correctly, and is not NULL. SQLalchemy doesn't see that though. The session is clearly caching a row which has the new data submitted to the database, but _not_ the value in the auto-populated id column. 4. This results in TENANT.id_to_uid(None) returning the UID of the most recently created tenant. Which isn't good. 5. This state disappears immediately it seems -- subsequent calls to id_to_uuid(None) return None as expected. I can see a couple of fixes for this: 1. Make id_to_uid(None) shortcut to return None, and therefore never hit the database in that case. This is a good idea for performance reasons anyway (so queries of the form 'id IS NULL' should always return zero rows). The reason I'm not proposing that straight away is that it feels like papering over a dangerous crack. There may be other situations that I haven't found yet where the database layer returns the wrong thing. 2. Change sqlalchemy.Driver._init_session_maker to set autocommit=False, not True. This fixes the problem. I have no idea what the impact would be on the general transactional correctness of the rest of the system though, so I'd need someone who knows how this layer is supposed to work to have a look at that one. I am using the SQLalchemy backend, MySQL 5.0.77, SQLAlchemy 0.6.5, Keystone latest. Note that MySQL and SQLAlchemy may be a bit old (I am using CentOS 5.5 as the base platform, with some backports) so if this turns out not to be a problem on a more modern platform then we may want to just ignore this bug. I am making create-tenant requests (POST to /v2.0/tenants) as a global admin (i.e. no tenant, admin rights). The first one succeeds as expected. The second one fails with 401 Forbidden and the log message 'User <UUID> failed check - did not have role 1'. What's happening is this: 1. the first create-tenant request is causing a new tenants row to be created in the database, with an autoincrementing integer in the tenants.id column. 2. When the second request comes in, the access check is calling keystone.backends.sqlalchemy.api.list_tenant_roles_for_user (via keystone.managers.grant.list_tenant_roles_for_user). That in turn is resolving user_roles.tenant_id (legitimately NULL) by calling api.TENANT.id_to_uid(None). 3. TENANT.id_to_uid has no shortcut for id=None. This leads to it making a database query equivalent to "SELECT uid FROM tenant WHERE id IS NULL". This is where things go wrong. This query _should_ return zero rows, because id IS NULL should always be false on an autoincremented integer. However, the query response includes the new tenants row created in step 1. In the database, id has been populated correctly, and is not NULL. SQLalchemy doesn't see that though. The session is clearly caching a row which has the new data submitted to the database, but _not_ the value in the auto-populated id column. 4. This results in TENANT.id_to_uid(None) returning the UID of the most recently created tenant. Which isn't good. 5. This state disappears immediately it seems -- subsequent calls to id_to_uuid(None) return None as expected. I can see a couple of fixes for this: 1. Make id_to_uid(None) shortcut to return None, and therefore never hit the database in that case. This is a good idea for performance reasons anyway (so queries of the form 'id IS NULL' should always return zero rows). The reason I'm not proposing that straight away is that it feels like papering over a dangerous crack. There may be other situations that I haven't found yet where the database layer returns the wrong thing. 2. Change sqlalchemy.Driver._init_session_maker to set autocommit=False, not True. This fixes the problem. I have no idea what the impact would be on the general transactional correctness of the rest of the system though, so I'd need someone who knows how this layer is supposed to work to have a look at that one.
2012-01-14 05:21:17 Ewan Mellor description I am using the SQLalchemy backend, MySQL 5.0.77, SQLAlchemy 0.6.5, Keystone latest. Note that MySQL and SQLAlchemy may be a bit old (I am using CentOS 5.5 as the base platform, with some backports) so if this turns out not to be a problem on a more modern platform then we may want to just ignore this bug. I am making create-tenant requests (POST to /v2.0/tenants) as a global admin (i.e. no tenant, admin rights). The first one succeeds as expected. The second one fails with 401 Forbidden and the log message 'User <UUID> failed check - did not have role 1'. What's happening is this: 1. the first create-tenant request is causing a new tenants row to be created in the database, with an autoincrementing integer in the tenants.id column. 2. When the second request comes in, the access check is calling keystone.backends.sqlalchemy.api.list_tenant_roles_for_user (via keystone.managers.grant.list_tenant_roles_for_user). That in turn is resolving user_roles.tenant_id (legitimately NULL) by calling api.TENANT.id_to_uid(None). 3. TENANT.id_to_uid has no shortcut for id=None. This leads to it making a database query equivalent to "SELECT uid FROM tenant WHERE id IS NULL". This is where things go wrong. This query _should_ return zero rows, because id IS NULL should always be false on an autoincremented integer. However, the query response includes the new tenants row created in step 1. In the database, id has been populated correctly, and is not NULL. SQLalchemy doesn't see that though. The session is clearly caching a row which has the new data submitted to the database, but _not_ the value in the auto-populated id column. 4. This results in TENANT.id_to_uid(None) returning the UID of the most recently created tenant. Which isn't good. 5. This state disappears immediately it seems -- subsequent calls to id_to_uuid(None) return None as expected. I can see a couple of fixes for this: 1. Make id_to_uid(None) shortcut to return None, and therefore never hit the database in that case. This is a good idea for performance reasons anyway (so queries of the form 'id IS NULL' should always return zero rows). The reason I'm not proposing that straight away is that it feels like papering over a dangerous crack. There may be other situations that I haven't found yet where the database layer returns the wrong thing. 2. Change sqlalchemy.Driver._init_session_maker to set autocommit=False, not True. This fixes the problem. I have no idea what the impact would be on the general transactional correctness of the rest of the system though, so I'd need someone who knows how this layer is supposed to work to have a look at that one. I am using the SQLalchemy backend, MySQL 5.0.77, SQLAlchemy 0.6.5, Keystone latest. Note that MySQL and SQLAlchemy may be a bit old (I am using CentOS 5.5 as the base platform, with some backports) so if this turns out not to be a problem on a more modern platform then we may want to just ignore this bug. I am making create-tenant requests (POST to /v2.0/tenants) as a global admin (i.e. no tenant, admin rights). The first one succeeds as expected. The second one fails with 401 Forbidden and the log message 'User <UUID> failed check - did not have role 1'. What's happening is this: 1. the first create-tenant request is causing a new tenants row to be created in the database, with an autoincrementing integer in the tenants.id column. 2. When the second request comes in, the access check is calling keystone.backends.sqlalchemy.api.list_tenant_roles_for_user (via keystone.managers.grant.list_tenant_roles_for_user). That in turn is resolving user_roles.tenant_id (legitimately NULL) by calling api.TENANT.id_to_uid(None). 3. TENANT.id_to_uid has no shortcut for id=None. This leads to it making a database query equivalent to "SELECT uid FROM tenant WHERE id IS NULL". This is where things go wrong. This query _should_ return zero rows, because id IS NULL should always be false on an autoincremented integer. However, the query response includes the new tenants row created in step 1. In the database, id has been populated correctly, and is not NULL. SQLalchemy doesn't see that though. The session is clearly caching a row which has the new data submitted to the database, but _not_ the value in the auto-populated id column. 4. This results in TENANT.id_to_uid(None) returning the UID of the most recently created tenant. Which isn't good. 5. This state disappears immediately it seems -- subsequent calls to id_to_uuid(None) return None as expected. I can see a couple of fixes for this: 1. Make id_to_uid(None) shortcut to return None, and therefore never hit the database in that case. This is a good idea for performance reasons anyway (queries of the form 'id IS NULL' should always return zero rows anyway). The reason I'm not proposing that straight away is that it feels like papering over a dangerous crack. There may be other situations that I haven't found yet where the database layer returns the wrong thing. 2. Change sqlalchemy.Driver._init_session_maker to set autocommit=False, not True. This fixes the problem. I have no idea what the impact would be on the general transactional correctness of the rest of the system though, so I'd need someone who knows how this layer is supposed to work to have a look at that one.
2012-01-26 17:17:18 Dolph Mathews keystone: milestone essex-4
2012-01-26 17:17:18 Dolph Mathews keystone: assignee Dolph Mathews (dolph)
2012-01-26 21:29:48 OpenStack Infra keystone: status New In Progress
2012-01-27 16:22:59 OpenStack Infra keystone: status In Progress Fix Committed
2012-02-19 18:30:12 Joseph Heck keystone: importance Undecided Medium
2012-02-29 10:12:55 Thierry Carrez keystone: status Fix Committed Fix Released
2012-04-05 08:26:49 Thierry Carrez keystone: milestone essex-4 2012.1