Metadata keys are not case sensitive

Bug #1538011 reported by hgangwx
12
This bug affects 2 people
Affects Status Importance Assigned to Milestone
OpenStack Compute (nova)
In Progress
Low
Rajesh Tailor

Bug Description

[Summary]
Unexpected API Error returned when set metadata for aggregate

[Topo]
devstack all-in-one node

[Description and expect result]
no Unexpected API Error returned when set metadata for aggregate

[Reproduceable or not]
reproduceable

[Recreate Steps]

On devstack:

> nova aggregate-create agg1
> nova aggregate-set-metadata agg1 abc=1
> nova aggregate-set-metadata agg1 ABC=2

1) create an aggregate, with metadata "abc":
root@45-59:~# nova aggregate-details agg1
+----+------+-------------------+-------+----------+
| Id | Name | Availability Zone | Hosts | Metadata |
+----+------+-------------------+-------+----------+
| 4 | agg1 | - | | 'abc=1' |
+----+------+-------------------+-------+----------+
root@45-59:~#

2)set metadata of the aggregate as "ABC", upper case of the
existing metadata "abc", an Unexpected API Error returned:
root@45-59:~# nova aggregate-set-metadata agg1 ABC=2
ERROR (ClientException): Unexpected API Error. Please report
 this at http://bugs.launchpad.net/nova/ and attach the Nova API
log if possible.<type 'exceptions.KeyError'> (HTTP 500) (Request-ID:
req-b45dddb2-24e6-4b8f-8901-2ed250ec787a)
root@45-59:~#

[Configration]
reproduceable bug, no need

[logs]
reproduceable bug, no need

[Root cause anlyze or debug inf]
reproduceable bug

[Attachment]
None

Tags: api compute db
Revision history for this message
Augustina Ragwitz (auggy) wrote :

I was able to reproduce this behavior following the steps above and also by modifying data in the functional tests. This only happens when you have two keys of the same name where one is lowercase and one is uppercase. I looked at the aggregate code and didn't see any obvious normalization or modification of keys, so I suspect this may be something in the db layer (possibly oslo?) that is normalizing the key values. That being said, this seems like a pretty extreme edge case so unless someone can share a situation where this is a huge deal, this will most likely be a low priority bug. At minimum, we might need a documentation patch to indicate that metadata keys must be case-insensitively uniquely named.

Changed in nova:
status: New → Confirmed
Revision history for this message
Augustina Ragwitz (auggy) wrote : Re: Aggregate metadata keys are not case sensitive

When setting a key that is uniquely named regardless of the case, the update works correctly. The error only happens when creating a metadata key in one case (eg all lower) and then adding a key of the same name in another case (eg all upper, or a mix of upper and lower)

tags: added: compute
summary: - Unexpected API Error returned when set metadata for aggregate
+ Aggregate metadata keys are not case sensitive
Revision history for this message
Augustina Ragwitz (auggy) wrote :

We should also have test coverage for this scenario in the functional tests if we do decide to fix it.

Revision history for this message
Markus Zoeller (markus_z) (mzoeller) wrote :

very similar to this one is bug 1535224

tags: added: db
Sean Dague (sdague)
description: updated
Revision history for this message
Augustina Ragwitz (auggy) wrote :
Download full text (3.2 KiB)

2016-01-28 02:05:58.388 ERROR nova.api.openstack.extensions [req-c14d351d-5d60-4986-ab95-22600ecb5287 admin admin] Unexpected exception in API method
2016-01-28 02:05:58.388 TRACE nova.api.openstack.extensions Traceback (most recent call last):
2016-01-28 02:05:58.388 TRACE nova.api.openstack.extensions File "/opt/stack/nova/nova/api/openstack/extensions.py", line 478, in wrapped
2016-01-28 02:05:58.388 TRACE nova.api.openstack.extensions return f(*args, **kwargs)
2016-01-28 02:05:58.388 TRACE nova.api.openstack.extensions File "/opt/stack/nova/nova/api/validation/__init__.py", line 73, in wrapper
2016-01-28 02:05:58.388 TRACE nova.api.openstack.extensions return func(*args, **kwargs)
2016-01-28 02:05:58.388 TRACE nova.api.openstack.extensions File "/opt/stack/nova/nova/api/openstack/compute/aggregates.py", line 192, in _set_metadata
2016-01-28 02:05:58.388 TRACE nova.api.openstack.extensions id, metadata)
2016-01-28 02:05:58.388 TRACE nova.api.openstack.extensions File "/opt/stack/nova/nova/exception.py", line 110, in wrapped
2016-01-28 02:05:58.388 TRACE nova.api.openstack.extensions payload)
2016-01-28 02:05:58.388 TRACE nova.api.openstack.extensions File "/usr/local/lib/python2.7/dist-packages/oslo_utils/excutils.py", line 204, in __exit__
2016-01-28 02:05:58.388 TRACE nova.api.openstack.extensions six.reraise(self.type_, self.value, self.tb)
2016-01-28 02:05:58.388 TRACE nova.api.openstack.extensions File "/opt/stack/nova/nova/exception.py", line 89, in wrapped
2016-01-28 02:05:58.388 TRACE nova.api.openstack.extensions return f(self, context, *args, **kw)
2016-01-28 02:05:58.388 TRACE nova.api.openstack.extensions File "/opt/stack/nova/nova/compute/api.py", line 3547, in update_aggregate_metadata
2016-01-28 02:05:58.388 TRACE nova.api.openstack.extensions aggregate.update_metadata(metadata)
2016-01-28 02:05:58.388 TRACE nova.api.openstack.extensions File "/usr/local/lib/python2.7/dist-packages/oslo_versionedobjects/base.py", line 221, in wrapper
2016-01-28 02:05:58.388 TRACE nova.api.openstack.extensions return fn(self, *args, **kwargs)
2016-01-28 02:05:58.388 TRACE nova.api.openstack.extensions File "/opt/stack/nova/nova/objects/aggregate.py", line 123, in update_metadata
2016-01-28 02:05:58.388 TRACE nova.api.openstack.extensions db.aggregate_metadata_add(self._context, self.id, to_add)
2016-01-28 02:05:58.388 TRACE nova.api.openstack.extensions File "/opt/stack/nova/nova/db/api.py", line 1784, in aggregate_metadata_add
2016-01-28 02:05:58.388 TRACE nova.api.openstack.extensions IMPL.aggregate_metadata_add(context, aggregate_id, metadata, set_delete)
2016-01-28 02:05:58.388 TRACE nova.api.openstack.extensions File "/opt/stack/nova/nova/db/sqlalchemy/api.py", line 233, in wrapper
2016-01-28 02:05:58.388 TRACE nova.api.openstack.extensions return f(context, aggregate_id, *args, **kwargs)
2016-01-28 02:05:58.388 TRACE nova.api.openstack.extensions File "/opt/stack/nova/nova/db/sqlalchemy/api.py", line 5677, in aggregate_metadata_add
2016-01-28 02:05:58.388 TRACE nova.api.openstack.extensions meta_ref.update({"value": metadata[key]})
2016-01-28 02:05:58.388 TRACE nova...

Read more...

Revision history for this message
Sean Dague (sdague) wrote :

The root cause of this is https://github.com/openstack/nova/blob/a37590c21bca175f642792b6707d2eb4e6349e51/nova/db/sqlalchemy/api.py#L5674

The issue is that _in is case insensitive for mysql, which means that in:

                    query = query.filter(
                        models.AggregateMetadata.key.in_(all_keys))
                    for meta_ref in query.all():
                        key = meta_ref.key
                        meta_ref.update({"value": metadata[key]})
                        already_existing_keys.add(key)

The all_keys is 'ABC' , but matches the existing 'abc'.

We probably just need an 'if key in metadata:' before the meta_ref.update

Changed in nova:
status: Confirmed → Triaged
importance: Undecided → High
Revision history for this message
Sean Dague (sdague) wrote :

Marked at high priority as this is a server 500 error, which we should never be doing.

Changed in nova:
assignee: nobody → Augustina Ragwitz (auggy)
Revision history for this message
Augustina Ragwitz (auggy) wrote :

In addition to fixing the issue in code, we need to add functional test coverage. The coverage should make sure the gate uses MySQL as the backend via the MySQLOpportunisticTestCase class. Adding this coverage will be an important template as it's likely similar issues with the "IN" being case-insensitive in MySQL are scattered across our codebase.

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Related fix proposed to nova (master)

Related fix proposed to branch: master
Review: https://review.openstack.org/283364

Revision history for this message
OpenStack Infra (hudson-openstack) wrote :

Related fix proposed to branch: master
Review: https://review.openstack.org/283868

Changed in nova:
status: Triaged → In Progress
Revision history for this message
Sean Dague (sdague) wrote : Re: Aggregate metadata keys are not case sensitive

The plot thickens....

There is a unique constraint on agg_id, key, deleted. For the intents and purposes of mysql:

1-abc-0 == 1-ABC-0

Which means that you can't have both keys.

Sean Dague (sdague)
tags: added: api
Revision history for this message
Augustina Ragwitz (auggy) wrote :

There is currently discussion on the mailing list about how to handle case sensitive payloads in the api. Once we know our direction in that regard, we can figure out next steps for fixing this bug.

Revision history for this message
Augustina Ragwitz (auggy) wrote :

Updated description and marked bug#1535224 as a duplicate. The issue here is that MySQL is not case sensitive when managing any unique text fields.

summary: - Aggregate metadata keys are not case sensitive
+ Metadata keys are not case sensitive
Revision history for this message
Augustina Ragwitz (auggy) wrote :

Per discussion, current fix strategy is to casefold in sqlalchemy/api.py and force keys to all lower case.

Revision history for this message
melanie witt (melwitt) wrote :

I don't really like the idea of casefolding data set by the user and can be read by the user. Do we really want the precedent of changing data provided by the user because MySQL? I understand this is the only way given the unique constraint on the key. Is removing the unique constraint an option?

If there's no a unique constraint, then we would have to add checking at the sqlalchemy/api.py layer that finds the exact match among multiple results, for example, FOO foo FoO as mentioned in IRC to ensure we read/update/delete the right record. And we can raise on our own in the case of duplicates.

I can understand the argument that this isn't a Big Deal for aggregate metadata being that the consumer of the data is Nova. But if we broaden the issue to say, user metadata, are we still going to want to change all the keys to lowercase? If someone has a program that will do something based on the value of a metadata key, their code may not find the same key they had set earlier. Example: user program sets MyMetadataKey=somevalue and then they try to access "MyMetadataKey" in a queried instance and get a KeyError because it's now "mymetadatakey".

Revision history for this message
Augustina Ragwitz (auggy) wrote :

That's a really good point about the end user impact of this change for user facing metadata. I'm inclined to agree with melwitt on this, for user facing data, if a user has specified a key to be a certain case, it should stay that case.

For aggregate metadata and this bug specifically, it makes sense to research why the unique constraint even exists at all on that key field. Once we know that, we can decide the best course to move forward.

In the meantime, I have a fix for the metadata key issue that illustrates the lowercase path, but I'm not comfortable with it until we've come to a decision about how we want to deal with the metadata case sensitivity issue for all of our metadata if anything for the sake of consistency.

Revision history for this message
Augustina Ragwitz (auggy) wrote :

I'll add comments to the change I have posted for this bug regarding this discussion and will research if we actually need the unique constraint on the aggregate metadata key field.

Revision history for this message
melanie witt (melwitt) wrote :

It's worth noting that the aggregate metadata is user-facing in a sense that there is an API for showing aggregate details, including the aggregate metadata [1]. The primary consumer of the metadata is Nova itself, but it's entirely possible for an end user to write code to create/update/read aggregate metadata. And I get wary of the idea of put data != get data for the end user.

[1] http://developer.openstack.org/api-ref-compute-v2.1.html#showAggregate

Revision history for this message
Augustina Ragwitz (auggy) wrote :

Thanks for the clarification, melanie. Yes, aggregate metadata is also potentially user facing in that we have API support making it so. I also agree that I'm not fan of modifying input.

Because this only affects new key creation in the case of aggregate metadata, if we can't do the database change, we could just throw a validation error for new keys that aren't all lowercase. Sean and I had originally discussed modifying the JSON validation to that effect. We would still need additional error checking to handle the duplicate case, but at least that would set the user expectation.

For the larger metadata case, I'm of the opinion that users should be able to set the keys to whatever capitalization strategy they want. If we can add a constraint to prevent duplicate key names in some way, we could maintain an internal mapping that preserves the original case. It seems like over-engineering and added complexity, but that's my thought on it. Of course the ideal solution is to just fix the columns in the database.

Revision history for this message
melanie witt (melwitt) wrote :

After discussing on IRC and thinking about it, I understand the situation better in that the only way we could allow case sensitive metadata keys (that is, let users set whatever they want) is if we rewrite our existing metadata key queries to do a preliminary read-only query first (no more SELECT FOR UPDATE, IN, etc), filter in Python, and then act upon an exactly matching record. It also requires that we not have a unique constraint on the column. We would be constrained in the way we could query metadata keys and set column constraints and it's complex and error-prone, for not a lot of gain.

My primary concern was the idea of casefolding the keys. With the suggestion of raising an error if an API caller attempts to set a non-lowercase key, my concern is addressed.

As you mentioned on the review, I think we can add the validation and leave the backend as-is. That way previously set CamelCase keys continue to work and we avoid future problems.

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Change abandoned on nova (master)

Change abandoned by Augustina Ragwitz (<email address hidden>) on branch: master
Review: https://review.openstack.org/283868
Reason: Spec is up that addresses this issue, I'll work on that instead

Revision history for this message
Augustina Ragwitz (auggy) wrote :
Changed in nova:
importance: High → Medium
Changed in nova:
assignee: Augustina Ragwitz (auggy) → nobody
Revision history for this message
OpenStack Infra (hudson-openstack) wrote :

Change abandoned by Sean Dague (<email address hidden>) on branch: master
Review: https://review.openstack.org/283364
Reason: This review is > 6 weeks without comment, and failed Jenkins the last time it was checked. We are abandoning this for now. Feel free to reactivate the review by pressing the restore button and leaving a 'recheck' comment to get fresh test results.

Changed in nova:
status: In Progress → Confirmed
Revision history for this message
OpenStack Infra (hudson-openstack) wrote :

Change abandoned by Ken'ichi Ohmichi (<email address hidden>) on branch: master
Review: https://review.openstack.org/380675

Sean Dague (sdague)
Changed in nova:
assignee: nobody → Augustina Ragwitz (auggy)
status: Confirmed → In Progress
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Related fix proposed to nova (master)

Related fix proposed to branch: master
Review: https://review.openstack.org/484957

Matt Riedemann (mriedem)
Changed in nova:
assignee: Augustina Ragwitz (auggy) → nobody
status: In Progress → Confirmed
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Related fix merged to nova (master)

Reviewed: https://review.openstack.org/484957
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=22f4aedb857073cf29923c78d5185df4e2e03539
Submitter: Jenkins
Branch: master

commit 22f4aedb857073cf29923c78d5185df4e2e03539
Author: Matt Riedemann <email address hidden>
Date: Tue Jul 18 17:47:22 2017 -0400

    Do not mention that tags are case sensitive in docs

    Because MySQL is case insensitive by default, and this
    is something that depends on the database backend in the
    cloud, let's not mention that tags are case sensitive in
    the API.

    Change-Id: I6efa9d6a5c598ac7a5c898d63b6a4b1934560b80
    Related-Bug: #1538011

Revision history for this message
Sean Dague (sdague) wrote :

Found open reviews for this bug in gerrit, setting to In Progress.

review: https://review.openstack.org/283364 in branch: master

Changed in nova:
status: Confirmed → In Progress
assignee: nobody → Augustina Ragwitz (auggy)
Revision history for this message
Alex Xu (xuhj) wrote :
Rajesh Tailor (ratailor)
Changed in nova:
assignee: Augustina Ragwitz (auggy) → Rajesh Tailor (ratailor)
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to nova (master)

Fix proposed to branch: master
Review: https://review.openstack.org/504885

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Change abandoned on nova (master)

Change abandoned by Sean Dague (<email address hidden>) on branch: master
Review: https://review.openstack.org/283364

Rajesh Tailor (ratailor)
Changed in nova:
assignee: Rajesh Tailor (ratailor) → nobody
Changed in nova:
assignee: nobody → Nalini Varshney (varshneyg)
Revision history for this message
Nalini Varshney (varshneyg) wrote :
melanie witt (melwitt)
Changed in nova:
importance: Medium → Low
Revision history for this message
melanie witt (melwitt) wrote :

I've lowered the importance level on this bug to Low as I don't think it's a critical problem.

I also wanted to add a note on here since new patch proposals have occurred recently.

I think the proposals to change the database collation to a case-sensitive type are not a 100% great approach because I don't think that end users actually *want* to have "abc" and "ABC" be considered two distinct keys. I think that would lead to confusion and harder-to-debug scenarios where nova isn't behaving as expected because "abc" does not match "ABC".

Instead, it might be better to take a different approach and lower() metadata keys only during comparisons in the API, that way metadata would be considered case-insensitive without actually lowercasing any data in the database. That would mean for example, that an existing key "abc" would receive an update if the user passed "ABC=2". When this happens, we would need to update the key to "ABC" in the database so that when the user does a GET aggregate, they will see the key they set before as "ABC" (imagine they are a script parsing out what they PUT earlier).

This way we would not be modifying any user-provided data and also not contributing to potential confusion around "abc" and "ABC" being distinct.

Revision history for this message
OpenStack Infra (hudson-openstack) wrote :

Change abandoned by "sean mooney <email address hidden>" on branch: master
Review: https://review.opendev.org/c/openstack/nova/+/504885
Reason: based on the age of this and https://bugs.launchpad.net/nova/+bug/1538011/comments/32 im abandoing this patch to clean up open gerrit reivews.

im open to fixing hte bug following melanies suggestion of lowercaseing the keys in our code.

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to nova (master)

Fix proposed to branch: master
Review: https://review.opendev.org/c/openstack/nova/+/873901

Rajesh Tailor (ratailor)
Changed in nova:
assignee: Nalini Varshney (varshneyg) → Rajesh Tailor (ratailor)
To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.