Timing database shouldn't use attrs for test_id uniquness

Bug #1416512 reported by Matthew Treinish on 2015-01-30
This bug affects 1 person
Affects Status Importance Assigned to Milestone

Bug Description

When storing timing data in the times.dbm with the file repository type if the testtools attrs on a test change it creates a new entry in the timing database. This is not desirable behavior since the attr is just metadata about the test we don't want to assume when the metadata changes it's a new test. Just like we don't create a new entry in the timing database if the tags in the subunit stream change between runs.

For example from 2 runs in the timing database I have 2 entries for each test 1 from before I added attrs and 1 from after:

tempest.scenario.test_shelve_instance.TestShelveInstance.test_shelve_instance[compute,image,network]: 62.51323
tempest.scenario.test_shelve_instance.TestShelveInstance.test_shelve_instance: 70.8244

You can see the logs here: http://logs.openstack.org/19/149719/4/check/check-tempest-dsvm-full/2ea9905/logs/testrepository/

I propose that we strip the attrs from the test_id before we store timing data in the database. This way we'll be using a consistent test_ids between runs.

Robert Collins (lifeless) wrote :

So the implementation issue here is that attrs affect the test_id, and any assumptions we make here will then have to be propogated out to all test backends everywhere, in all languages. As such I'm pretty wary about that.

Robert Collins (lifeless) wrote :

relatedly, the ability to run a test depends on the ability to pass its test id to the backend, so if we mangle it in testr it would make backends have to use heuristics, which is likely going to lead to the wrong test being run. OTOH the timing database is itself just a scheduling hint, so its not a big issue if its wrong. But that said, since its not a big issue if its wrong, why do you care?

Matthew Treinish (treinish) wrote :

Yeah I figured the different language runners and/or backends would be the issue here. I've never run this with anything besides python so I tend to have a biased view. Maybe if we the attr regex configurable instead? (I just realized I probably should add this to subunit2sql, sigh...)

I only care because I'm going to some effort to enable doing a 'testr load' with a subunit stream I generate with sql2subunit --average (which doesn't contain any attrs because it's from an aggregate view) in the openstack ci system. Since the test_id's in that stream don't have any attrs there isn't much point in loading the stream since no scheduler hints get used. (which was the whole point of the exercise)

I have other ways to workaround this (like taking a number of samples from the subunit2sql db and loading those streams instead of using the aggregate view) if this is a no-go, but it seemed pretty simple to address this in testrepository. (The long term fix is to have a sql repository type so this just works, but in the meantime)

Robert Collins (lifeless) wrote :

Ok, so the tie-together is this:

- testr is a runner runner
- it's interface to run tests is limited to test ids
- its interface to /filter/ tests is limited to data included in test enumeration

So to fix this:
A - get subunit.run to include the attrs in enumeration in some fashion
B - teach testr to filter based on attrs as well as ids
C - stop mangling the test ids in subunit.run

-> the change here becomes just B

Changed in testrepository:
status: New → Triaged
importance: Undecided → Wishlist
To post a comment you must log in.
This report contains Public information  Edit
Everyone can see this information.

Other bug subscribers