Comment 2 for bug 1408259

James Polley (tchaypo) wrote :

Sorry that it's taken me so long to respond to this. Most of the delay has been caused by me wanting to take the time to write a careful response: I'm very impressed by the elastic-recheck system, and I'm very grateful to the people who implemented it, run it, and have made it so easy to add rules. I'm aware that anything I'm about to be unhappy about is, at worst, a minor docbug on a system that is, on the whole, very successful. I'm worried that I'm going to sound snarky, so I've delayed responding until I had time to write very carefully and try to make sure that I convey the right meaning with my words...

but some of the delay is because I had a wonderful response half-written and then I hit command-q instead of command-w and lost it all. Yay me!

So this is the rushed re-typing of what I can remember of my first response. I'm sorry if it comes across as me being grumpy or unhappy or upset - that's not how I feel, but I have a sense of humour that often interferes with my attempts to talk calmly about things.

"Normally an elastic-recheck query for a fixed bug" - it wasn't fixed at the time I filed the review that would have added the rule. I think I'd identified the problem and had a potential fix up for review - but it wasn't fixed. I got the impression that the elastic-recheck rule would be processed with some priority, so I thought it likely that the rule would land before the fix did - but this turned out to be false.

The existing docs suggest to me that this process should be followed for all bugs with no exceptions, but Matt's response here suggests that in reality we only expect queries to be added for long-running issues with no immediately obvious fixes. Perhaps it's worth adding a note somewhere (perhaps http://docs.openstack.org/infra/manual/developers.html#automated-testing) explaining that it's only worth adding a query for bugs that have no immediately obvious fixes?

"Does check-tripleo-ironic-overcloud-precise-nonha run in the gate queue" - I don't believe it does; certainly it doesn't on the tripleo-incubator project. I'm not sure why that's relevant though: http://docs.openstack.org/infra/manual/developers.html#automated-testing says "If a change fails tests in Jenkins, please follow the steps below:". If it's only gate checks that matter, perhaps that should say "If a change fails gate tests in Jenkins, please follow the steps below:"?

"If the bug is fixed and isn't in the gate queue, it's not on the uncategorized bugs list so there isn't a huge reason to add an e-r query for it." There is a reason for me. We had many changes across many projects that had failed this check (not just TripleO projects: https://review.openstack.org/#/c/141043/ was a change on Heat, for instance, but it ran the TripleO checks). My understanding was that landing an elastic-recheck query would cause an automatic recheck for all of those changes, which would be simpler (for me) than finding them all and manually requesting a recheck.

But in hindsight I'm not sure why I thought that - it seems more logical that the new elastic-recheck rule would be only applied on new changes as they failed, rather than retroactively being applied to old changes.

Even then, if I've understood the above properly, elastic-recheck only actually rechecks if a gate test fails; and since this test is never used in a gate, elastic-recheck would never kick in anyway.

I'd be happy to propose some updates to the docs - I don't know if I'll get them right, but we have review to get feedback and improve them before they land, so I'm happy to take a stab :) I was going to say that I couldn't figure out where the docs were located, but then I had the bright idea of scrolling down and I found the link at the bottom. I'll go ahead and throw up some suggestions on changes to http://docs.openstack.org/infra/manual/developers.html#automated-testing and http://docs.openstack.org/infra/elastic-recheck/readme.html