Improve gerrit updating of bug status when fix spans multiple commits

Bug #1018013 reported by Daniel Berrange
10
This bug affects 1 person
Affects Status Importance Assigned to Milestone
OpenStack Core Infrastructure
Fix Released
Medium
Anthony Dodd

Bug Description

The current script that integrates GIT commits merged from Gerrit, with the bug tracker has sub-optimal behaviour when dealing with fixes that span across multiple commits. When preparing a patch series it is natural to note the bug number in all of the patches in the series which are directly relevant. When updating the bug status to 'Fix committed', however, the Gerrit hooks only consider each patch in isolation. So if 1 single patch out of a series is merged, but the others are not, the bug will get prematurely marked as 'Fix committed'.

The only current way to deal with this is to refrain from mentioning the bug number in all but the very last commit in the patch sequence. This is undesirable though, since the bug number is useful data for the GIT history, and we do want all patches related to the bug to be tracked as a whole.

The current Gerrit bug hook:

https://github.com/openstack/openstack-ci-puppet/blob/master/modules/gerrit/files/scripts/update_bug.py

matches on a regex

   bug_regexp = r'([Bb]ug|[Ll][Pp])[\s#:]*(\d+)'

So anything like

    Bug #NNNN

will cause the bug to be closed once commited.

I would suggest expanding the syntax to allow an (optional) qualifier to be prefixed eg

   Related bug #NNNN
   Resolves bug #NNNN

Any bug qualifier would cause the commit to be associated with the bug, but only the 'resolves' qualifier would cause the bug to be marked as 'Fix commited'. For compat with existing practice & easing of the "simple" non-patch series case, the lack of any qualifier would be treated as meaning 'Resolves'. So only people creating patch series, would need to use 'Resolves' vs 'Related' distinction

A completely different idea, would be to try and identify complete patch series in gerrit, and automatically "do the right thing"
 wit the bug tracker only marking it as resolved once the entire series is committed. I'm not sure how practical that is though, given Gerrits' limited understanding of patch series. eg could it detect this example:

  https://review.openstack.org/#/q/status:open+project:openstack/nova+branch:master+topic:bug/1003373,n,z

where all 5 patches were pushed to gerrit at once ?

Revision history for this message
Monty Taylor (mordred) wrote :

I think adding a related specifier seems reasonable. I don't think we'll be able to get to automagic understanding of related patchsets right away though.

Changed in openstack-ci:
importance: Undecided → Medium
status: New → Triaged
milestone: none → folsom
Revision history for this message
Daniel Berrange (berrange) wrote :

That sounds fine to me

Thierry Carrez (ttx)
tags: added: update-bug
Monty Taylor (mordred)
Changed in openstack-ci:
milestone: folsom → grizzly
assignee: nobody → Thierry Carrez (ttx)
Thierry Carrez (ttx)
tags: added: updatebug
removed: update-bug
Revision history for this message
Thierry Carrez (ttx) wrote :

We discussed at the Summit a whole new topology that would also allow us to drive finer actions on LP from commit messages. See https://etherpad.openstack.org/drive-automation-from-commitmsg for details

Changed in openstack-ci:
assignee: Thierry Carrez (ttx) → nobody
Clark Boylan (cboylan)
Changed in openstack-ci:
milestone: grizzly → havana
tags: added: low-hanging-fruit
Revision history for this message
Anas Alkhatib (anas-alkhatib) wrote :

Hi,

I think this bug description needs a little update, I cloned the openstack/config repository, but could not find the hooks mentioned above.

I'd be interested to follow up on this bug, if it is still applicable.

Thanks,
Anas

Revision history for this message
Jeremy Stanley (fungi) wrote :

In the timespan of this bug, the script has moved to openstack-infra/jeepyb:jeepyb/cmd/update_bug.py ( https://github.com/openstack-infra/jeepyb/blob/master/jeepyb/cmd/update_bug.py ).

Anthony Dodd (thedodd)
Changed in openstack-ci:
assignee: nobody → Anthony Dodd (thedodd)
status: Triaged → In Progress
Anthony Dodd (thedodd)
description: updated
Anthony Dodd (thedodd)
description: updated
Revision history for this message
Anthony Dodd (thedodd) wrote :

For anyone who might happen to know this software better than I, perhaps you could take a look at the following lines of code for me: https://github.com/openstack-infra/jeepyb/blob/master/jeepyb/cmd/update_bug.py#L229.

The 'process_bugtask' function bases its top-most logic off of the value of the 'hook' command-line parameter, which is obviously parsed by argparse in main. My question is, where are the command-line arguments coming from? Obviously they are parsed from sys.argv, which argparse defaults to, but where are the arguments originating from and how could one determine what the typical arguments are going to be?

Answering these few questions will help me to create the appropriate logic for the "related" and "resolves" functionality which is mentioned in the original bug description.

Revision history for this message
James E. Blair (corvus) wrote :

The arguments are described at:

  https://review.openstack.org/Documentation/config-hooks.html

You can see exactly how the hooks are implemented on our systems here:

  https://github.com/openstack-infra/config/blob/master/modules/openstack_project/templates/gerrit_patchset-created.erb

  https://github.com/openstack-infra/config/blob/master/modules/openstack_project/files/gerrit/change-merged

Those files are shell scripts which are installed as the actual hooks; they then call the python scripts in jeepyb.

From my own notes when experimenting with Gerrit, here are the arguments sent to a patchset-created hook:

  --change Ia1fea1eab3976f1a9cb89ceb3ce1c6c6a7e79c42 --change-url https://review-dev.openstack.org/81 --project gtest-org/test --branch master --uploader James E. Blair (<email address hidden>) --commit 05508ae633852469d2fd7786a3d6f1d06f87055b --patchset 1

Revision history for this message
Anthony Dodd (thedodd) wrote :

My plans for this are as follows:
In https://github.com/openstack-infra/jeepyb/blob/master/jeepyb/cmd/update_bug.py in the find_bugs() function, I am going to parse any prefixe associated with a specified bug, using regular expressions, and associate that lp_task with the specified prefix. If no prefix is specified, it will default to 'resolves' as mentioned above in the original bug description. find_bugs() will then return the values of bugtasks dict, as normal, but each bug task will be a key for its associated prefix. Then, in process_bugtask(), we will make the appropriate changes and updates based on the 'hook' parameter and the bugtask's prefix.

I have been using this document <https://etherpad.openstack.org/drive-automation-from-commitmsg> as a reference in order to make any changes made here compatible with future expansion.

There will be a decent bit to flesh out here, so any input and criticism would be nice. I will submit drafts as I proceed so that we can work towards our main goal more quickly.

Revision history for this message
Anthony Dodd (thedodd) wrote :

I've realized that this bug is quite outdated. In the original decription—which is more than a year old—it is stated, "So if 1 single patch out of a series is merged, but the others are not, the bug will get prematurely marked as 'Fix committed'." The reason that I've come to the opinion that there is actually nothing to fix here, is because gerrit will warn the developer that they need to "squash" their commits into one commit using the `--amend` option. As a matter of fact, it is stated quite clearly in our Wiki (https://wiki.openstack.org/wiki/Gerrit_Workflow), under the 'Updating a Change' heading, that one is to use the `--amend` option if multiple changes need to be made.

I would say that this bug no longer describes an extant issue. Thoughts?

Revision history for this message
Clark Boylan (cboylan) wrote :

I think this is still an issue for when there are logically several components that need fixing before the greater bug can be marked as fixed. You don't want to squash a bunch of logically distinct changes together simply to make launchpad happy. Instead you want to mark all of them as work necessary to fix a bug and have the final commit mark that it closes the bug.

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to jeepyb (master)

Fix proposed to branch: master
Review: https://review.openstack.org/36886

Revision history for this message
Anthony Dodd (thedodd) wrote :

Jeremy Stanley and I have been discussing the idea of adding a testing framework for jeepyb. I think it's a great idea, especially for update_bug.py and update_blueprint.py. I would like to take the initiative on this, however, I'm unsure as to how to proceed. Should we create some blueprints to describe what our plans are? Perhaps a bug report is sufficient (though it doesn't seem appropriate)? How should we do this?

Revision history for this message
Jeremy Stanley (fungi) wrote :

We use bug reports for non-bug task tracking too, so it's appropriate enough if you wanted to open one (or preferably more... the more focused in scope you can make the bugs the better, so maybe one for each script you're testing). We haven't really made use of blueprints for infra-specific tool improvements as far as I know, but probably wouldn't be opposed if that's easier for you to work with.

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to jeepyb (master)

Reviewed: https://review.openstack.org/36886
Committed: http://github.com/openstack-infra/jeepyb/commit/91c791a580efe76deab0f8d8a54ca8f73a2bb7ba
Submitter: Jenkins
Branch: master

commit 91c791a580efe76deab0f8d8a54ca8f73a2bb7ba
Author: Anthony Dodd <email address hidden>
Date: Fri Jul 12 12:47:40 2013 -0500

    Improving Gerrit + commit-log automation.

    I've reworked a few items here, so I will describe them in as much a
    linear fashion as possible. Keep in mind that one of the primary goals
    for these changes is to allow us to "trigger more magic from Gerrit".
    And it is to that end that I've implemented these changes.

    In find_bugs(), I've tightened-up the regular expression being used so
    that it will parse-out any prefixes associated with the bug reference.
    I've tested the regular expression against the most common bug
    references that I've seen in commit logs, as well as against the
    styles described in our documentation. The sources that I've drawn
    from are:

        https://etherpad.openstack.org/drive-automation-from-commitmsg
        https://wiki.openstack.org/wiki/GitCommitMessages
        https://wiki.openstack.org/wiki/Gerrit_Workflow

    Moreover, I'm using re.finditer() which allows for more direct
    access to the text that was matched. Lastly, I've tried to keep the
    expression as flexible as possible so that it will match even if the
    developer references the bug in a funky way.

    In order to keep the prefix and lp_task associated with each other,
    I've created a class called "Task" which is simply an interface to
    determine what sort of automation needs to take place for the given
    bugtask. This being the case, I've taken the liberty of renaming a few
    variables to make this more clear.

    In "Task", a basic level of processing is performed on the prefix to
    determine what changes need to be made on launchpad. A method called
    needs_change() returns a boolean indicating if the supplied argument
    is a change which needs to be made.

    Lastly, yet most importantly for this bug fix, process_bugtsk() is
    utilizing needs_change(), as mentioned above, to ensure that the
    bugtask's status is not erroneously changed in the case of a bug fix
    which spans multiple commits.

    Closes-Bug: 1018013
    Change-Id: Ibd84d3c6edcf104afe3211fb55ea531efa92d20e

Changed in openstack-ci:
status: In Progress → Fix Committed
Khai Do (zaro0508)
Changed in openstack-ci:
status: Fix Committed → Fix Released
To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.