"Duplicate" system is conceptually erroneous

Bug #52613 reported by Ian Jackson
Affects           Status     Importance  Assigned to  Milestone
Launchpad itself  Won't Fix  Undecided   Unassigned

Bug Description

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

 affects /products/malone

The current way that LP handles the situation where the same bug was
reported multiple times (ie, disposing of the bug with a special state
`Duplicate [of the other bug]') is conceptually wrong.

A more correct data model for this situation is that provided by the
Debian BTS. Quoting http://www.debian.org/Bugs/server-control:

 merge bugnumber bugnumber ...
     Merges two or more bug reports. When reports are merged opening,
     closing, marking or unmarking as forwarded and reassigning any of
     the bugs to a new package will have an identical effect on all of
     the merged reports.

     Before bugs can be merged they must be in exactly the same state:
     either all open or all closed, with the same forwarded-to upstream
     author address or all not marked as forwarded, all assigned to the
     same package or package(s) (an exact string comparison is done on
     the package to which the bug is assigned), and all of the same
     severity. If they don't start out in the same state you should use
     reassign, reopen and so forth to make sure that they are before
     using merge. Titles are not required to match, and will not be
     affected by the merge. Tags are not required to match, either,
     they will be joined.

     If any of the bugs listed in a merge command is already merged
     with another bug then all the reports merged with any of the ones
     listed will all be merged together. Merger is like equality: it is
     reflexive, transitive and symmetric.

     Merging reports causes a note to appear on each report's logs; on
     the WWW pages this is includes links to the other bugs.

     Merged reports are all expired simultaneously, and only when all
     of the reports each separately meet the criteria for expiry.

 forcemerge bugnumber bugnumber ...
     Forcibly merges two or more bug reports. The first bug listed is
     the master bug, and its settings (the settings which must be equal
     in a normal merge) are assigned to the bugs listed next. To avoid
     typos erroneously merging bugs, bugs must be in the same package.
     See the text above for a description of what merging means.

     Note that this makes it possible to close bugs by merging; you are
     responsible for notifying submitters with an appropriate close
     message if you do this.

 unmerge bugnumber
     Disconnects a bug report from any other reports with which it may
     have been merged. If the report listed is merged with several
     others then they are all left merged with each other; only their
     associations with the bug explicitly named are removed.

     If many bug reports are merged and you wish to split them into two
     separate groups of merged reports you must unmerge each report in
     one of the new groups separately and then merge them into the
     required new group.

     You can only unmerge one report with each unmerge command; if you
     want to disconnect more than one bug simply include several
     unmerge commands in your message.

Ian.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.1 (GNU/Linux)

iD8DBQFEs3h28jyP9GfyNQARArO9AJ9+dv8NWjIzoBkO/BxAC1XEKkKVbwCfUSNT
ukm97gsyRXS+pZMbKdTu+vc=
=nE2G
-----END PGP SIGNATURE-----

Tags: lp-bugs
Matthew Paul Thomas (mpt) wrote :

The description of forcemerge is quite similar to the behavior I've specified in the DuplicateBugHandling spec (except for the "you are responsible for notifying submitters" part).

Changed in malone:
status: Unconfirmed → Confirmed
Ian Jackson (ijackson) wrote : Re: [Bug 52613] Re: "Duplicate" system is conceptually erroneous

Matthew Paul Thomas writes ("[Bug 52613] Re: "Duplicate" system is conceptually erroneous"):
> The description of forcemerge is quite similar to the behavior I've
> specified in the DuplicateBugHandling spec (except for the "you are
> responsible for notifying submitters" part).

There is a very important difference. `forcemerge' has an identical
effect to a series of reassign, reopen, close, etc., followed by
merge. Ie, there is _no master bug_.

After the merge is complete, all of the merged bugs are equal in
status. There is no privileged bug. All of the bugs are merged with
each other and none of them is a duplicate of any other one in
particular.

I'm sorry if I seem to be repeating myself but this, the most
important difference, seems not to have come across.

_There is no privileged `master' bug_.

Ian.

Matthew Paul Thomas (mpt) wrote :

And there being no master bug report is good because someone can resolve a bug report by e-mail and it will Do The Right Thing, even if in the meantime someone else discovered that the bug had previously been reported and marked it as such. Is that right, and are there any other reasons?

Please also explain the "Before bugs can be merged they must be in exactly the same state, either all open or all closed" part. Is that solely to make merging commutative? In my experience it's common for duplicates to be reported where the original is resolved fixed, because a version with the fix has not yet been released.

Ian Jackson (ijackson) wrote :

Matthew Paul Thomas writes ("[Bug 52613] Re: "Duplicate" system is conceptually erroneous"):
> And there being no master bug report is good because someone can resolve
> a bug report by e-mail and it will Do The Right Thing, even if in the
> meantime someone else discovered that the bug had previously been
> reported and marked it as such. Is that right, and are there any other
> reasons?

It's right because the concept of the master report for a set of
merged bugs doesn't generally correspond to anything important in the
workflow. If you think about the problem clearly, then the concept of
the master report goes away. Your example of fixing the bug is just
one.

For example, obviously all reports which are of the same bug ought to
be in the same state, so it is wrong to show the states of some of
them as a special `duplicate-of-other-bug' state rather than
`confirmed', `fix released' or whatever. It should also be possible
to manipulate the state of a bug via any of its reports.

This is all complicated for Malone by Malone's task concept. I
haven't thought about this aspect clearly but the obvious answer is
that when two bugs are merged the new task list should be the union of
the existing task lists, and then from then on each set of merged bugs
should have one list of tasks shared between all of the reports.
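
As a rough illustration of the "union of the existing task lists" idea, here is a minimal Python sketch. It is not Malone code: the function name `merge_task_lists`, the dict-of-target-to-status representation, and the choice to reject conflicting statuses (rather than pick a winner) are all assumptions made for the example.

```python
def merge_task_lists(*task_lists):
    """Return the union of several task lists.

    Each task list is modelled (hypothetically) as a dict mapping an
    affected target, such as a product name, to a status string.
    """
    merged = {}
    for tasks in task_lists:
        for target, status in tasks.items():
            if target in merged and merged[target] != status:
                # Assumption: conflicting statuses are reconciled by hand
                # before merging, mirroring the BTS same-state rule.
                raise ValueError(
                    f"conflicting status for {target!r}: "
                    f"{merged[target]!r} vs {status!r}"
                )
            merged[target] = status
    return merged
```

After the merge, every report in the merged set would refer to the one returned dict, so a change made through any report is visible through all of them.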

> Please also explain the "Before bugs can be merged they must be in
> exactly the same state, either all open or all closed" part. Is that
> solely to make merging commutative? In my experience it's common for
> duplicates to be reported where the original is resolved fixed, because
> a version with the fix has not yet been released.

That feature of the `merge' command is needed to preserve the
invariant property that all merged bugs are in the same state. It is
obviously wrong for two reports describing the same bug to record it
as being in different states. The bug system should record the actual
state in the real world of the problem and its fix, and that state
exists once overall, not once for each report.

The `forcemerge' command exists precisely for the use case you
mention. The forcemerge command copies the state from the first
report mentioned to the other reports, so that they are in a suitable
state for merger. It's a convenience function.

Note that with a merge or forcemerge command both the first and
subsequent bugs mentioned may themselves already be merged with
others. With forcemerge the state of the first bug listed will
override the prior state of all bugs which start out merged with any
of the later ones, so that the invariant is preserved.
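
The semantics described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the Debian BTS implementation: the `Report` and `Tracker` classes are hypothetical, and a single `state` string stands in for everything that must match before a merge (open/closed, forwarded-to address, package, severity).

```python
class Report:
    def __init__(self, number, state, tags=()):
        self.number = number
        self.state = state      # stand-in for open/closed, package, severity, ...
        self.tags = set(tags)
        self.group = {number}   # every report is merged with itself


class Tracker:
    def __init__(self):
        self.reports = {}

    def add(self, number, state, tags=()):
        self.reports[number] = Report(number, state, tags)

    def _closure(self, numbers):
        # Anything already merged with a listed report joins the new group.
        members = set()
        for n in numbers:
            members |= self.reports[n].group
        return members

    def merge(self, *numbers):
        members = self._closure(numbers)
        states = {self.reports[n].state for n in members}
        if len(states) > 1:
            raise ValueError(f"reports differ in state: {sorted(states)}")
        tags = set()
        for n in members:
            tags |= self.reports[n].tags
        for n in members:
            self.reports[n].group = set(members)  # same group everywhere: no master
            self.reports[n].tags = set(tags)      # tags are joined

    def forcemerge(self, first, *others):
        # Convenience: copy the first report's state onto the rest, then merge.
        state = self.reports[first].state
        for n in self._closure((first, *others)):
            self.reports[n].state = state
        self.merge(first, *others)

    def set_state(self, number, state):
        # A change made via any report applies to the whole merged set.
        for n in self.reports[number].group:
            self.reports[n].state = state

    def unmerge(self, number):
        rest = self.reports[number].group - {number}
        for n in rest:
            self.reports[n].group = set(rest)  # the others stay merged
        self.reports[number].group = {number}
```

In this sketch, `merge` refuses to combine reports whose states differ, while `forcemerge` first overwrites the states and then calls the same `merge`, so the all-reports-agree invariant holds either way and no report ends up privileged.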

Ian.

Matthew Paul Thomas (mpt) wrote :

Hmmm, words like "clearly" and "obviously" tend to be anti-persuasive for me. The various bug tracking systems used by most other Free Software projects don't work the way you're proposing; so while you may be right, it's not clear or obvious.

As I see it, the "anything important in the workflow" generated by the duplicate system is:
(a) a single live discussion of the bug (with discussion in any duplicates stopping, except for the topic of whether it's really a duplicate); and
(b) a single live description of the bug, which is expanded/narrowed/clarified over time (using information from duplicates, if applicable) as the bug becomes better understood. (Granted that the UI for editing the description is clunky right now.)
The duplicate system practically guarantees that everyone involved will be looking at the same bug report, which is a benefit if there are either economies of scale or network effects in more people studying any bug report.

With the merge system there would also be either an amusing disparity between the number of open bugs and the number of reports in a listing of those bugs, or a semi-arbitrary decision about which of a set of equivalent bug reports should be shown in a search listing.

I do agree with you that Malone's "Affects" lines wouldn't present any barrier to a forcemerge-based system -- with a few extra heuristics, like "In Progress" always trumping "Confirmed" regardless of which report was earliest.
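
A heuristic of that kind could be sketched as a simple ranking, shown here in Python. Only the rule that "In Progress" outranks "Confirmed" comes from the comment above; the `STATUS_RANK` table, the placement of "Unconfirmed", and the function name `resolve_status` are assumptions for illustration.

```python
# Hypothetical ranking. Only "In Progress" > "Confirmed" is taken from the
# discussion; where "Unconfirmed" sits is an assumption.
STATUS_RANK = {
    "Unconfirmed": 0,
    "Confirmed": 1,
    "In Progress": 2,
}


def resolve_status(*statuses):
    """Pick the highest-ranked status, regardless of argument order."""
    return max(statuses, key=lambda status: STATUS_RANK[status])
```

Because the result depends only on rank, it does not matter which report was filed first or which one is named first in the merge.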

Christian Reis (kiko) wrote :

Note that even if we don't merge, we can still make status changes sent by email do the right thing: if the bug is a duplicate, operate on the master bug instead.

We could also either:

  - Avoid displaying status for duplicate bugs
  - Display the status for the master bug in the duplicate bugs, marking it specially

  Status (from master bug 12):
   .---------------------------------
   | Affects |

I think I prefer the former, because a person visiting the dupe will be rare enough that it's worth avoiding the confusion.
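
The idea of operating on the master bug when a status change arrives for a duplicate might look roughly like the following Python sketch. It is not Launchpad code: the `Bug` class, the `duplicate_of` attribute, and the function names are hypothetical, and following chains of duplicates is an assumption.

```python
class Bug:
    def __init__(self, number, status, duplicate_of=None):
        self.number = number
        self.status = status
        self.duplicate_of = duplicate_of  # another Bug, or None


def resolve_master(bug):
    """Follow duplicate_of links to the bug that carries the real status."""
    seen = set()
    while bug.duplicate_of is not None:
        if bug.number in seen:
            raise ValueError("cycle in duplicate links")
        seen.add(bug.number)
        bug = bug.duplicate_of
    return bug


def set_status_by_email(bug, status):
    """Apply a status change to the master bug, whichever bug was addressed."""
    master = resolve_master(bug)
    master.status = status
    return master


def displayed_status(bug):
    """Second option above: show the master's status, noting where it came from."""
    master = resolve_master(bug)
    return master.status, master.number
```

Note that the duplicate's own stored status is left untouched here, which is what makes un-duplicating it later straightforward.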

Brad Bollenbach (bradb) wrote :

Here are some ideas to think about for improving our duplicate system:

* Replace the "duplicate of bug _NNN_" notification with a more informative one.

    This bug has been reported _twice_.
    This bug has been reported _12 times_.

The link goes to a page listing all the other reports of this bug. We can then remove the duplicates portlet.

* Kick it up a notch.

    This bug has been reported _5 times_. The fix is being discussed primarily in _bug 34123_.

The primary discussion is assumed to occur in the bug that somebody first pointed at saying "this bug is the same as that one".

* All occurrences of the same bug report (i.e. the duplicate ring) should share the same text index, but duplicates should show in listings only when doing a search for duplicate bugs (i.e. guided filebug.)

* When merging a bug, say:

*Which table best reflects the current status of the bug?* You can tweak the status after this step, if needed. If you're unsure, just click "Continue".

(*) [Status table of target]
( ) [Status table of duplicate]
( ) [Status table of target plus only /new/ rows from the duplicate, or not shown at all, if same as first option]

[Continue]

The main bug is updated accordingly. All other bugs' statuses in the duplicate ring remain unchanged. This makes it easy to revert.

* Then show the table of the master bug on all other reports of this bug:

    This bug has been reported _5 times_. The fix is being discussed in _bug 34123_. Any changes to the status below will be reflected in the other reports of this bug.

[ The normal affects table ]

(The notification message can be tweaked slightly when the current bug /is/ the main bug.)

Of course, the "will be reflected in other reports" thing is nothing magical; we're just showing the bugtasks and nominations table from the main bug everywhere. If any bug is ever unduped, it magically reverts to what it looked like before.

I think these suggestions offer an interesting balance between being relatively easy to implement (except, possibly, the textindex thing), and making our duplicate model easier to understand and use.

Thoughts?

Matthew Paul Thomas (mpt) wrote :

I like the idea of getting rid of the "Duplicates of this bug" box, putting it behind a "This bug has been reported N times" expander. The rest of those suggestions seem rather confusing, though -- particularly the last one, which would make it more difficult to determine whether a duplicate bug marking was in error.

Christian Reis (kiko) wrote :

We're not reconsidering duplicates for 2.0. Please split this into separate specific bugs if this still needs addressing.

Changed in malone:
status: Confirmed → Won't Fix