client-side duplicate searching not done in some cases

Bug #103083 reported by Bruce Cowan
Affects: apport (Ubuntu)
Status: Fix Released
Importance: Wishlist
Assigned to: Martin Pitt

Bug Description

Binary package hint: apport

A large number of bugs are reported by apport which are duplicates of others. Some kind of duplicate detection system would be very useful.

To avoid wasting bandwidth, the duplicate search should be carried out before apport sends the bug and its large amount of attached data to the bug tracker.

Revision history for this message
Martin Pitt (pitti) wrote :

Apport already supports bug patterns, but they have to be created manually by the developers: https://lists.ubuntu.com/archives/ubuntu-devel/2006-October/021502.html

Malone already searches for similar bug titles and proposes them. However, many users seem to ignore that, so it seems we have to make Malone much more insistent about that. This is already in planning.

Changed in apport:
status: Unconfirmed → Rejected
Revision history for this message
Bruce Cowan (bruce89-deactivatedaccount) wrote : Re: [Bug 103083] Re: Duplicate searching needed

On Thu, 2007-04-05 at 07:17 +0000, Martin Pitt wrote:

> Malone already searches for similar bug titles and proposes them.
> However, many users seem to ignore that, so it seems we have to make
> Malone much more insistent about that. This is already in planning.

I understand the situation. This notification is better than nothing,
but as you said, people ignore it.

My idea was to have this list of bugs displayed before the crash report
is uploaded, as the upload takes a lot of time and bandwidth for the user
and for Launchpad itself.
--
Bruce Cowan <email address hidden>

Revision history for this message
Martin Pitt (pitti) wrote : Re: Duplicate searching needed

Erk, I didn't mean to reject this bug, sorry

Changed in apport:
status: Rejected → Confirmed
Revision history for this message
Bruce Cowan (bruce89-deactivatedaccount) wrote :

Here is a real life story.

I noticed that Evolution crashed when I tried to reply to a mailing list message. Keen to do my bit, I searched Malone for "mailing list", which found nothing similar to my bug. When Evolution crashed the next day (why does apport only notify about an application crashing the first time in a session?), I submitted a bug report via apport. This involved uploading many megabytes of data to Malone, which took quite a long time on a 256Kb upload line. Once Epiphany appeared, I changed the name of the bug report to something more descriptive ([apport] Evolution crashes after trying to reply to mailing list). Because of this, the duplicate search didn't find the often-duplicated bug #85159. I proceeded with the report, which became bug #99415, which was marked as a duplicate of #85159 soon afterwards. This prompted me to file this request, and to apologise for filing another duplicate of #85159 (one of 28).

Two lessons should be learnt from this story:

1. It is unreasonable to ask users to upload vast amounts of data to Malone without any assurance that the resulting bug is not a duplicate.
2. Somebody might modify their bug's name, which stops the duplicate suggestion from working.

Revision history for this message
Martin Pitt (pitti) wrote :

The current gutsy apport searches for duplicates on the server side and automatically closes duplicate bugs.

We do not yet have tools to make this possible on the client side, though. It would require reliably getting at least function names on the client side, which the current toolchain does not provide for efficiency and space reasons. In the cases where we do happen to have the function names, it is possible.
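
As a rough illustration of why function names matter here, a symbolic signature could be built from the top stack frames roughly as follows. This is only a minimal sketch, not apport's actual code: the function name `symbolic_signature`, the `depth` parameter, and the colon-separated format are all assumptions made for the example.

```python
from typing import List, Optional


def symbolic_signature(
    executable: str,
    signal: int,
    frames: List[Optional[str]],
    depth: int = 5,
) -> Optional[str]:
    """Build a crash signature from the top stack frames (hypothetical helper).

    `frames` holds one function name per stack frame, or None where the
    name could not be resolved (for example, no debug symbols installed).
    """
    top = frames[:depth]
    names = [name for name in top if name is not None]
    # Without a complete set of function names the signature would not be
    # reliable enough for duplicate matching, so give up.
    if not names or len(names) != len(top):
        return None
    return ":".join([executable, str(signal)] + names)
```

With an unresolved frame in the top of the stack the helper returns `None`, which corresponds to the case where a client-side check is not possible.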

Changed in apport:
importance: Undecided → Wishlist
status: Confirmed → Triaged
Revision history for this message
Martin Pitt (pitti) wrote :

In 12.04 we now do this on the client side as well: http://www.piware.de/2011/11/apport-1-90-client-side-duplicate-checking/

However, I just noticed that it is buggy: we fail to do this when there is an address signature as well as a symbolic signature, and the latter does not have an existing duplicate db entry. I am taking the liberty of modifying this bug accordingly.
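
The intended lookup order can be sketched as follows. This is a minimal illustration under stated assumptions, not the real crashdb.py code: the function `find_known_duplicate` and the two dictionaries standing in for the duplicate database are hypothetical.

```python
from typing import Dict, Optional


def find_known_duplicate(
    symbolic_sig: Optional[str],
    address_sig: Optional[str],
    symbolic_db: Dict[str, int],
    address_db: Dict[str, int],
) -> Optional[int]:
    """Return the master bug number for a crash, or None if it is unknown.

    Both dicts are hypothetical stand-ins for the duplicate database and
    map a signature string to a bug number.
    """
    # Try the symbolic (function name based) signature first.
    if symbolic_sig is not None:
        bug = symbolic_db.get(symbolic_sig)
        if bug is not None:
            return bug
        # The buggy behaviour described above stopped here and treated the
        # crash as unknown, even though an address signature was available.

    # Fall back to the address signature when the symbolic one is missing
    # or has no entry in the database.
    if address_sig is not None:
        return address_db.get(address_sig)
    return None
```

The key point is that a miss on the symbolic signature falls through to the address signature instead of ending the search.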

summary: - Duplicate searching needed
+ client-side duplicate searching not done in some cases
Changed in apport (Ubuntu):
assignee: nobody → Martin Pitt (pitti)
status: Triaged → In Progress
status: In Progress → Fix Committed
Revision history for this message
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package apport - 1.95-0ubuntu1

---------------
apport (1.95-0ubuntu1) precise; urgency=low

  [ Martin Pitt ]
  * New upstream release:
    - apport-gtk, apport-kde: When reporting a "system crash", don't say "...
      of this program version", but "...of this type", as we don't show a
      program version in the initial dialog
      (https://wiki.ubuntu.com/ErrorTracker#error) (LP: #961065)
    - problem_report.py, write_mime(): Do not put a key inline if it is bigger
      than 1 kB, to guard against very long lines. (LP: #957326)
    - etc/cron.daily/apport: Do not remove whoopsie's *.upload* stamps every
      day, only if they are older than a week. whoopsie comes with its own
      cron job which deals with them. Thanks Steve Langasek. (LP: #957102)
    - report.py, mark_ignore(): Fix crash if executable went away underneath
      us. (LP: #961410)
    - apport-gtk: Do not compare current continue button label against a
      translated string. Instead just remember whether or not we can restart
      the application. (LP: #960439)
    - hookutils.py, command_output(): Add option to keep the locale instead of
      disabling it.
    - hookutils.py, command_output(): Actually make the "input" parameter
      work, instead of causing an eternal hang. Add tests for all possible
      modes of operation.
    - hookutils.py: Change root_command_output() and
      attach_root_command_outputs() to disable translated messages
      (LC_MESSAGES=C) only as part of the command to be run, not already for
      the root prefix command. This will keep the latter (gksu, kdesudo, etc.)
      translated. (LP: #961659)
    - apport-gtk: Cut off text values after 4000 characters, as Gtk's TreeView
      does not get along well with huge values. KDE's copes fine, so continue
      to display the complete value there. (LP: #957062)
    - apport-gtk: Make details window resizable in bug reporting mode.
    - crashdb.py, known(): Check the address signature duplicate database if
      the symbolic signature exists, but did not find any result. (LP: #103083)
    - ui.py: Run anonymization after checking for duplicates, to prevent host
      or user names which look like hex numbers to corrupt the stack trace.
      (LP: #953104)
    - apport-gtk: Require an application to both have TERM and SHELL in its
      environment to consider it a command line application that was started
      by the user. (LP: #962130)
    - backends/packaging-apt-dpkg.py, _check_files_md5(): Fix double encoding,
      which caused UnicodeDecodeErrors on non-ASCII characters in an md5sum
      file. (LP: #953682)
    - apport-kde, apport-gtk: Only show "Relaunch" if the report has a
      ProcCmdline, otherwise we cannot restart it. (LP: #956173)
    - apport-gtk, apport-kde: Show the ExecutablePath while we're collecting
      data for the crash report. Thanks Evan Dandrea. (LP: #938707).
  * debian/copyright: Change to copyright format 1.0.
  * debian/control: Bump Standards-Version to 3.9.3.

  [ Brian Murray ]
  * data/general-hooks/ubuntu.py: use main.log to determine UpgradeStatus not
    apt.log (LP: #886111)
 -- Martin Pitt <<email address hidden>...


Changed in apport (Ubuntu):
status: Fix Committed → Fix Released