tracker-extract crashes when trying to parse CRW file, and retries ad nauseam

Bug #1893042 reported by Joël Bourgault
This bug affects 1 person
| Affects | Status | Importance | Assigned to | Milestone |
| --- | --- | --- | --- | --- |
| tracker-miners (Ubuntu) | New | Undecided | Unassigned | |

Bug Description

My computer hung yesterday. Analysis showed that `/` was full because `/var/log/syslog` and `/var/log/kern.log` had filled up with messages from `tracker-extract`, growing to about 3 GB each. The machine eventually stopped responding: I could not even log in on a TTY and had to shut it down with the power button (ugh).

The following sequence of messages repeats about every 12 seconds, again and again:

```
$ journalctl -f _COMM=tracker-extract
août 26 13:25:15 joel-T470s tracker-extract[11657]: If the error above is recurrent for the same item/ID, consider running "tracker-extract" in the terminal with the TRACKER_VERBOSITY=3 environment variable, and filing a bug with the additional information
août 26 13:25:27 joel-T470s tracker-extract[11680]: Set scheduler policy to SCHED_IDLE
août 26 13:25:27 joel-T470s tracker-extract[11680]: Setting priority nice level to 19
août 26 13:25:27 joel-T470s tracker-extract[11680]: Could not insert metadata for item "file:///home/joel/Images/2010/2010-02-00%20-%20Tignes/CRW_5845.CRW": Parser error at byte 1991: Expected ObjectList
août 26 13:25:27 joel-T470s tracker-extract[11680]: If the error above is recurrent for the same item/ID, consider running "tracker-extract" in the terminal with the TRACKER_VERBOSITY=3 environment variable, and filing a bug with the additional information
août 26 13:25:39 joel-T470s tracker-extract[11699]: Set scheduler policy to SCHED_IDLE
août 26 13:25:39 joel-T470s tracker-extract[11699]: Setting priority nice level to 19
août 26 13:25:39 joel-T470s tracker-extract[11699]: Could not insert metadata for item "file:///home/joel/Images/2010/2010-02-00%20-%20Tignes/CRW_5845.CRW": Parser error at byte 1991: Expected ObjectList
août 26 13:25:39 joel-T470s tracker-extract[11699]: If the error above is recurrent for the same item/ID, consider running "tracker-extract" in the terminal with the TRACKER_VERBOSITY=3 environment variable, and filing a bug with the additional information
(...)
```
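To put numbers on the loop, here is a minimal sketch that groups the "Could not insert metadata" lines by file and measures the delay between attempts. The regular expression and the function names `summarize_failures` and `retry_gaps` are illustrative assumptions based only on the log format shown above; they are not part of Tracker.

```python
import re
from collections import defaultdict
from datetime import datetime

# Matches the time of day, the PID and the failing URI in journal lines
# like the ones quoted above. The pattern is an assumption derived from
# that excerpt only.
FAILURE = re.compile(
    r'(?P<time>\d{2}:\d{2}:\d{2}) \S+ tracker-extract\[(?P<pid>\d+)\]: '
    r'Could not insert metadata for item "(?P<uri>[^"]+)"'
)

def summarize_failures(lines):
    """Group (timestamp, pid) pairs of failed insertions by file URI."""
    by_uri = defaultdict(list)
    for line in lines:
        m = FAILURE.search(line)
        if m:
            t = datetime.strptime(m["time"], "%H:%M:%S")
            by_uri[m["uri"]].append((t, int(m["pid"])))
    return by_uri

def retry_gaps(entries):
    """Seconds between consecutive failures (same-day timestamps assumed)."""
    times = [t for t, _ in entries]
    return [(b - a).total_seconds() for a, b in zip(times, times[1:])]

# Two failure lines copied from the excerpt above.
sample = [
    'août 26 13:25:27 joel-T470s tracker-extract[11680]: Could not insert metadata for item "file:///home/joel/Images/2010/2010-02-00%20-%20Tignes/CRW_5845.CRW": Parser error at byte 1991: Expected ObjectList',
    'août 26 13:25:39 joel-T470s tracker-extract[11699]: Could not insert metadata for item "file:///home/joel/Images/2010/2010-02-00%20-%20Tignes/CRW_5845.CRW": Parser error at byte 1991: Expected ObjectList',
]

report = summarize_failures(sample)
for uri, entries in report.items():
    pids = [pid for _, pid in entries]
    print(uri, "failures:", len(entries), "pids:", pids, "gaps (s):", retry_gaps(entries))
```

Each failure comes from a different PID, which is consistent with the process being restarted for every attempt rather than looping internally.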

Following the advice in the message, I ran the extractor on that file with `TRACKER_VERBOSITY=3`, but nothing really interesting came out of it:

```
$ cd /home/joel/Images/2010/2010-02-00\ -\ Tignes
$ export TRACKER_VERBOSITY=3
$ tracker extract CRW_5845.CRW
Tracker-Message: 08:48:27.050: Set scheduler policy to SCHED_IDLE
Tracker-Message: 08:48:27.050: Setting priority nice level to 19
Tracker-Message: 08:48:27.057: Starting tracker-extract 2.3.3
Tracker-Message: 08:48:27.057: General options:
Tracker-Message: 08:48:27.057: Verbosity ............................ 0
Tracker-Message: 08:48:27.057: Sched Idle ........................... 1
Tracker-Message: 08:48:27.058: Max bytes (per file) ................. 1048576
Tracker-Message: 08:48:27.058: Set scheduler policy to SCHED_IDLE
Tracker-Message: 08:48:27.058: Setting priority nice level to 19
(tracker-extract:8940): dconf-DEBUG: 08:48:27.058: watch_established: "/org/freedesktop/tracker/extract/" (establishing: 1)
(tracker-extract:8940): GLib-GIO-DEBUG: 08:48:27.063: _g_io_module_get_default: Found default implementation gvfs (GDaemonVfs) for ‘gio-vfs’
Tracker-Message: 08:48:27.063: Loading extractor rules... (/usr/share/tracker-miners/extract-rules)
(tracker-extract:8940): Tracker-DEBUG: 08:48:27.063: Loaded rule '10-abw.rule'
(tracker-extract:8940): Tracker-DEBUG: 08:48:27.063: Loaded rule '10-bmp.rule'
(tracker-extract:8940): Tracker-DEBUG: 08:48:27.063: Loaded rule '10-comics.rule'
(tracker-extract:8940): Tracker-DEBUG: 08:48:27.063: Loaded rule '10-desktop.rule'
(tracker-extract:8940): Tracker-DEBUG: 08:48:27.063: Loaded rule '10-dvi.rule'
(tracker-extract:8940): Tracker-DEBUG: 08:48:27.063: Loaded rule '10-ebooks.rule'
(tracker-extract:8940): Tracker-DEBUG: 08:48:27.063: Loaded rule '10-epub.rule'
(tracker-extract:8940): Tracker-DEBUG: 08:48:27.063: Loaded rule '10-flac.rule'
(tracker-extract:8940): Tracker-DEBUG: 08:48:27.064: Loaded rule '10-gif.rule'
(tracker-extract:8940): Tracker-DEBUG: 08:48:27.064: Loaded rule '10-html.rule'
(tracker-extract:8940): Tracker-DEBUG: 08:48:27.064: Loaded rule '10-ico.rule'
(tracker-extract:8940): Tracker-DEBUG: 08:48:27.064: Loaded rule '10-jpeg.rule'
(tracker-extract:8940): Tracker-DEBUG: 08:48:27.064: Loaded rule '10-mp3.rule'
(tracker-extract:8940): Tracker-DEBUG: 08:48:27.064: Loaded rule '10-msoffice.rule'
(tracker-extract:8940): Tracker-DEBUG: 08:48:27.064: Loaded rule '10-oasis.rule'
(tracker-extract:8940): Tracker-DEBUG: 08:48:27.064: Loaded rule '10-pdf.rule'
(tracker-extract:8940): Tracker-DEBUG: 08:48:27.064: Loaded rule '10-png.rule'
(tracker-extract:8940): Tracker-DEBUG: 08:48:27.064: Loaded rule '10-ps.rule'
(tracker-extract:8940): Tracker-DEBUG: 08:48:27.064: Loaded rule '10-raw.rule'
(tracker-extract:8940): Tracker-DEBUG: 08:48:27.064: Loaded rule '10-svg.rule'
(tracker-extract:8940): Tracker-DEBUG: 08:48:27.064: Loaded rule '10-tiff.rule'
(tracker-extract:8940): Tracker-DEBUG: 08:48:27.064: Loaded rule '10-vorbis.rule'
(tracker-extract:8940): Tracker-DEBUG: 08:48:27.064: Loaded rule '10-xmp.rule'
(tracker-extract:8940): Tracker-DEBUG: 08:48:27.064: Loaded rule '10-xps.rule'
(tracker-extract:8940): Tracker-DEBUG: 08:48:27.064: Loaded rule '11-msoffice-xml.rule'
(tracker-extract:8940): Tracker-DEBUG: 08:48:27.064: Loaded rule '15-gstreamer-guess.rule'
(tracker-extract:8940): Tracker-DEBUG: 08:48:27.064: Loaded rule '15-playlist.rule'
(tracker-extract:8940): Tracker-DEBUG: 08:48:27.065: Loaded rule '15-source-code.rule'
(tracker-extract:8940): Tracker-DEBUG: 08:48:27.065: Loaded rule '90-gstreamer-audio-generic.rule'
(tracker-extract:8940): Tracker-DEBUG: 08:48:27.065: Loaded rule '90-gstreamer-image-generic.rule'
(tracker-extract:8940): Tracker-DEBUG: 08:48:27.065: Loaded rule '90-gstreamer-video-generic.rule'
(tracker-extract:8940): Tracker-DEBUG: 08:48:27.065: Loaded rule '90-text-generic.rule'
Tracker-Message: 08:48:27.065: Extractor rules loaded
Tracker-Message: 08:48:27.065: MIME type guessed as 'image/x-canon-crw' (from GIO)
(tracker-extract:8940): Tracker-DEBUG: 08:48:27.067: Using /usr/lib/x86_64-linux-gnu/tracker-miners-2.0/extract-modules/libextract-raw.so...
@prefix nie: <http://www.semanticdesktop.org/ontologies/2007/01/19/nie#> .
@prefix nmm: <http://www.tracker-project.org/temp/nmm#> .
@prefix nfo: <http://www.semanticdesktop.org/ontologies/2007/03/22/nfo#> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .

<urn:equipment:Canon:Canon%20EOS%20300D%20DIGITAL:> nfo:model "Canon EOS 300D DIGITAL" ;
  a nfo:Equipment ;
  nfo:manufacturer "Canon" .

<file:///home/joel/Images/2010/2010-02-00%20-%20Tignes/CRW_5845.CRW> a nfo:Image , nmm:Photo ;
  nfo:orientation nfo:orientation-top ;
  nmm:exposureTime inf ;
  nmm:fnumber 22.627416 ;
  nfo:width 3072 ;
  nie:contentCreated "2010-02-06T10:11:40+0100" ;
  nfo:equipment <urn:equipment:Canon:Canon%20EOS%20300D%20DIGITAL:> ;
  nfo:height 2048 .

```
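One detail in the output above may deserve a closer look: `nmm:exposureTime inf`. A bare `inf` is not a valid numeric literal in Turtle, so it is plausible, though not confirmed anywhere in this report, that this value is what the store rejects with "Expected ObjectList". The sketch below scans extractor output for such non-finite values; the function name `find_non_finite` and the regular expression are illustrative assumptions, not Tracker code.

```python
import re

# Bare tokens that a C-style float formatter can emit for infinite or
# undefined values, but that are not valid Turtle numeric literals.
# Assumption: one "prefix:predicate value" pair per line, as in the
# output quoted above.
NON_FINITE = re.compile(
    r'^\s*(?P<pred>[\w-]+:[\w-]+)\s+(?P<val>[+-]?(?:inf|infinity|nan))\s*[;.,]\s*$',
    re.IGNORECASE,
)

def find_non_finite(turtle_text):
    """Return (line_number, predicate, value) for each suspicious literal."""
    hits = []
    for number, line in enumerate(turtle_text.splitlines(), start=1):
        m = NON_FINITE.match(line)
        if m:
            hits.append((number, m["pred"], m["val"]))
    return hits

# The file description block copied from the output above.
sample = """<file:///home/joel/Images/2010/2010-02-00%20-%20Tignes/CRW_5845.CRW> a nfo:Image , nmm:Photo ;
  nfo:orientation nfo:orientation-top ;
  nmm:exposureTime inf ;
  nmm:fnumber 22.627416 ;
  nfo:width 3072 ;
  nie:contentCreated "2010-02-06T10:11:40+0100" ;
  nfo:equipment <urn:equipment:Canon:Canon%20EOS%20300D%20DIGITAL:> ;
  nfo:height 2048 .
"""

hits = find_non_finite(sample)
print(hits)
```

If this is indeed the cause, the file itself is probably fine and the bad value comes from how the exposure time stored in the CRW metadata is read or converted.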

As a stopgap, I have only changed the `journalctl` settings so that the logs do not grow too large too quickly.

The root cause seems to be that `tracker-extract` crashes on a file and does not remember that it failed the last time it tried to parse that file, so it retries every time it starts.
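To make the "remember the failure" idea concrete, here is a minimal sketch of a persistent failure list keyed by URI and modification time. It is purely hypothetical: `FailureCache`, `process`, the JSON file and the example URI are assumptions made for illustration and do not describe how Tracker works or how it should be fixed.

```python
import json
import os
import tempfile

class FailureCache:
    """Remembers which (uri, mtime) pairs failed, across process restarts."""

    def __init__(self, path):
        self.path = path
        self.failed = {}
        if os.path.exists(path):
            with open(path, encoding="utf-8") as fh:
                self.failed = json.load(fh)

    def should_skip(self, uri, mtime):
        # Skip only if this exact version of the file failed before.
        return self.failed.get(uri) == mtime

    def record_failure(self, uri, mtime):
        self.failed[uri] = mtime
        with open(self.path, "w", encoding="utf-8") as fh:
            json.dump(self.failed, fh)

def process(items, extract, cache):
    """Try to extract each (uri, mtime); return the URIs actually attempted."""
    attempted = []
    for uri, mtime in items:
        if cache.should_skip(uri, mtime):
            continue
        attempted.append(uri)
        try:
            extract(uri)
        except Exception:
            cache.record_failure(uri, mtime)
    return attempted

def always_fails(uri):
    # Stand-in for an extractor that chokes on one file.
    raise ValueError("Parser error")

URI = "file:///example/CRW_0001.CRW"  # hypothetical file

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "failed.json")
    # A fresh FailureCache per run simulates the process being restarted.
    first = process([(URI, 100)], always_fails, FailureCache(path))
    second = process([(URI, 100)], always_fails, FailureCache(path))
    third = process([(URI, 200)], always_fails, FailureCache(path))

print(first, second, third)
```

Keying on the modification time means a file is retried once it changes, while an unchanged file that already failed is left alone instead of being retried on every start.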

So, what should I do now to prevent this behavior?

Joël Bourgault (joel-bourgault) wrote :

Additional information about my computer:

```
$ lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description: Ubuntu 20.04.1 LTS
Release: 20.04
Codename: focal

$ uname -a
Linux joel-T470s 5.4.0-42-generic #46-Ubuntu SMP Fri Jul 10 00:24:02 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux

$ tracker --version
Tracker 2.3.4

```
