thunar segfault, memory corruption in the gslice magazine allocator

Bug #1372140 reported by superkuh
This bug affects 2 people
Affects: thunar (Ubuntu)
Status: Expired
Importance: Undecided
Assigned to: Unassigned

Bug Description

A couple weeks back I did a dist-upgrade from Xubuntu 12.04 to 14.04. My preferred desktop environment has always been gnome2 so these days I use the MATE desktop environment on top of Xubuntu. I still use Thunar though.

Ever since I upgraded to 14.04.1 LTS I have been having very regular crashes of any file manager that uses glib2.0. I noticed it first with Caja so I figured it was a MATE issue. I even went so far as to submit a bug to the Ubuntu MATE devs on launchpad: https://bugs.launchpad.net/ubuntu-mate/+bug/1369331 . But it is not just Caja, it also affects Thunar in exactly the same way.

thunar:
  Installed: 1.6.3-1ubuntu5
  Candidate: 1.6.3-1ubuntu5
  Version table:
 *** 1.6.3-1ubuntu5 0
        500 http://us.archive.ubuntu.com/ubuntu/ trusty/universe amd64 Packages
        100 /var/lib/dpkg/status

The specific Thunar segfault results from memory corruption in the gslice magazine allocator. That doesn't mean it is a glib bug, really, as the programs are probably still at fault. The backtraces I get for every crash all have,

Program received signal SIGSEGV, Segmentation fault.
magazine_chain_pop_head (magazine_chunks=0x555555813eb0) at /build/buildd/glib2.0-2.40.0/./glib/gslice.c:539
539 /build/buildd/glib2.0-2.40.0/./glib/gslice.c: No such file or directory.
#0 magazine_chain_pop_head (magazine_chunks=0x555555813eb0) at /build/buildd/glib2.0-2.40.0/./glib/gslice.c:539
#1 thread_memory_magazine1_alloc (tmem=, ix=1) at /build/buildd/glib2.0-2.40.0/./glib/gslice.c:842
#2 g_slice_alloc (mem_size=mem_size@entry=24) at /build/buildd/glib2.0-2.40.0/./glib/gslice.c:998
#3 0x00007ffff4bbdfd3 in g_string_sized_new (dfl_size=dfl_size@entry=2) at /build/buildd/glib2.0-2.40.0/./glib/gstring.c:121
...

.. as the source of the problem. The full backtrace for Thunar is here (and attached as a file):
http://pastebin.com/UiXC1LcX
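
To illustrate why a crash inside the slice allocator does not by itself point at GLib, here is a toy Python model of a free list that stores its "next" link inside each freed chunk. It is only a sketch under that assumption and is not GLib's gslice code; `ToyAllocator`, `SimulatedSegfault` and `demo` are made-up names. A write through a stale reference damages the link, and the failure only surfaces in a later, unrelated allocation.

```python
class SimulatedSegfault(Exception):
    """Stands in for the SIGSEGV a C program would get on a wild pointer."""


class ToyAllocator:
    """Toy free-list allocator (NOT GLib's implementation)."""

    def __init__(self, n_chunks):
        # While chunk i is free, memory[i] holds the index of the next
        # free chunk; None terminates the list.
        self.memory = list(range(1, n_chunks)) + [None]
        self.free_head = 0

    def alloc(self):
        chunk = self.free_head
        if chunk is None:
            raise MemoryError("toy heap exhausted")
        # "Dereferencing" the head: an out-of-range index models a bad pointer.
        if not (isinstance(chunk, int) and 0 <= chunk < len(self.memory)):
            raise SimulatedSegfault(f"bad free-list pointer {chunk!r}")
        self.free_head = self.memory[chunk]  # trust the link stored in the chunk
        self.memory[chunk] = 0               # hand out a zeroed chunk
        return chunk

    def free(self, chunk):
        self.memory[chunk] = self.free_head  # the link now lives in the chunk
        self.free_head = chunk


def demo():
    events = []
    heap = ToyAllocator(4)
    a = heap.alloc()
    heap.free(a)
    heap.memory[a] = 0xDEAD  # the application bug: write after free
    heap.alloc()             # still succeeds, but the head is now garbage
    events.append("first alloc after the bug: ok")
    try:
        heap.alloc()         # the crash shows up here, far from the bug
    except SimulatedSegfault as exc:
        events.append(f"second alloc: {exc}")
    return events


for event in demo():
    print(event)
```

In this model the faulting call is an innocent `alloc()`, which is the same shape as a backtrace ending in the allocator while the actual bad write happened earlier and elsewhere.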

These segfaults happen at random intervals, but I can usually count on one every 3 hours or so. Sometimes they come as often as tens of minutes apart.

Sep 16 18:14:18 localhost kernel: [317606.465729] thunar[7798]: segfault at 12f112bb ip 00007f3141623297 sp 00007fff8e6d6790 error 4 in libglib-2.0.so.0.4000.0[7f31415bf000+106000]
Sep 16 18:18:43 localhost whoopsie[1277]: Parsing /var/crash/_usr_bin_thunar.1000.crash.
Sep 16 18:18:43 localhost whoopsie[1277]: Uploading /var/crash/_usr_bin_thunar.1000.crash.
Sep 16 18:18:44 localhost whoopsie[1277]: Sent; server replied with: No error
Sep 16 18:18:44 localhost whoopsie[1277]: Response code: 200
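
The kernel line above carries enough to locate the faulting instruction inside libglib: subtracting the mapping base (the number in brackets) from `ip` gives an offset into the library, and the `error` value is the x86 page-fault code. The following Python sketch does that arithmetic. The name `decode_segfault` and the regular expression are illustrative only, and the offset is meaningful only against the exact same libglib build, assuming the reported mapping starts at the beginning of the library file.

```python
import re

# Shape of a kernel "segfault at" message as seen in the syslog line above.
SEGFAULT_RE = re.compile(
    r"(?P<comm>\S+)\[(?P<pid>\d+)\]: segfault at (?P<addr>[0-9a-f]+) "
    r"ip (?P<ip>[0-9a-f]+) sp (?P<sp>[0-9a-f]+) error (?P<err>\d+) "
    r"in (?P<lib>\S+)\[(?P<base>[0-9a-f]+)\+(?P<size>[0-9a-f]+)\]"
)

LINE = (
    "Sep 16 18:14:18 localhost kernel: [317606.465729] thunar[7798]: "
    "segfault at 12f112bb ip 00007f3141623297 sp 00007fff8e6d6790 error 4 "
    "in libglib-2.0.so.0.4000.0[7f31415bf000+106000]"
)


def decode_segfault(line):
    """Return the library offset and a decoded x86 page-fault error code."""
    m = SEGFAULT_RE.search(line)
    if m is None:
        raise ValueError("not a kernel segfault line")
    err = int(m.group("err"))
    return {
        "process": m.group("comm"),
        "library": m.group("lib"),
        # Instruction pointer relative to the start of the mapping.
        "offset": int(m.group("ip"), 16) - int(m.group("base"), 16),
        # x86 page-fault error bits: 1 = protection, 2 = write, 4 = user mode.
        "cause": "protection violation" if err & 1 else "page not present",
        "access": "write" if err & 2 else "read",
        "mode": "user" if err & 4 else "kernel",
    }


info = decode_segfault(LINE)
print(info["library"], hex(info["offset"]), info["mode"], info["access"], info["cause"])
```

For the Sep 16 line this yields offset 0x64297 and a user-mode read of a page that is not mapped, which fits a pointer that had been overwritten with garbage.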

I have been trying to get more exact information about the memory corruption in the gslice magazine allocator by using Valgrind. If I just start Thunar with Valgrind it'll finish its thing and stop watching almost instantly before a crash can occur. So I've been trying to use it with gdb to wait until the segfault,

<terminal1>
$ echo 0 | sudo tee /proc/sys/kernel/yama/ptrace_scope
$ gdb -p `pidof thunar`
(gdb) set logging file ~/oncemoreintothebreach.txt
(gdb) set logging on

<terminal2>
$ G_SLICE=always-malloc G_DEBUG=fatal-criticals,gc-friendly valgrind --track-origins=yes --vgdb=yes --vgdb-error=0 -v --tool=memcheck --leak-check=no --num-callers=40 --log-file=valgrind.log --trace-children=yes $(which thunar)

<terminal1>
(gdb) target remote | vgdb
(gdb) continue
... wait for the crash ...
(gdb) backtrace

But so far I've been unsuccessful in gathering anything in valgrind.log that would be useful. Does anyone have any hints for exploring this issue with Valgrind? Or perhaps what I can do to fix this? Or links to other reports similar to mine?
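
For the Valgrind run described above, one detail worth getting right is the environment. An environment variable holds a single value, so several `G_DEBUG` flags have to share one assignment as a separated list, and `G_SLICE=always-malloc` makes GLib hand slice allocations to malloc so that memcheck can track individual chunks. The Python sketch below only builds the command and environment and does not start anything; `build_valgrind_run` is a hypothetical helper name, and the Valgrind options are the ones from the command line above.

```python
import os


def build_valgrind_run(binary, log_file="valgrind.log"):
    """Return (argv, env) for running `binary` under memcheck with vgdb."""
    env = dict(os.environ)
    # Route g_slice allocations through malloc so memcheck sees each chunk.
    env["G_SLICE"] = "always-malloc"
    # One assignment, several flags: a second G_DEBUG=... would replace the first.
    env["G_DEBUG"] = ",".join(["fatal-criticals", "gc-friendly"])
    argv = [
        "valgrind",
        "--tool=memcheck",
        "--track-origins=yes",
        "--vgdb=yes",
        "--vgdb-error=0",
        "-v",
        "--leak-check=no",
        "--num-callers=40",
        f"--log-file={log_file}",
        "--trace-children=yes",
        binary,
    ]
    return argv, env


argv, env = build_valgrind_run("/usr/bin/thunar")
print("G_DEBUG=" + env["G_DEBUG"], "G_SLICE=" + env["G_SLICE"])
print(" ".join(argv))
```

The pair can then be passed to `subprocess.run(argv, env=env)` from the same script, with gdb attached from a second terminal as in the transcript.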

ProblemType: Bug
DistroRelease: Ubuntu 14.04
Package: thunar 1.6.3-1ubuntu5
ProcVersionSignature: Ubuntu 3.13.0-35.62-generic 3.13.11.6
Uname: Linux 3.13.0-35-generic x86_64
NonfreeKernelModules: fglrx
ApportVersion: 2.14.1-0ubuntu3.4
Architecture: amd64
Date: Sun Sep 21 12:58:00 2014
InstallationDate: Installed on 2013-07-20 (428 days ago)
InstallationMedia: Xubuntu 12.04.2 LTS "Precise Pangolin" - Release amd64 (20130213)
SourcePackage: thunar
UpgradeStatus: Upgraded to trusty on 2014-09-02 (19 days ago)

Revision history for this message
superkuh (n-ubunt1-u) wrote :

libgtk2.0-0:
      Installed: 2.24.23-0ubuntu1.1
      Candidate: 2.24.23-0ubuntu1.1
      Version table:
     *** 2.24.23-0ubuntu1.1 0
            500 http://us.archive.ubuntu.com/ubuntu/ trusty-updates/main amd64 Packages
            100 /var/lib/dpkg/status
         2.24.23-0ubuntu1 0
            500 http://us.archive.ubuntu.com/ubuntu/ trusty/main amd64 Packages

superkuh (n-ubunt1-u) wrote :

Here's another backtrace I made after installing more debug symbol packages to fully cover Thunar's dependencies.

superkuh (n-ubunt1-u) wrote :

This crash looks very similar to what is fixed in, https://bugs.launchpad.net/ubuntu/+source/gtk+2.0/+bug/1316509 . But I have that version of libgtk2 and it is not fixed.

superkuh (n-ubunt1-u) wrote :

Er, more specifically this bug, https://bugs.launchpad.net/ubuntu/+source/thunar/+bug/1203296 , which was supposed to be fixed by the above bug fix.

Alistair Buxton (a-j-buxton) wrote :

Well if you still get the crash then it is something different. The dangling signal crash had a specific way to cause it - it wouldn't crash until you quit thunar - and we were able to verify it was fixed. But memory corruption can be caused by any number of things and they all look the same in the backtrace because it doesn't affect operation until much later.

So, with that in mind, I have some questions for you.

When does it crash? At startup? At exit? When you click a particular thing? Or does it crash even if thunar is just idle in the background?

Does it matter what type of filesystem you're browsing? Do you regularly mount and unmount drives? Do you have weird stuff like super long filenames with unusual characters or extremely deep directory trees?

In short, is there any pattern at all? For the longest time we thought there was no pattern to that other bug, but then one day someone found one and we ended up fixing it within a day.

As for valgrind and thunar, notice that thunar runs a daemon process. Then whenever you run another copy it messages the daemon to open a new window and exits. This can make it difficult to debug. The way I avoid this is to rename the thunar executable and move it out of the system path. That way it can never create extra copies of itself - the one I run will always be the only one. This is a bit invasive for normal use though, so it is better to find a way to reproduce before you start trying to debug. Valgrind logs aren't really very much more helpful than the backtrace when it comes to memory corruption.

superkuh (n-ubunt1-u) wrote :

Thanks for the response. I understand memory corruption bugs all look fairly generic in terms of their error messages but linking this issue to similar bugs on launchpad is the only way I can get anyone to pay attention.

It crashes randomly. I have not been able to find any correlation with any behavior on my part. It crashes while Thunar is idling in the background and I am using another computer entirely.

The filesystems don't seem to matter. It crashes when the only thing I have open is my home directory. Or when the only thing I have open is the root of the filesystem. I do have NFS shares mounted most of the time, but when I tried not mounting these shares the same crashes occurred.

I have been trying to find a timing pattern to the bug for weeks now. There is none.

To give you an idea of the frequency of the crashes while Thunar idles looking only at an empty directory in my ~/, see the syslog excerpts from tonight:

Sep 24 19:14:20 localhost kernel: [83443.846282] gmain[8368]: segfault at 4f983fa ip 00007f4d7aab3297 sp 00007f4d738e7a10 error 4 in libglib-2.0.so.0.4000.0[7f4d7aa4f000+106000]
Sep 24 19:17:32 localhost whoopsie[1292]: Parsing /var/crash/_usr_bin_thunar.1000.crash.
Sep 24 19:17:32 localhost whoopsie[1292]: Uploading /var/crash/_usr_bin_thunar.1000.crash.
Sep 24 19:17:33 localhost whoopsie[1292]: Sent; server replied with: No error
Sep 24 19:17:33 localhost whoopsie[1292]: Response code: 200
Sep 24 19:17:33 localhost whoopsie[1292]: Reported OOPS ID 57598c5e-4449-11e4-99c2-fa163e4aaad4
Sep 24 19:17:47 localhost whoopsie[1292]: Sent; server replied with: No error
Sep 24 19:17:47 localhost whoopsie[1292]: Response code: 200
Sep 24 19:27:01 localhost kernel: [84204.624033] thunar[8529]: segfault at 50580df ip 00007f54a1a87297 sp 00007fffff43b910 error 4 in libglib-2.0.so.0.4000.0[7f54a1a23000+106000]
Sep 24 19:39:44 localhost kernel: [84967.753306] thunar[14061]: segfault at 5114161 ip 00007ff46c66796d sp 00007fffab414dd0 error 4 in libglib-2.0.so.0.4000.0[7ff46c604000+106000]
Sep 24 19:44:48 localhost kernel: [85271.116743] gmain[14476]: segfault at 514d8d1 ip 00007ffdb7069297 sp 00007ffdafe9da40 error 4 in libglib-2.0.so.0.4000.0[7ffdb7005000+106000]
Sep 24 19:49:21 localhost kernel: [85544.399031] gmain[14637]: segfault at 519bd0c ip 00007f2a62d91297 sp 00007f2a5bbc5a40 error 4 in libglib-2.0.so.0.4000.0[7f2a62d2d000+106000]
Sep 24 20:11:23 localhost kernel: [86865.335255] thunar[14935]: segfault at 52e2c04 ip 00007f31cb165297 sp 00007fff398f4420 error 4 in libglib-2.0.so.0.4000.0[7f31cb101000+106000]
Sep 24 20:14:00 localhost kernel: [87021.886465] gmain[15460]: segfault at 5302c8b ip 00007f3524e31297 sp 00007f351dc65a10 error 4 in libglib-2.0.so.0.4000.0[7f3524dcd000+106000]
Sep 24 20:19:22 localhost kernel: [87344.224063] thunar[15628]: segfault at 53589d9 ip 00007fd08c73897d sp 00007fffb53e3120 error 4 in libglib-2.0.so.0.4000.0[7fd08c6d5000+106000]
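
The kernel lines in this excerpt can be summarised mechanically. The Python sketch below tallies the crash site (instruction pointer minus mapping base), the faulting thread name, and the gap between consecutive crashes taken from the uptime stamp in brackets. `summarize` and the regular expression are illustrative names, and the offsets are only comparable because every line refers to the same libglib build.

```python
import re
from collections import Counter

# Kernel segfault lines copied from the syslog excerpt above.
KERNEL_LINES = """\
Sep 24 19:14:20 localhost kernel: [83443.846282] gmain[8368]: segfault at 4f983fa ip 00007f4d7aab3297 sp 00007f4d738e7a10 error 4 in libglib-2.0.so.0.4000.0[7f4d7aa4f000+106000]
Sep 24 19:27:01 localhost kernel: [84204.624033] thunar[8529]: segfault at 50580df ip 00007f54a1a87297 sp 00007fffff43b910 error 4 in libglib-2.0.so.0.4000.0[7f54a1a23000+106000]
Sep 24 19:39:44 localhost kernel: [84967.753306] thunar[14061]: segfault at 5114161 ip 00007ff46c66796d sp 00007fffab414dd0 error 4 in libglib-2.0.so.0.4000.0[7ff46c604000+106000]
Sep 24 19:44:48 localhost kernel: [85271.116743] gmain[14476]: segfault at 514d8d1 ip 00007ffdb7069297 sp 00007ffdafe9da40 error 4 in libglib-2.0.so.0.4000.0[7ffdb7005000+106000]
Sep 24 19:49:21 localhost kernel: [85544.399031] gmain[14637]: segfault at 519bd0c ip 00007f2a62d91297 sp 00007f2a5bbc5a40 error 4 in libglib-2.0.so.0.4000.0[7f2a62d2d000+106000]
Sep 24 20:11:23 localhost kernel: [86865.335255] thunar[14935]: segfault at 52e2c04 ip 00007f31cb165297 sp 00007fff398f4420 error 4 in libglib-2.0.so.0.4000.0[7f31cb101000+106000]
Sep 24 20:14:00 localhost kernel: [87021.886465] gmain[15460]: segfault at 5302c8b ip 00007f3524e31297 sp 00007f351dc65a10 error 4 in libglib-2.0.so.0.4000.0[7f3524dcd000+106000]
Sep 24 20:19:22 localhost kernel: [87344.224063] thunar[15628]: segfault at 53589d9 ip 00007fd08c73897d sp 00007fffb53e3120 error 4 in libglib-2.0.so.0.4000.0[7fd08c6d5000+106000]
""".splitlines()

PATTERN = re.compile(
    r"\[\s*(?P<uptime>\d+\.\d+)\] (?P<comm>\S+)\[\d+\]: segfault at [0-9a-f]+ "
    r"ip (?P<ip>[0-9a-f]+) .* in \S+\[(?P<base>[0-9a-f]+)\+[0-9a-f]+\]"
)


def summarize(lines):
    """Tally crash offsets and thread names, and list seconds between crashes."""
    offsets, threads, uptimes = Counter(), Counter(), []
    for line in lines:
        m = PATTERN.search(line)
        if m is None:
            continue  # skip whoopsie and other non-segfault lines
        offsets[hex(int(m.group("ip"), 16) - int(m.group("base"), 16))] += 1
        threads[m.group("comm")] += 1
        uptimes.append(float(m.group("uptime")))
    gaps = [later - earlier for earlier, later in zip(uptimes, uptimes[1:])]
    return offsets, threads, gaps


offsets, threads, gaps = summarize(KERNEL_LINES)
print(offsets.most_common())
print(dict(threads))
print([round(gap) for gap in gaps])
```

On these eight lines, six crashes share the offset 0x64297 and the other two land at 0x6396d and 0x6397d, split evenly between the `thunar` and `gmain` threads, with gaps ranging from under 3 minutes to about 22 minutes.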

superkuh (n-ubunt1-u) wrote :

As an aside, I have run memtest86 multiple times. My RAM is fine. No applications except Thunar (gmain) and Caja crash like this.

Launchpad Janitor (janitor) wrote :

Status changed to 'Confirmed' because the bug affects multiple users.

Changed in thunar (Ubuntu):
status: New → Confirmed
Norbert (nrbrtx)
Changed in thunar (Ubuntu):
status: Confirmed → Incomplete
superkuh (n-ubunt1-u) wrote :

I don't know when, but this bug was eventually fixed. I still use my Ubuntu 14.04 system on Canonical's ESM extension and it's been years since I've had Caja or Thunar crash like above. I consider this bug fixed. You can mark it closed.

Norbert (nrbrtx)
tags: removed: trusty
Launchpad Janitor (janitor) wrote :

[Expired for thunar (Ubuntu) because there has been no activity for 60 days.]

Changed in thunar (Ubuntu):
status: Incomplete → Expired