Oneiric LEB: Unity fails to start with a segfault at libnux

Bug #880486 reported by Ricardo Salveti
This bug affects 1 person
Affects              Status        Importance  Assigned to      Milestone
Linaro Ubuntu        Fix Released  Critical    Ricardo Salveti
Nux                  Fix Released  Undecided   Unassigned
Unity                Fix Released  Undecided   Unassigned
Unity GLES port      Fix Released  Critical    Travis Watkins
nux (Ubuntu)         Fix Released  Undecided   Unassigned
unity (Ubuntu)       Fix Released  Undecided   Unassigned

Bug Description

Hwpack: http://snapshots.linaro.org/oneiric/lt-panda-x11-base-oneiric/20111020/1/images/hwpack/hwpack_linaro-lt-panda-x11-base_20111020-1_armel_supported.tar.gz
Image: http://snapshots.linaro.org/oneiric/linaro-o-ubuntu-desktop/20111023/0/images/tar/linaro-o-ubuntu-desktop-tar-20111023-0.tar.gz

The desktop fails to start, but you can reproduce the crash from the console:

$ su - linaro
$ export DISPLAY=:0.0
$ gdb compiz
GNU gdb (Ubuntu/Linaro 7.3-0ubuntu2) 7.3-2011.08
Copyright (C) 2011 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law. Type "show copying"
and "show warranty" for details.
This GDB was configured as "arm-linux-gnueabi".
For bug reporting instructions, please see:
<http://bugs.launchpad.net/gdb-linaro/>...
Reading symbols from /usr/bin/compiz...(no debugging symbols found)...done.
(gdb) r --replace composite opengl unityshell
Starting program: /usr/bin/compiz --replace composite opengl unityshell
[Thread debugging using libthread_db enabled]
PVR:(Warning): InitContext: ignoring buffer type CBUF_TYPE_PDS_VERT_SECONDARY_PREGEN_BUFFER [778, /eglglue.c]
[New Thread 0x446bd2b0 (LWP 2786)]
[New Thread 0x4561e2b0 (LWP 2788)]
[New Thread 0x469f52b0 (LWP 2789)]
WARN 2011-10-23 17:53:43 glib.glib-gobject <unknown>:0 invalid (NULL) pointer instance
Xlib: extension "XINERAMA" missing on display ":0.0".

Program received signal SIGSEGV, Segmentation fault.
0x41313890 in nux::XInputWindow::SetStruts() () from /usr/lib/libnux-graphics-1.0.so.0
(gdb) bt full
#0 0x41313890 in nux::XInputWindow::SetStruts() () from /usr/lib/libnux-graphics-1.0.so.0
No symbol table info available.
#1 0x41137136 in UnityScreen::initLauncher(nux::NThread*, void*) () from /usr/lib/compiz/libunityshell.so
No symbol table info available.
#2 0x411375d2 in UnityScreen::initUnity(nux::NThread*, void*) () from /usr/lib/compiz/libunityshell.so
No symbol table info available.
#3 0x41297b8a in nux::WindowThread::Run(void*) () from /usr/lib/libnux-1.0.so.0
No symbol table info available.
#4 0x41133d12 in UnityScreen::UnityScreen(CompScreen*) () from /usr/lib/compiz/libunityshell.so
No symbol table info available.
#5 0x4113ab92 in PluginClassHandler<UnityScreen, CompScreen, 0>::get(CompScreen*) () from /usr/lib/compiz/libunityshell.so
No symbol table info available.
#6 0x4113592e in UnityWindow::UnityWindow(CompWindow*) () from /usr/lib/compiz/libunityshell.so
No symbol table info available.
#7 0x4113ad64 in PluginClassHandler<UnityWindow, CompWindow, 0>::get(CompWindow*) () from /usr/lib/compiz/libunityshell.so
No symbol table info available.
#8 0x4113adfc in CompPlugin::VTableForScreenAndWindow<UnityScreen, UnityWindow>::initWindow(CompWindow*) () from /usr/lib/compiz/libunityshell.so
No symbol table info available.
#9 0x00049a48 in CompPlugin::windowInitPlugins(CompWindow*) ()
No symbol table info available.
#10 0x0003e758 in CompWindow::CompWindow(unsigned long, XWindowAttributes&, PrivateWindow*) ()
No symbol table info available.
#11 0x0003e9fa in CoreWindow::manage(unsigned long, XWindowAttributes&) ()
No symbol table info available.
#12 0x0003209e in CompScreen::init(char const*) ()
No symbol table info available.
#13 0x0002835a in CompManager::init() ()
No symbol table info available.
#14 0x00025d4a in main ()
No symbol table info available.
(gdb) quit

I'm generating the dbg package for nux and will get a better trace later.

Revision history for this message
Ricardo Salveti (rsalveti) wrote :

Packages:
nux-tools 1.16.0.2011.10-0linaro1
libnux-1.0-0 1.16.0.2011.10-0linaro1
libnux-1.0-common 1.16.0.2011.10-0linaro1
unity 4.24.0.2011.10-0linaro1
compiz 1:0.9.6+bzr20110929.2011.10-0linaro2

Changed in linaro-ubuntu:
status: New → Confirmed
Changed in unity-gles:
status: New → Confirmed
Changed in linaro-ubuntu:
importance: Undecided → Critical
Changed in unity-gles:
importance: Undecided → Critical
Changed in linaro-ubuntu:
milestone: none → 11.10
Changed in unity-gles:
assignee: nobody → Travis Watkins (amaranth)
Changed in linaro-ubuntu:
assignee: nobody → Ricardo Salveti (rsalveti)
Revision history for this message
Ricardo Salveti (rsalveti) wrote :

A better trace:
Program received signal SIGSEGV, Segmentation fault.
nux::XInputWindow::SetStruts (this=0x6265c0) at ./XInputWindow.cpp:131
131 tmp_rect.height = info[i].height;
(gdb) bt full
#0 nux::XInputWindow::SetStruts (this=0x6265c0) at ./XInputWindow.cpp:131
        i = <optimized out>
        total_screen_region = 0x5e7a50
        info = 0x0
        screen_region = <optimized out>
        intersection = 0x6267a8
        tmp_rect = {x = 100, y = 100, width = 320, height = 200}
        largestHeight = <optimized out>
        screenWidth = <optimized out>
        screenHeight = <optimized out>
        data = {0 <repeats 12 times>}
        n_info = 1092141056
        input_window_region = 0x62e268
        largestWidth = <optimized out>
#1 0x41139136 in UnityScreen::initLauncher(nux::NThread*, void*) ()
   from /usr/lib/compiz/libunityshell.so
No symbol table info available.
#2 0x411395d2 in UnityScreen::initUnity(nux::NThread*, void*) ()
   from /usr/lib/compiz/libunityshell.so
No symbol table info available.
#3 0x41299b8a in nux::WindowThread::Run (this=0x1c6c08, arg=<optimized out>)
    at ./WindowThread.cpp:840
No locals.
#4 0x41135d12 in UnityScreen::UnityScreen(CompScreen*) ()
   from /usr/lib/compiz/libunityshell.so
No symbol table info available.
#5 0x4113cb92 in PluginClassHandler<UnityScreen, CompScreen, 0>::get(CompScreen*) () from /usr/lib/compiz/libunityshell.so
No symbol table info available.
#6 0x4113792e in UnityWindow::UnityWindow(CompWindow*) ()
   from /usr/lib/compiz/libunityshell.so
No symbol table info available.
#7 0x4113cd64 in PluginClassHandler<UnityWindow, CompWindow, 0>::get(CompWindow*) () from /usr/lib/compiz/libunityshell.so
No symbol table info available.
#8 0x4113cdfc in CompPlugin::VTableForScreenAndWindow<UnityScreen, UnityWindow>::initWindow(CompWindow*) () from /usr/lib/compiz/libunityshell.so
No symbol table info available.
#9 0x00049a48 in CompPlugin::windowInitPlugins(CompWindow*) ()
No symbol table info available.
#10 0x0003e758 in CompWindow::CompWindow(unsigned long, XWindowAttributes&, PrivateWindow*) ()
No symbol table info available.
#11 0x0003e9fa in CoreWindow::manage(unsigned long, XWindowAttributes&) ()
No symbol table info available.
#12 0x0003209e in CompScreen::init(char const*) ()
No symbol table info available.
#13 0x0002835a in CompManager::init() ()
No symbol table info available.
#14 0x00025d4a in main ()
No symbol table info available.
[Current thread is 1 (Thread 0x40cb6a10 (LWP 27458))]
(gdb) l
126 for (int i = 0; i < n_info; i++)
127 {
128 tmp_rect.x = info[i].x_org;
129 tmp_rect.y = info[i].y_org;
130 tmp_rect.width = info[i].width;
131 tmp_rect.height = info[i].height;
132
133 screen_region = XCreateRegion ();
134
135 XUnionRectWithRegion (&tmp_rect, screen_region, screen_region);

Avik Sil (aviksil)
tags: added: linaro-ubuntu lt-panda unity-3d
tags: added: hwgfx
Revision history for this message
Ricardo Salveti (rsalveti) wrote :

The crash happens because the X server has no XINERAMA support, and nux expects it to always be enabled.

"./NuxGraphics/XInputWindow.cpp" line 107

    XineramaScreenInfo *info = XineramaQueryScreens (display_, &n_info);

Instead of checking whether info is NULL (Xinerama unavailable), it goes straight to reading the struct contents, and then it explodes.

First, it would be good to understand why Xinerama is not enabled by default; then nux should be changed to handle this situation.
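
A minimal sketch of the missing check, not the actual nux patch. ScreenInfo and Rect are stand-ins that mirror the fields of XineramaScreenInfo and XRectangle so the example builds without X11 headers, and CollectScreenRects is a hypothetical helper name. The only point is that info is tested before the loop reads from it:

```cpp
#include <vector>

// Stand-in for XineramaScreenInfo so the sketch builds without X11 headers.
struct ScreenInfo {
  int screen_number;
  short x_org, y_org, width, height;
};

// Same field layout as Xlib's XRectangle.
struct Rect {
  short x, y;
  unsigned short width, height;
};

// Hypothetical helper: copies each screen's geometry, but only after
// confirming that the query actually returned an array.
std::vector<Rect> CollectScreenRects(const ScreenInfo* info, int n_info) {
  std::vector<Rect> rects;
  if (info == nullptr || n_info <= 0)
    return rects;  // No Xinerama data: nothing to read, so do not loop.

  for (int i = 0; i < n_info; i++) {
    Rect r;
    r.x = info[i].x_org;
    r.y = info[i].y_org;
    r.width = static_cast<unsigned short>(info[i].width);
    r.height = static_cast<unsigned short>(info[i].height);
    rects.push_back(r);
  }
  return rects;
}
```

With this guard, a NULL info combined with a garbage count such as the 1092141056 seen in the trace yields an empty list instead of a segfault.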

Revision history for this message
Jesse Barker (jesse-barker) wrote :

On a platform where you would never encounter more than one GPU/pipe (like an SoC with no option for additional plug-in graphics cards), I would never expect to see Xinerama enabled.

Ricardo is quite right, though. Failure to check the return value on the query screens call before dereferencing is _BAD_. Does nux actually require something from Xinerama, or does it just assume its existence?

Revision history for this message
Jesse Barker (jesse-barker) wrote :

I suspect that the variable "n_info" is never initialized and stays that way when XineramaQueryScreens fails, which causes the bad dereference. Variable "monitor" needs a default initialization (for that matter, it probably doesn't need to be a Xinerama type, as the "screen_number" field is not used in this context), and "n_info" can be forced to 0 when info==NULL, which should allow the rest of the Xinerama-agnostic code to function normally.
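
A rough sketch of that idea, under stated assumptions: ScreenInfo is a stand-in for XineramaScreenInfo, the two Query* functions are contrived stand-ins for a failing XineramaQueryScreens call (without the Display argument), and QueryScreensSafely is a hypothetical wrapper name. It is not the code that was committed:

```cpp
// Stand-in for XineramaScreenInfo so the sketch builds without X11 headers.
struct ScreenInfo {
  int screen_number;
  short x_org, y_org, width, height;
};

// Shaped like XineramaQueryScreens(Display*, int*), minus the Display.
typedef ScreenInfo* (*QueryFn)(int* number);

// Contrived failure case 1: returns NULL and never writes to *number.
ScreenInfo* QueryLeavingCountUntouched(int* /*number*/) { return nullptr; }

// Contrived failure case 2: returns NULL but leaves a bogus count behind.
ScreenInfo* QueryWritingBogusCount(int* number) {
  *number = 1092141056;
  return nullptr;
}

struct ScreenList {
  ScreenInfo* info;
  int count;
};

// Start the count at 0 and force it back to 0 on a NULL result, so a later
// `for (i = 0; i < count; i++)` loop can never walk a NULL array.
ScreenList QueryScreensSafely(QueryFn query) {
  int n_info = 0;
  ScreenInfo* info = query(&n_info);
  if (info == nullptr)
    n_info = 0;
  ScreenList result = {info, n_info};
  return result;
}
```

Either failure mode ends with a count of 0, which is what lets the loop at XInputWindow.cpp:126 fall through without touching info.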

Revision history for this message
Ricardo Salveti (rsalveti) wrote :

The manual says that XineramaQueryScreens returns NULL and sets number to 0 if Xinerama is not active. Checking with gdb, the value of n_info was 1092186112, which is clearly wrong.

I'm changing and building the code locally to check the return code properly and will post my test results when I'm done.

Revision history for this message
Ricardo Salveti (rsalveti) wrote :

I just wonder why Travis didn't find this issue while he was testing on his panda...

It could be that he validated the work on his own desktop, while building for GLES.

Revision history for this message
Jesse Barker (jesse-barker) wrote :

That's the behavior if the extension is present but not active. If it is simply not present, it just returns NULL and generates the warning string:

Xlib: extension "XINERAMA" missing on display ":0.0".

At any rate, the fix will likely be the same. If Xinerama is either not there or not enabled, then there is only one screen whose geometry matters.
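
One way to express that fallback, as a sketch only. ScreenInfo stands in for XineramaScreenInfo, MonitorRects is a hypothetical helper, and the screen_width and screen_height parameters are assumed to come from the X screen itself (in Xlib, for example, the DisplayWidth and DisplayHeight macros):

```cpp
#include <cassert>
#include <vector>

// Stand-in for XineramaScreenInfo so the sketch builds without X11 headers.
struct ScreenInfo {
  int screen_number;
  short x_org, y_org, width, height;
};

struct Rect {
  int x, y, width, height;
};

// Hypothetical helper: one rectangle per Xinerama screen, or a single
// rectangle covering the whole X screen when no Xinerama data exists.
std::vector<Rect> MonitorRects(const ScreenInfo* info, int n_info,
                               int screen_width, int screen_height) {
  assert(screen_width > 0 && screen_height > 0);
  std::vector<Rect> rects;

  if (info == nullptr || n_info <= 0) {
    // Extension missing or inactive: treat the whole screen as one monitor.
    Rect whole = {0, 0, screen_width, screen_height};
    rects.push_back(whole);
    return rects;
  }

  for (int i = 0; i < n_info; i++) {
    Rect r = {info[i].x_org, info[i].y_org, info[i].width, info[i].height};
    rects.push_back(r);
  }
  return rects;
}
```

Callers then always get at least one rectangle, so code that computes struts does not need a separate Xinerama and non-Xinerama path.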

Revision history for this message
Ricardo Salveti (rsalveti) wrote :

I simply initialized n_info to 0 and it seems to have worked fine this time. I'm not in front of my monitor, but I was able to get a picture with import: http://rsalveti.net/tmp/unity.png

Will get the patch done right this time and will push a new package to the overlay.

I also copied the new libnux-graphics to http://rsalveti.net/tmp/libnux-graphics-1.0.so.0.1600.0 in case someone wants to test it. Just replace the one installed by libnux with it, and in theory unity should just work.

Revision history for this message
Jesse Barker (jesse-barker) wrote :

The n_info initialization is the critical piece. However, "monitor" may go uninitialized for the section at the bottom when the window manager property "data" is updated (so it may work, but may not be quite what you want).

Revision history for this message
Travis Watkins (amaranth) wrote :

Fix has been committed to nux-gles2 branch, will verify it works on panda in a moment. Does not seem to cause a regression on desktop at least.

Changed in unity-gles:
status: Confirmed → Fix Committed
Fathi Boudra (fboudra)
Changed in linaro-ubuntu:
status: Confirmed → Fix Committed
Changed in linaro-ubuntu:
status: Fix Committed → Fix Released
Changed in unity-gles:
status: Fix Committed → Fix Released
Revision history for this message
Ilias Biris (ibiris) wrote :

Didier, does this still affect you? See the fix released in 11.10.

Changed in unity:
status: New → Opinion
Revision history for this message
Ilias Biris (ibiris) wrote :

Didier, does this still affect you? See the fix made in 11.10.

Changed in nux (Ubuntu):
status: New → Opinion
Changed in unity-gles:
milestone: none → 2011.10
Revision history for this message
Ilias Biris (ibiris) wrote :

Ricardo, do you see this as still being an issue for Nux? Or can we close it?

Changed in nux:
status: New → Opinion
Revision history for this message
Didier Roche-Tolomelli (didrocks) wrote :

@ibiris: Opinion is not the right status for asking this. Incomplete is.
In addition, no new Nux or Unity has been pushed to oneiric since the day this bug was reported (2011-10-23), so I would think it's still an issue for the reporter.

Changed in nux:
status: Opinion → New
Changed in unity:
status: Opinion → New
Changed in nux (Ubuntu):
status: Opinion → New
Revision history for this message
Ilias Biris (ibiris) wrote :

@didrocks: thanks, I was ambivalent while reading help.launchpad.net which status to choose ; 'opinion' seemed a better choice.

As for the issue you mention, the reporter also was assigned to release a new version for the Linaro Oneiric LEB. Which he did. Anyway I am also asking the reporter to confirm if it is still an issue for him, in which case we can look at it.

Changed in nux:
status: New → Incomplete
Changed in unity:
status: New → Incomplete
Changed in nux (Ubuntu):
status: New → Incomplete
Revision history for this message
Ricardo Salveti (rsalveti) wrote : Re: [Bug 880486] Re: Oneiric LEB: Unity fails to start with a segfault at libnux

The fix was included in the Ubuntu LEB, but I'm not yet sure whether it's part
of nux upstream. Travis or didrocks should probably know better.

Here's the patch that was used for our release:
http://bazaar.launchpad.net/~linaro-maintainers/nux/overlay/revision/301

Revision history for this message
Travis Watkins (amaranth) wrote :

This patch went upstream at the start of the month, http://bazaar.launchpad.net/~unity-team/nux/2.0/revision/515 is the relevant commit.

Changed in nux (Ubuntu):
status: Incomplete → Fix Released
Changed in nux:
status: Incomplete → Fix Released
Changed in unity:
status: Incomplete → Fix Released
Changed in unity (Ubuntu):
status: New → Fix Released