ssh x11 forwarding precise to oneiric causes glibc malloc(): memory corruption

Bug #968218 reported by Robert Hutton
This bug affects 10 people
Affects          Status         Importance   Assigned to   Milestone
libxi (Ubuntu)   Fix Released   High         Unassigned
  Lucid          Won't Fix      High         Unassigned
  Oneiric        Fix Released   High         Unassigned
  Precise        Fix Released   High         Unassigned

Bug Description

[Problem]
SSHing (with X11 forwarding enabled) from a Precise machine to an Oneiric machine and running certain X11 forwarded programs causes a crash of the program, either immediately or on the first mouse click on that program's window.

[Impact]
I have seen this on two client machines (a laptop and a desktop) running the latest precise release, connecting to either the oneiric desktop release or oneiric server release on the server side. I have also reproduced this with two VirtualBox VMs, connected together with host-only networking and with desktop releases of Oneiric and Precise installed.

[Development Fix]
This is a recognized upstream bug. When the X server sends a device class that libXi does not recognize, libXi sets up pointers to incorrect chunks of memory. The upstream patch fixes this by automatically skipping any unknown classes.

http://cgit.freedesktop.org/xorg/lib/libXi/commit/?h=libXi-1.4-branch&id=22e9ace88d57803ecda95db7c9355a614db1902a
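
As a rough illustration of the skip-unknown-classes approach described above (this is not the upstream C code), the sketch below walks a buffer of class records in Python. The record layout (type, length in 4-byte units, source id, padding), the type codes, and the names copy_known_classes and make_record are all assumptions made for the example.

```python
import struct

# Hypothetical record header: type, length (in 4-byte units), source id, padding.
HEADER = struct.Struct("<HHHH")
KNOWN_TYPES = {0, 1, 2}  # illustrative type codes, not taken from libXi


def copy_known_classes(buf, nclasses):
    """Walk nclasses records in buf and return only those with a known type."""
    kept = []
    offset = 0
    for _ in range(nclasses):
        ctype, length, source, _pad = HEADER.unpack_from(buf, offset)
        size = length * 4
        if size < HEADER.size:
            raise ValueError("record shorter than its header")
        if ctype in KNOWN_TYPES:
            payload = buf[offset + HEADER.size:offset + size]
            kept.append((ctype, source, payload))
        # Known or not, advance by the length the sender declared.
        offset += size
    return kept


def make_record(ctype, source, payload=b""):
    """Build one record for the demo; payload must be a multiple of 4 bytes."""
    assert len(payload) % 4 == 0
    length = (HEADER.size + len(payload)) // 4
    return HEADER.pack(ctype, length, source, 0) + payload


# Three records on the wire; the middle one has a type the client does not know.
wire = (
    make_record(1, 7, b"\x01\x00\x00\x00")
    + make_record(99, 7, b"\xff" * 8)
    + make_record(0, 7)
)
classes = copy_known_classes(wire, 3)
print(len(classes), [c[0] for c in classes])
```

Because the walker advances by the declared length whether or not it recognizes the type, an unknown record cannot shift the offsets of the records that follow it, and the length of the returned list is the number of classes actually kept.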

This is fixed in Precise already.
Debian also picked up the patch: http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=660411

[Stable Fix]
A backport of the above patch is provided in the following debdiff:
https://bugs.launchpad.net/ubuntu/+source/libxi/+bug/968218/+attachment/3053962/+files/libxi_1.4.3-3ubuntu1.1.debdiff

Backport from upstream commit 22e9ace88d on the 1.4 branch to not corrupt memory when the server sends unknown device classes. Minor changes were needed because of the XI2.1 ubuntu specific patch.

[Test Case]
- On client machine, install Precise with all updates as of 2012-03-29.
- On server machine, install Oneiric with all updates as of 2012-03-29.
- Set up host-only networking so that machines can ssh to each other.
- On client machine, "ssh -X" to the server machine. Then run an X11 application. Some applications will crash immediately or on the first mouse click on that application.

On my test VirtualBox setup, applications that always crash on first click:
- gnome-terminal
- nautilus
- aisleriot solitaire
- gnome-control-center
- file-roller
- brasero
- gcalctool
- palimpsest

Applications that do not crash:
- Libre Office
- gimp
- banshee
- firefox
- thunderbird

Obviously, this isn't an exhaustive list. When applications crash, they spew out a large error message. On my desktop machine, sshing in to an Oneiric Server install on a physical machine, the programs crash immediately without showing a window, but on my test setup with two VirtualBox VMs, you first have to click on the window to cause it to crash. Sample crash output is attached to this bug report in the crash_output.txt file.

[Regression Potential]
None known, after several months of testing and usage in Precise as well as upstream. The patch does change how pointers and memory initialization are handled, so it bears the usual risks associated with any such change; notably, one argument to copy_classes() changes type from int to pointer, but it's an internal function and all callers have been properly adjusted. The patch proposed for Oneiric is a slightly modified version of what went into Precise, but those changes were merely to make it apply against our patched xserver.

Things to look for in spotting potential regressions would be software or xserver crashes, with backtraces that pass through libxi functions. So, regressions would be fairly obvious with even light testing.
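
One way to triage a collected backtrace for this is sketched below. It is only a heuristic, and the function name frames_through_libxi is made up for the example: it assumes gdb-style frame lines that start with "#<n>" and that name the shared object they came from, which gdb does for frames without debug symbols. Frames with debug info may show a source file instead and would be missed.

```python
import re

# A gdb frame line starts with "#<n>" followed by the frame description.
FRAME = re.compile(r"^#(\d+)\s+(.*)$")


def frames_through_libxi(backtrace_text):
    """Return (frame_number, frame_text) pairs for frames that mention libXi."""
    hits = []
    for line in backtrace_text.splitlines():
        match = FRAME.match(line.strip())
        # Case-insensitive so both "libXi" and "libxi" spellings are caught.
        if match and "libxi" in match.group(2).lower():
            hits.append((int(match.group(1)), match.group(2)))
    return hits
```

Passing in the text of a saved backtrace file returns an empty list when no frame mentions libXi, which suggests (but does not prove) that the crash lies elsewhere.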

[Original Report]
SSHing (with X11 forwarding enabled) from a Precise machine to an Oneiric machine and running certain X11 forwarded programs causes a crash of the program, either immediately or on the first mouse click on that program's window.

I have seen this on two client machines (a laptop and a desktop) running the latest precise release, connecting to either the oneiric desktop release or oneiric server release on the server side. I have also reproduced this with two VirtualBox VMs, connected together with host-only networking and with desktop releases of Oneiric and Precise installed.

To reproduce:
- On client machine, install Precise with all updates as of 2012-03-29.
- On server machine, install Oneiric with all updates as of 2012-03-29.
- Set up host-only networking so that machines can ssh to each other.
- On client machine, "ssh -X" to the server machine. Then run an X11 application. Some applications will crash immediately or on the first mouse click on that application.

On my test VirtualBox setup, applications that always crash on first click:
- gnome-terminal
- nautilus
- aisleriot solitaire
- gnome-control-center
- file-roller
- brasero
- gcalctool
- palimpsest

Applications that do not crash:
- Libre Office
- gimp
- banshee
- firefox
- thunderbird

Obviously, this isn't an exhaustive list. When applications crash, they spew out a large error message. On my desktop machine, sshing in to an Oneiric Server install on a physical machine, the programs crash immediately without showing a window, but on my test setup with two VirtualBox VMs, you first have to click on the window to cause it to crash. Sample crash output is attached to this bug report in the crash_output.txt file.

OS and Software Versions:

Client:
=====
lsb_release -rd:
Description: Ubuntu precise (development branch)
Release: 12.04

apt-cache policy xorg:
xorg:
  Installed: 1:7.6+12ubuntu1
  Candidate: 1:7.6+12ubuntu1
  Version table:
 *** 1:7.6+12ubuntu1 0
        500 http://gb.archive.ubuntu.com/ubuntu/ precise/main amd64 Packages
        100 /var/lib/dpkg/status

Server:
======
lsb_release -rd:
Description: Ubuntu 11.10
Release: 11.10

apt-cache policy xorg:
xorg:
  Installed: 1:7.6+7ubuntu7.1
  Candidate: 1:7.6+7ubuntu7.1
  Version table:
 *** 1:7.6+7ubuntu7.1 0
        500 http://mirror.ox.ac.uk/sites/archive.ubuntu.com/ubuntu/ oneiric-updates/main amd64 Packages
        500 http://security.ubuntu.com/ubuntu/ oneiric-security/main amd64 Packages
        100 /var/lib/dpkg/status
     1:7.6+7ubuntu7 0
        500 http://mirror.ox.ac.uk/sites/archive.ubuntu.com/ubuntu/ oneiric/main amd64 Packages

ProblemType: Bug
DistroRelease: Ubuntu 12.04
Package: xorg 1:7.6+12ubuntu1
ProcVersionSignature: Ubuntu 3.2.0-20.33-generic 3.2.12
Uname: Linux 3.2.0-20-generic x86_64
.tmp.unity.support.test.0:

ApportVersion: 1.95-0ubuntu1
Architecture: amd64
CompizPlugins: [core,composite,opengl,compiztoolbox,decor,vpswitch,snap,mousepoll,resize,place,move,wall,grid,regex,imgpng,session,gnomecompat,animation,fade,unitymtgrabhandles,workarounds,scale,expo,ezoom,unityshell]
CompositorRunning: compiz
Date: Thu Mar 29 13:32:19 2012
DistUpgraded: Fresh install
DistroCodename: precise
DistroVariant: ubuntu
DkmsStatus: virtualbox, 4.1.10, 3.2.0-20-generic, x86_64: installed
ExtraDebuggingInterest: Yes, whatever it takes to get this fixed in Ubuntu
InstallationMedia: Ubuntu 12.04 LTS "Precise Pangolin" - Beta amd64+mac (20120327.1)
MachineType: Dell Inc. OptiPlex 960
ProcEnviron:
 LANGUAGE=en_GB:en
 TERM=xterm
 LANG=en_GB.UTF-8
 SHELL=/bin/bash
ProcKernelCmdLine: BOOT_IMAGE=/boot/vmlinuz-3.2.0-20-generic root=UUID=587e726d-6c02-4c7f-aec3-843dbfd68f4c ro quiet splash vt.handoff=7
SourcePackage: xorg
Symptom: display
UpgradeStatus: No upgrade log present (probably fresh install)
dmi.bios.date: 07/31/2009
dmi.bios.vendor: Dell Inc.
dmi.bios.version: A05
dmi.board.name: 0F428D
dmi.board.vendor: Dell Inc.
dmi.board.version: A00
dmi.chassis.type: 3
dmi.chassis.vendor: Dell Inc.
dmi.modalias: dmi:bvnDellInc.:bvrA05:bd07/31/2009:svnDellInc.:pnOptiPlex960:pvr:rvnDellInc.:rn0F428D:rvrA00:cvnDellInc.:ct3:cvr:
dmi.product.name: OptiPlex 960
dmi.sys.vendor: Dell Inc.
version.compiz: compiz 1:0.9.7.2-0ubuntu4
version.ia32-libs: ia32-libs N/A
version.libdrm2: libdrm2 2.4.32-1ubuntu1
version.libgl1-mesa-dri: libgl1-mesa-dri 8.0.2-0ubuntu2
version.libgl1-mesa-dri-experimental: libgl1-mesa-dri-experimental N/A
version.libgl1-mesa-glx: libgl1-mesa-glx 8.0.2-0ubuntu2
version.xserver-xorg-core: xserver-xorg-core 2:1.11.4-0ubuntu7
version.xserver-xorg-input-evdev: xserver-xorg-input-evdev 1:2.7.0-0ubuntu1
version.xserver-xorg-video-ati: xserver-xorg-video-ati 1:6.14.99~git20111219.aacbd629-0ubuntu2
version.xserver-xorg-video-intel: xserver-xorg-video-intel 2:2.17.0-1ubuntu4
version.xserver-xorg-video-nouveau: xserver-xorg-video-nouveau 1:0.0.16+git20111201+b5534a1-1build2

description: updated
Robert Hutton (rwh-helms-deep) wrote :

I've just tested SSHing from Precise to Natty and everything seems to work fine. So perhaps this is more truthfully a bug in Oneiric.

description: updated
Bryce Harrington (bryce) wrote :

Interesting. Offhand wonder if it's ABI incompatibilities somewhere in the libX11 chain.

Any chance we could get you to collect a full backtrace on this crash? - see http://wiki.ubuntu.com/X/Backtracing for guidance.

affects: xorg (Ubuntu) → xorg-server (Ubuntu)
Changed in xorg-server (Ubuntu):
importance: Undecided → High
status: New → Incomplete
Robert Hutton (rwh-helms-deep) wrote :

Hi Bryce,

I tried backtracing X11 using those instructions, but couldn't get X11 on either the Precise or the Oneiric side to give me a backtrace (because it didn't actually crash).

However, I managed to backtrace some of the applications that crashed, using the "Core Files" section of https://wiki.ubuntu.com/Backtrace (see attached).

Now that I think about it, it appears the apps that are crashing are GTK3, and the ones that aren't are GTK2...

-Rob

Robert Hooker (sarvatt) wrote :

Can you please give the libxi in this PPA a try on the oneiric machine?

https://launchpad.net/~sarvatt/+archive/green

It is a backport of http://cgit.freedesktop.org/xorg/lib/libXi/commit/?h=libXi-1.4-branch&id=22e9ace88d57803ecda95db7c9355a614db1902a which was shown to fix the same issue in other places, with minor changes to apply with our XI2.1 stuff in oneiric's libxi.

Mario Limonciello (superm1) wrote :

Hey Robert,

I actually was encountering this exact same crash with Virtualbox on an oneiric server when X forwarding from a precise machine. I installed that patched libxi6 on the oneiric server and it fixed the problem.

Thanks!

Robert Hooker (sarvatt)
affects: xorg-server (Ubuntu) → libxi (Ubuntu)
Changed in libxi (Ubuntu):
status: Incomplete → Triaged
Robert Hutton (rwh-helms-deep) wrote :

Hi Robert,

Great stuff! I did the following on the Oneiric machine:

sudo apt-add-repository ppa:sarvatt/green
sudo apt-get update
sudo apt-get upgrade

The following packages will be upgraded:
  aptdaemon aptdaemon-data libxi6 libxi6-dbg python-aptdaemon
  python-aptdaemon-gtk python-aptdaemon.gtk3widgets
  python-aptdaemon.gtkwidgets

sudo shutdown -r now

And now SSH X11 forwarding to that box from my desktop running Precise and my VirtualBox VM running Precise both work fine, at least for that list of applications above. :)

I hope this makes it into Oneiric before Precise is released; I suspect there'll be a whole lot more people hitting this bug if it's still there after release date. ;)

Thanks,

Rob

Robert Hooker (sarvatt) wrote :

Debdiff containing the fix

libxi (2:1.4.3-3ubuntu1.1) oneiric-proposed; urgency=low

  * Add libxi-unknown-device-class.patch: Backport from upstream commit
    22e9ace88d on the 1.4 branch to not corrupt memory when the server
    sends unknown device classes. Minor changes were needed because of
    the XI2.1 ubuntu specific patch. (LP: #968218)

 -- Robert Hooker <email address hidden> Mon, 02 Apr 2012 13:22:48 -0400

Ubuntu Foundations Team Bug Bot (crichton) wrote :

The attachment "libxi_1.4.3-3ubuntu1.1.debdiff" of this bug report has been identified as being a patch in the form of a debdiff. The ubuntu-sponsors team has been subscribed to the bug report so that they can review and hopefully sponsor the debdiff. In the event that this is in fact not a patch, you can resolve this situation by removing the tag 'patch' from the bug report and editing the attachment so that it is not flagged as a patch. Additionally, if you are a member of the ubuntu-sponsors team, please also unsubscribe the team from this bug report.

[This is an automated message performed by a Launchpad user owned by Brian Murray. Please contact him regarding any issues with the action taken in this bug report.]

tags: added: patch
Timo Aaltonen (tjaalton) wrote :

should be fixed in precise already.

Changed in libxi (Ubuntu):
status: Triaged → Fix Released
Changed in libxi (Ubuntu Oneiric):
importance: Undecided → High
status: New → Triaged
Bryce Harrington (bryce)
description: updated
Bryce Harrington (bryce) wrote :

Adding a task for Lucid as per jcristau's recommendation:

<jcristau> it's still somewhere on my to-do list to fix it in squeeze... (http://bugs.debian.org/661652)
<ubottu> Debian bug 661652 in release.debian.org "pu: package libxi/2:1.3-7" [Normal,Open]
<jcristau> it affects squeeze so presumably also lucid

description: updated
Bryce Harrington (bryce) wrote :

Here's a modified version of Robert's patch to apply against the Lucid libxi.

Basically it's the same patch less the changes to the Touch cases, as those code branches aren't present in the Lucid version (it predates the Touch work).

Bryce Harrington (bryce) wrote :

Instead of the patch in #16 (which has an error), I found that the original upstream patch applies and builds fine against lucid's libxi 1.3. Comparing the upstream patch with what jcristau prepared for debian, they look functionally the same (just formatting changes). There are no Ubuntu changes to the package in lucid, so I think carrying the upstream patch is the cleaner solution.

I sponsored an upload of Robert's debdiff for the oneiric-proposed package, and uploaded a package for lucid-proposed with the original patch.

It should be pretty obvious whether these work or not, but please do ensure both oneiric-proposed and lucid-proposed get tested for verification.

Changed in libxi (Ubuntu Oneiric):
status: Triaged → Fix Committed
Changed in libxi (Ubuntu Lucid):
importance: Undecided → High
status: New → Fix Committed
Bryce Harrington (bryce) wrote :

[Subbed ubuntu-sru, unsub ubuntu-sponsors.]

Clint Byrum (clint-fewbar) wrote : Please test proposed package

Hello Robert, or anyone else affected,

Accepted libxi into oneiric-proposed. The package will build now and be available in a few hours. Please test and give feedback here. See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Thank you in advance!

tags: added: verification-needed
Clint Byrum (clint-fewbar) wrote :

Hello Robert, or anyone else affected,

Accepted libxi into lucid-proposed. The package will build now and be available in a few hours. Please test and give feedback here. See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Thank you in advance!

WolverinePL (podusowski) wrote :

Installed libxi6/oneiric-proposed and confirmed that X forwarding is working now.

tags: added: verification-done-oneiric
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package libxi - 2:1.4.3-3ubuntu1.1

---------------
libxi (2:1.4.3-3ubuntu1.1) oneiric-proposed; urgency=low

  * Add libxi-unknown-device-class.patch: Backport from upstream commit
    22e9ace88d on the 1.4 branch to not corrupt memory when the server
    sends unknown device classes. Minor changes were needed because of
    the XI2.1 ubuntu specific patch. (LP: #968218)
 -- Robert Hooker <email address hidden> Mon, 02 Apr 2012 13:22:48 -0400

Changed in libxi (Ubuntu Oneiric):
status: Fix Committed → Fix Released
Henry Wertz (hwertz) wrote :

Note, I see this bug too, between my Natty netbook and my Gentoo desktop (once the X server was upgraded to 1.12.99, and still with 1.13.0). Not sure why, because this computer just has a mouse and keyboard, no touch screen or touchpad to provide exciting new XInput events. Installing libxi6_1.4.3-3ubuntu1.1_i386.deb from oneiric-updates works fine (no dependency problems, and it fixes the bug).

Brian Murray (brian-murray) wrote : Verification still needed

The fix for this bug has been awaiting testing feedback in the -proposed repository for lucid for more than 90 days. Please test this fix and update the bug appropriately with the results. In the event that the fix for this bug is still not verified 15 days from now, the package will be removed from the -proposed repository.

tags: added: removal-candidate
Brian Murray (brian-murray) wrote :

The version of libxi in lucid-proposed has been removed as this bug report was not verified in a timely fashion.

tags: removed: verification-needed
tags: removed: removal-candidate
Changed in libxi (Ubuntu Lucid):
status: Fix Committed → Triaged
Rolf Leggewie (r0lf) wrote :

lucid has seen the end of its life and is no longer receiving any updates. Marking the lucid task for this ticket as "Won't Fix".

Changed in libxi (Ubuntu Lucid):
status: Triaged → Won't Fix