hald segfaulting

Bug #8707 reported by Matt Zimmerman
Affects: hal (Ubuntu)
Status: Fix Released
Importance: High
Assigned to: Martin Pitt

Bug Description

Program received signal SIGSEGV, Segmentation fault.
0x401a9a23 in strlen () from /lib/tls/i686/cmov/libc.so.6
(gdb) bt
#0 0x401a9a23 in strlen () from /lib/tls/i686/cmov/libc.so.6
#1 0x0805c7fe in pci_device_pre_process (self=0x80792c0, d=0x8197210,
    sysfs_path=0x81a3c50 "/sys/devices/pci0000:00/0000:00:1e.0/0000:02:02.0",
    device=0x81a24e8) at linux/pci_bus_device.c:417
#2 0x0805b1f1 in bus_device_got_parent (store=0x809fda0, parent=0x0,
    user_data=0x81a3bf8) at linux/bus_device.c:206
#3 0x0805b059 in bus_device_visit (self=0x80792c0,
    path=0x80f58b8 "��\r\b\005", device=0x81a2310) at linux/bus_device.c:137
#4 0x080599d1 in add_device (given_sysfs_path=0x0, subsystem=0x80c5158 "pci",
    msg=0x0) at linux/osspec.c:961
#5 0x08059162 in process_coldplug_list () at linux/osspec.c:755
#6 0x0804c04f in hald_marshal_VOID__OBJECT_BOOLEAN (closure=0x8138cb0,
    return_value=0x0, n_param_values=3, param_values=0xbffff340,
    invocation_hint=0xbffff238, marshal_data=0x0) at hald_marshal.c:122
#7 0x400297eb in g_closure_invoke () from /usr/lib/libgobject-2.0.so.0
#8 0x4003be02 in g_signal_emit_by_name () from /usr/lib/libgobject-2.0.so.0
#9 0x4003ae5c in g_signal_emit_valist () from /usr/lib/libgobject-2.0.so.0
#10 0x4003b140 in g_signal_emit () from /usr/lib/libgobject-2.0.so.0
#11 0x0804ffc4 in hal_device_store_add (store=0x40055788, device=0x0)
    at device_store.c:183
#12 0x0805b0fd in bus_device_move_from_tdl_to_gdl (device=0x80f58b8,
    user_data=0x8140bb8) at linux/bus_device.c:165
#13 0x4003c2d2 in g_cclosure_marshal_VOID__VOID ()
   from /usr/lib/libgobject-2.0.so.0
#14 0x400297eb in g_closure_invoke () from /usr/lib/libgobject-2.0.so.0
#15 0x4003be02 in g_signal_emit_by_name () from /usr/lib/libgobject-2.0.so.0
#16 0x4003ae5c in g_signal_emit_valist () from /usr/lib/libgobject-2.0.so.0
#17 0x4003b140 in g_signal_emit () from /usr/lib/libgobject-2.0.so.0
#18 0x0804ec80 in hal_device_callouts_finished (device=0x0) at device.c:1030
#19 0x0804c581 in iochn_data (source=0x0, condition=G_IO_IN, user_data=0x0)
    at callout.c:240
#20 0x400a722f in g_vasprintf () from /usr/lib/libglib-2.0.so.0
#21 0x400847ed in g_main_depth () from /usr/lib/libglib-2.0.so.0
#22 0x40085818 in g_main_context_dispatch () from /usr/lib/libglib-2.0.so.0
#23 0x40085b3a in g_main_context_dispatch () from /usr/lib/libglib-2.0.so.0
#24 0x40086113 in g_main_loop_run () from /usr/lib/libglib-2.0.so.0
#25 0x0805142b in main (argc=3, argv=0xbffffb04) at hald.c:509

Revision history for this message
Brandon Hale (brandon) wrote :

Same here, hald 0.2.98 segfaults on startup.
Attaching an strace with hald --daemon=no, and also
output from hald --daemon=no --verbose=yes.

Brandon Hale (brandon) wrote :

Created an attachment (id=303)
strace

Brandon Hale (brandon) wrote :

Created an attachment (id=304)
verbose output

Martin Pitt (pitti) wrote :

(In reply to comment #1)
> Same here, hald 0.2.98 segfaults on startup.
> Attaching an strace with hald --daemon=no, and also
> output from hald --daemon=no --verbose=yes.

Hi Brandon!

Hmm, unfortunately this does not really help me to find the problem. I need to
debug hal on the machine where it crashes, since hal often processes strings
returned by particular PCI/USB devices. Do you have a machine where hal crashes
and which does not contain sensitive data, where you could allow me temporary
root access to debug this? If so, http://www.piware.de/mpitt-ssh.pub contains my
SSH public key, just append it to /root/.ssh/authorized_keys (and make sure that
the package openssh-server is installed).

Matt, Brandon, are your crashes really the same? If Brandon's hal reproducibly
crashes at startup, but Matt's crashes only from time to time, then these might
be two unrelated segfaults.

Matt, can you reproduce your crash as well? The immediate cause might be that
the string given to strlen is either NULL or not properly 0-terminated. The
path argument in stack frame #3 does not look too healthy either.
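
To illustrate the two failure modes mentioned above (a NULL pointer and a missing terminator), here is a minimal sketch of a defensive length check. The helper name bounded_strlen and its max parameter are hypothetical and are not part of hal; the real code at linux/pci_bus_device.c:417 is not shown in this report.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not part of hal: length of s, looking at no more
 * than max bytes. Returns 0 for a NULL pointer and max when no terminator
 * is found within the first max bytes. */
size_t bounded_strlen(const char *s, size_t max)
{
    if (s == NULL)
        return 0;               /* strlen(NULL) would dereference NULL */

    /* memchr stops at the first '\0' or after max bytes, whichever is first */
    const char *end = memchr(s, '\0', max);
    return end != NULL ? (size_t)(end - s) : max;
}
```

Plain strlen has neither guard: it reads from the pointer it is given until it finds a 0 byte, which is why a NULL argument ends in SIGSEGV inside libc, as in frame #0.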

Martin Pitt (pitti) wrote :

(In reply to comment #0)
> Program received signal SIGSEGV, Segmentation fault.
> #3 0x0805b059 in bus_device_visit (self=0x80792c0,
> path=0x80f58b8 "��\r\b\005", device=0x81a2310) at linux/bus_device.c:137
> #4 0x080599d1 in add_device (given_sysfs_path=0x0, subsystem=0x80c5158 "pci",
> msg=0x0) at linux/osspec.c:961

That's it: given_sysfs_path is NULL, but this parameter is copied and used in
add_device without checking.

Martin Pitt (pitti) wrote :

Created an attachment (id=332)
patch to check given_sysfs_path parameter before using it

Should be safe, the function already returns NULL in other error cases.
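
For readers without access to the attachment, the shape of such a guard can be sketched as follows. This is not the actual patch: copy_sysfs_path is a hypothetical stand-in for the part of add_device that copies the parameter, and it only assumes what is stated above, namely that the parameter is copied and that returning NULL is an accepted error result.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the start of add_device(): returns a heap copy
 * of the path, or NULL when no path was given or allocation fails. The
 * caller frees the result. */
char *copy_sysfs_path(const char *given_sysfs_path)
{
    if (given_sysfs_path == NULL)
        return NULL;            /* bail out before any copy or compare */

    size_t len = strlen(given_sysfs_path) + 1;   /* include the terminator */
    char *copy = malloc(len);
    if (copy != NULL)
        memcpy(copy, given_sysfs_path, len);
    return copy;
}
```

Because callers already have to handle a NULL result from the other error paths, an early return of this kind should not introduce a new case for them.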

Matt Zimmerman (mdz) wrote :

Patch looks good

Martin Pitt (pitti) wrote :

Fixed in:
 hal (0.2.98-1ubuntu5) warty; urgency=low
 .
   * added patch nofail_nocaps: do not exit hald if capabilities cannot be
     installed (which happens on kernels which do not support capabilities),
     since only few features actually depend on additional capabilities
     (currently only the "link" detection of MII ethernet cards)
     (Warty bug #8721)
   * added patch fix_first_hotplug: the first hotplug event was sometimes not
     recognized properly, this patch should fix that. Thanks to Sjoerd Simons
     for finding it. (Warty bug #8689)
   * added patch add_device_nullarg: check whether given_sysfs_path is NULL and
     immediately return in this case; previously, this parameter was copied and
     compared without checking. (Warty bug #8707)

Martin Pitt (pitti) wrote :

Brandon, can you please check whether this fixes your crash as well? Thanks!

Martin Pitt (pitti) wrote :

Brandon's segfault had yet another cause: an unchecked pointer access in
pci_device_pre_process() in hald/linux/pci_bus_device.c, so I am reopening this
bug. I have a patch ready and will let Brandon test the new package.

Martin Pitt (pitti) wrote :

Created an attachment (id=339)
Patch to fix Brandon's segfault

Just adds NULL checking, should be very safe.

Martin Pitt (pitti) wrote :

Fixed in:
 hal (0.2.98-1ubuntu6) warty; urgency=low
 .
   * added patch pci_pre_process_check_null to fix yet another segfault due to
     an unchecked pointer access in pci_device_pre_process() in
     hald/linux/pci_bus_device.c.
