qemu squeeze crashes "BUG: unable to handle kernel NULL pointer dereference at (null)"

Bug #735752 reported by Aidar Kamalov
6
This bug affects 1 person
Affects Status Importance Assigned to Milestone
QEMU
Invalid
Undecided
Unassigned

Bug Description

my virtual machine server (qemu+libvirt) regularly breaks down with such a record in the logs
I can not even ping the guest, but i can ping host, but can not do something with it (cannot ssh login for example)
And I dont know how to reproduce the problem :(

Mar 15 17:58:04 mainhost kernel: [65866.976982] BUG: unable to handle kernel NULL pointer dereference at (null)
Mar 15 17:58:04 mainhost kernel: [65866.977422] IP: [<ffffffff8100efbe>] 0xffffffff8100efbe
Mar 15 17:58:04 mainhost kernel: [65866.977663] PGD 7387b7067 PUD 81b723067 PMD 0.
Mar 15 17:58:04 mainhost kernel: [65866.977902] Oops: 0000 [#1] SMP.
Mar 15 17:58:04 mainhost kernel: [65866.978128] last sysfs file: /sys/devices/system/cpu/cpu3/topology/thread_siblings
Mar 15 17:58:04 mainhost kernel: [65866.978572] CPU 1.
Mar 15 17:58:04 mainhost kernel: [65866.978577] Modules linked in: nfs lockd nfs_acl auth_rpcgss sunrpc ebtable_nat ebtables coretemp bridge stp llc xt_state
Mar 15 17:58:04 mainhost kernel: [65866.979737].
Mar 15 17:58:04 mainhost kernel: [65866.979959] Pid: 3369, comm: qemu-system-x86 Not tainted 2.6.37-gentoo-r2 #2 Intel S5000VSA/S5000VSA
Mar 15 17:58:04 mainhost kernel: [65866.980085] RIP: 0010:[<ffffffff8100efbe>] [<ffffffff8100efbe>] 0xffffffff8100efbe
Mar 15 17:58:04 mainhost kernel: [65866.980085] RSP: 0018:ffff880738767a48 EFLAGS: 00010246
Mar 15 17:58:04 mainhost kernel: [65866.980085] RAX: 0000000000000000 RBX: fffffffffffff001 RCX: ffff88081cbeb948
Mar 15 17:58:04 mainhost kernel: [65866.980085] RDX: 0000000000000022 RSI: fffffffffffff001 RDI: ffff88081cbeb000
Mar 15 17:58:04 mainhost kernel: [65866.980085] RBP: 0000000000000001 R08: 00000000000fee01 R09: 0000000000000022
Mar 15 17:58:04 mainhost kernel: [65866.980085] R10: 0000008000000000 R11: ffffea0000000000 R12: ffff880818d83490
Mar 15 17:58:04 mainhost kernel: [65866.980085] R13: 00000000155e5000 R14: 0000000000000000 R15: 0000000000000100
Mar 15 17:58:04 mainhost kernel: [65866.980085] FS: 00007f5f25e4e700(0000) GS:ffff88009f680000(0000) knlGS:fffff80001175000
Mar 15 17:58:04 mainhost kernel: [65866.980085] CS: 0010 DS: 002b ES: 002b CR0: 000000008005003b
Mar 15 17:58:04 mainhost kernel: [65866.980085] CR2: 0000000000000000 CR3: 0000000806be9000 CR4: 00000000000426e0
Mar 15 17:58:04 mainhost kernel: [65866.980085] DR0: 0000000000000045 DR1: 0000000000000000 DR2: 0000000000000000
Mar 15 17:58:04 mainhost kernel: [65866.980085] DR3: 0000000000000005 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Mar 15 17:58:04 mainhost kernel: [65866.980085] Process qemu-system-x86 (pid: 3369, threadinfo ffff880738766000, task ffff8808203ac360)
Mar 15 17:58:04 mainhost kernel: [65866.980085] Stack:
Mar 15 17:58:04 mainhost kernel: [65866.980085] 0000000000000000 ffff8806a30f3ff8 ffff880753980000 ffffffff8100f06f
Mar 15 17:58:04 mainhost kernel: [65866.980085] 0000000000000ff8 ffff8807705d6b40 0000000000000ff8 ffffffff810123f0
Mar 15 17:58:04 mainhost kernel: [65866.980085] 0000000000000000 0000000000000000 0000000000000000 0000000000000000
Mar 15 17:58:04 mainhost kernel: [65866.980085] Call Trace:
Mar 15 17:58:04 mainhost kernel: [65866.980085] [<ffffffff8100f06f>] ? 0xffffffff8100f06f
Mar 15 17:58:04 mainhost kernel: [65866.980085] [<ffffffff810123f0>] ? 0xffffffff810123f0
Mar 15 17:58:04 mainhost kernel: [65866.980085] [<ffffffff8100f744>] ? 0xffffffff8100f744
Mar 15 17:58:04 mainhost kernel: [65866.980085] [<ffffffff8100ffaf>] ? 0xffffffff8100ffaf
Mar 15 17:58:04 mainhost kernel: [65866.980085] [<ffffffff810011f1>] ? 0xffffffff810011f1
Mar 15 17:58:04 mainhost kernel: [65866.980085] [<ffffffff810142fc>] ? 0xffffffff810142fc
Mar 15 17:58:04 mainhost kernel: [65866.980085] [<ffffffff8100834d>] ? 0xffffffff8100834d
Mar 15 17:58:04 mainhost kernel: [65866.980085] [<ffffffff81011af6>] ? 0xffffffff81011af6
Mar 15 17:58:04 mainhost kernel: [65866.980085] [<ffffffff8100c5a7>] ? 0xffffffff8100c5a7
Mar 15 17:58:04 mainhost kernel: [65866.980085] [<ffffffff81003a85>] ? 0xffffffff81003a85
Mar 15 17:58:04 mainhost kernel: [65866.980085] [<ffffffff810e19b0>] ? 0xffffffff810e19b0
Mar 15 17:58:04 mainhost kernel: [65866.980085] [<ffffffff81078cd8>] ? 0xffffffff81078cd8
Mar 15 17:58:04 mainhost kernel: [65866.980085] [<ffffffff810e1a39>] ? 0xffffffff810e1a39
Mar 15 17:58:04 mainhost kernel: [65866.980085] [<ffffffff810267fb>] ? 0xffffffff810267fb
Mar 15 17:58:04 mainhost kernel: [65866.980085] Code: 8b 47 50 8d 50 01 85 c0 89 57 50 75 07 41 58 e9 32 ff ff ff 5f c3 55 89 d5 53 48 89 f3 48 83 ec 08 e8 d
Mar 15 17:58:04 mainhost kernel: [65866.980085] RIP [<ffffffff8100efbe>] 0xffffffff8100efbe
Mar 15 17:58:04 mainhost kernel: [65866.980085] RSP <ffff880738767a48>
Mar 15 17:58:04 mainhost kernel: [65866.980085] CR2: 0000000000000000
Mar 15 17:58:04 mainhost kernel: [65866.990753] ---[ end trace d147f74976c2654d ]---

Linux mainhost 2.6.37-gentoo-r2 #2 SMP Mon Mar 14 22:53:20 MSK 2011 x86_64 Intel(R) Xeon(R) CPU E5405 @ 2.00GHz GenuineIntel GNU/Linux

app-emulation/qemu-kvm-0.13.0-r2
app-emulation/libvirt-0.8.8-r1

mainhost log # ps ax|grep qemu
 2957 ? Sl 12:28 /usr/bin/qemu-system-x86_64 --enable-kvm -S -M pc-0.13 -enable-kvm -m 512 -smp 1,sockets=1,cores=1,threads=1 -name dc1 -uuid f090bfc9-1cab-e4e9-51ea-80c9cc775bff -nodefconfig -nodefaults -chardev socket,id=charmonitor,path=/var/lib/libvirt/qemu/dc1.monitor,server,nowait -mon chardev=charmonitor,id=monitor,mode=control -rtc base=localtime -boot order=c,menu=off -drive file=/dev/vm-storage/dc1,if=none,id=drive-ide0-0-0,format=raw -device ide-drive,bus=ide.0,unit=0,drive=drive-ide0-0-0,id=ide0-0-0 -drive if=none,media=cdrom,id=drive-ide0-1-0,readonly=on,format=raw -device ide-drive,bus=ide.1,unit=0,drive=drive-ide0-1-0,id=ide0-1-0 -netdev tap,fd=13,id=hostnet0 -device rtl8139,netdev=hostnet0,id=net0,mac=52:54:00:7e:a1:a7,bus=pci.0,addr=0x4 -usb -device usb-tablet,id=input0 -vnc 0.0.0.0:0,password -vga cirrus -device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x3
 2982 ? Rl 10:34 /usr/bin/qemu-system-x86_64 --enable-kvm -S -M pc-0.13 -enable-kvm -m 1024 -smp 1,sockets=1,cores=1,threads=1 -name transarchive -uuid b96a3481-1ad6-9329-387e-a1504a17d0ee -nodefconfig -nodefaults -chardev socket,id=charmonitor,path=/var/lib/libvirt/qemu/transarchive.monitor,server,nowait -mon chardev=charmonitor,id=monitor,mode=control -rtc base=localtime -boot order=c,menu=off -drive file=/dev/vm-storage/transarchive,if=none,id=drive-ide0-0-0,format=raw -device ide-drive,bus=ide.0,unit=0,drive=drive-ide0-0-0,id=ide0-0-0 -drive if=none,media=cdrom,id=drive-ide0-1-0,readonly=on,format=raw -device ide-drive,bus=ide.1,unit=0,drive=drive-ide0-1-0,id=ide0-1-0 -netdev tap,fd=13,id=hostnet0,vhost=on,vhostfd=17 -device virtio-net-pci,netdev=hostnet0,id=net0,mac=52:54:00:a9:f8:06,bus=pci.0,addr=0x3 -usb -device usb-tablet,id=input0 -vnc 0.0.0.0:3,password -vga std -device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x4

Revision history for this message
Stefan Hajnoczi (stefanha) wrote : Re: [Qemu-devel] [Bug 735752] [NEW] qemu squeeze crashes "BUG: unable to handle kernel NULL pointer dereference at (null)"

On Tue, Mar 15, 2011 at 9:28 PM, Aidar Kamalov
<email address hidden> wrote:
> Mar 15 17:58:04 mainhost kernel: [65866.980085] Call Trace:
> Mar 15 17:58:04 mainhost kernel: [65866.980085]  [<ffffffff8100f06f>] ? 0xffffffff8100f06f
> Mar 15 17:58:04 mainhost kernel: [65866.980085]  [<ffffffff810123f0>] ? 0xffffffff810123f0
> Mar 15 17:58:04 mainhost kernel: [65866.980085]  [<ffffffff8100f744>] ? 0xffffffff8100f744
> Mar 15 17:58:04 mainhost kernel: [65866.980085]  [<ffffffff8100ffaf>] ? 0xffffffff8100ffaf

Please compile a kernel with debug information so that the call trace
has function names. The option is CONFIG_DEBUG_INFO and can be
reached via "Kernel hacking | Compile the kernel with debug info" in
make menuconfig.

Stefan

Revision history for this message
Aidar Kamalov (aidar-kamalov) wrote :

I have compiled with CONFIG_DEBUG_INFO, but core is not creating, i think i do all right:

mainhost log # zcat /proc/config.gz |grep CONFIG_DEBUG_INFO
CONFIG_DEBUG_INFO=y
# CONFIG_DEBUG_INFO_REDUCED is not set

mainhost log # cat /etc/security/limits.conf | grep core
# - core - limits the core file size (KB)
* soft core unlimited
mainhost log # ulimit -S -s
8192
mainhost log # ulimit -H -s
unlimited
mainhost log # ulimit -S -s 100000
mainhost log # ulimit -S -s
100000
mainhost log # cat /proc/sys/kernel/core_pattern
core
mainhost log # cat /proc/sys/kernel/core_uses_pid
0
mainhost log # ulimit -c
unlimited
mainhost log # yes & kill -ABRT `jobs -p`
[1] 3527
mainhost log #
[1]+ Aborted (core dumped) yes

Revision history for this message
Stefan Hajnoczi (stefanha) wrote : Re: [Qemu-devel] [Bug 735752] Re: qemu squeeze crashes "BUG: unable to handle kernel NULL pointer dereference at (null)"

On Wed, Mar 16, 2011 at 8:27 PM, Aidar Kamalov
<email address hidden> wrote:
> I have compiled with CONFIG_DEBUG_INFO, but core is not creating, i
> think i do all right:
>
> mainhost log # zcat /proc/config.gz |grep CONFIG_DEBUG_INFO
> CONFIG_DEBUG_INFO=y
> # CONFIG_DEBUG_INFO_REDUCED is not set

This looks fine

> mainhost log # cat /etc/security/limits.conf | grep core
> #        - core - limits the core file size (KB)
> *               soft    core            unlimited
> mainhost log # ulimit -S -s
> 8192
> mainhost log # ulimit -H -s
> unlimited
> mainhost log #  ulimit -S -s 100000
> mainhost log # ulimit -S -s
> 100000
> mainhost log # cat /proc/sys/kernel/core_pattern
> core
> mainhost log # cat /proc/sys/kernel/core_uses_pid
> 0
> mainhost log # ulimit -c
> unlimited
> mainhost log # yes & kill -ABRT `jobs -p`
> [1] 3527
> mainhost log #
> [1]+  Aborted         (core dumped) yes

These are core dump options for userspace processes. You originally
posted a kernel backtrace and that is not affected by core dump
options.

When the issue is triggered again you should see function names in the
"Call Trace:" section of the BUG output.

Stefan

Revision history for this message
Aidar Kamalov (aidar-kamalov) wrote :
Download full text (7.5 KiB)

Issue is trigered, but there are nothing changes, i will try to get core

Mar 16 21:50:29 mainhost kernel: [28123.087654] BUG: unable to handle kernel NULL pointer dereference at (null)
Mar 16 21:50:29 mainhost kernel: [28123.088106] IP: [<ffffffff8100fe7d>] 0xffffffff8100fe7d
Mar 16 21:50:29 mainhost kernel: [28123.088331] PGD 81a4f5067 PUD 81d08b067 PMD 0.
Mar 16 21:50:29 mainhost kernel: [28123.088555] Oops: 0000 [#1] SMP.
Mar 16 21:50:29 mainhost kernel: [28123.088776] last sysfs file: /sys/devices/system/cpu/cpu3/topology/thread_siblings
Mar 16 21:50:29 mainhost kernel: [28123.089218] CPU 2.
Mar 16 21:50:29 mainhost kernel: [28123.089223] Modules linked in: i5k_amb coretemp nfs lockd nfs_acl auth_rpcgss sunrpc ebtable_nat ebtables bridge stp llc
Mar 16 21:50:29 mainhost kernel: [28123.090526].
Mar 16 21:50:29 mainhost kernel: [28123.090526] Pid: 2900, comm: qemu-system-x86 Not tainted 2.6.38-gentoo #2 Intel S5000VSA/S5000VSA
Mar 16 21:50:29 mainhost kernel: [28123.090526] RIP: 0010:[<ffffffff8100fe7d>] [<ffffffff8100fe7d>] 0xffffffff8100fe7d
Mar 16 21:50:29 mainhost kernel: [28123.090526] RSP: 0018:ffff88081b9b5a68 EFLAGS: 00010246
Mar 16 21:50:29 mainhost kernel: [28123.090526] RAX: 0000000000000000 RBX: fffffffffffff001 RCX: 00000000000fee01
Mar 16 21:50:29 mainhost kernel: [28123.090526] RDX: 0000000000000022 RSI: fffffffffffff001 RDI: ffff8807be235000
Mar 16 21:50:29 mainhost kernel: [28123.090526] RBP: 0000000000000001 R08: 0000000000000022 R09: 0000000000000040
Mar 16 21:50:29 mainhost kernel: [28123.090526] R10: ffff88081b9b5be8 R11: 0000000000000000 R12: 0000000000000ff8
Mar 16 21:50:29 mainhost kernel: [28123.090526] R13: 0000000001cbf000 R14: 0000000000000013 R15: 0000000000000000
Mar 16 21:50:29 mainhost kernel: [28123.090526] FS: 00007fd821b70700(0000) GS:ffff88009f700000(0000) knlGS:000007fffffdd000
Mar 16 21:50:29 mainhost kernel: [28123.090526] CS: 0010 DS: 002b ES: 002b CR0: 000000008005003b
Mar 16 21:50:29 mainhost kernel: [28123.090526] CR2: 0000000000000000 CR3: 000000081d245000 CR4: 00000000000426e0
Mar 16 21:50:29 mainhost kernel: [28123.090526] DR0: 0000000000000001 DR1: 0000000000000002 DR2: 000000...

Read more...

Revision history for this message
Aidar Kamalov (aidar-kamalov) wrote :

well, i has downgraded to 2.6.33 and system stable for 3 days yet..
system is halted on 2.6.36, 2,6.37 and 2.6.38 kernels

mainhost ~ # uptime
 12:38:30 up 3 days, 2:56, 2 users, load average: 0.00, 0.00, 0.04
mainhost ~ # uname -a
Linux mainhost 2.6.33-gentoo-r1 #4 SMP Tue Aug 24 09:53:21 MSD 2010 x86_64 Intel(R) Xeon(R) CPU E5405 @ 2.00GHz GenuineIntel GNU/Linux

Revision history for this message
Thomas Huth (th-huth) wrote :

Sounds like this was a kernel bug ... you should report those to the right kernel mailing lists instead. Anyway, closing this ticket now since there hasn't been any activity since a long time.

Changed in qemu:
status: New → Invalid
To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.