kernel BUG at /build/buildd/linux-2.6.32/kernel/cred.c:168

Bug #606892 reported by Anthony Uk / dataway GmbH
This bug affects 3 people
Affects          Status      Importance  Assigned to  Milestone
Linux            Expired     High
linux (Ubuntu)   Incomplete  Undecided   Unassigned

Bug Description

The server locked up at the console (no response even to sysrq+space). Shortly beforehand the serial console displayed the message:

Jul 17 01:15:07 byron kernel: [372978.128744] ------------[ cut here ]------------
Jul 17 01:15:07 byron kernel: [372978.144685] kernel BUG at /build/buildd/linux-2.6.32/kernel/cred.c:168!
Jul 17 01:15:07 byron kernel: [372978.162519] invalid opcode: 0000 [#4] SMP
Jul 17 01:15:07 byron kernel: [372978.177710] last sysfs file: /sys/devices/system/cpu/cpu9/cpufreq/cpuinfo_min_freq
Jul 17 01:15:07 byron kernel: [372978.206877] CPU 13
Jul 17 01:15:07 byron kernel: [372978.219599] Modules linked in: ipt_MASQUERADE iptable_nat nf_nat nf_conntrack_ipv4 nf_defrag_ipv4 xt_state nf_conntrack ipt_REJECT xt_tcpudp iptable_filter ip_tables x_tables kvm_intel kvm 8021q garp nfsd exportfs nfs lockd nfs_acl auth_rpcgss sunrpc fbcon tileblit font bridge bitblit stp softcursor vga16fb vgastate ipmi_si ipmi_devintf ioatdma ipmi_msghandler psmouse serio_raw joydev usbhid hid 3w_9xxx igb dca
Jul 17 01:15:07 byron kernel: [372978.238027] Pid: 25258, comm: munin-node Tainted: G D 2.6.32-23-server #37-Ubuntu X8DTU
Jul 17 01:15:07 byron kernel: [372978.238030] RIP: 0010:[<ffffffff8108b48e>] [<ffffffff8108b48e>] __put_cred+0x3e/0x50
Jul 17 01:15:07 byron kernel: [372978.238040] RSP: 0018:ffff8802afb27ed8 EFLAGS: 00010202
Jul 17 01:15:07 byron kernel: [372978.238043] RAX: 0000000000000001 RBX: ffff880c1e99e6c0 RCX: 00000000ffffffff
Jul 17 01:15:07 byron kernel: [372978.238045] RDX: 00000000ffffffff RSI: ffff880c1e99e6c0 RDI: ffff880c1e99e6c0
Jul 17 01:15:07 byron kernel: [372978.238051] RBP: ffff8802afb27ed8 R08: 0000000000000000 R09: ffff880b5de51680
Jul 17 01:15:07 byron kernel: [372978.238054] R10: 00000000000000c0 R11: 0000000000000000 R12: ffff880c1e99f8c0
Jul 17 01:15:07 byron kernel: [372978.238056] R13: ffff880253f20000 R14: 0000000000000001 R15: 0000000000000001
Jul 17 01:15:07 byron kernel: [372978.238059] FS: 00007f8d72446700(0000) GS:ffff8806554a0000(0000) knlGS:0000000000000000
Jul 17 01:15:07 byron kernel: [372978.238062] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Jul 17 01:15:07 byron kernel: [372978.238064] CR2: 00007f8d71545240 CR3: 0000000253fdc000 CR4: 00000000000026e0
Jul 17 01:15:07 byron kernel: [372978.238067] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
Jul 17 01:15:07 byron kernel: [372978.238069] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Jul 17 01:15:07 byron kernel: [372978.238072] Process munin-node (pid: 25258, threadinfo ffff8802afb26000, task ffff880253f20000)
Jul 17 01:15:07 byron kernel: [372978.238074] Stack:
Jul 17 01:15:07 byron kernel: [372978.238075] ffff8802afb27f08 ffffffff8108bb01 ffff8802afb27f08 ffff880c1e99f8c0
Jul 17 01:15:07 byron kernel: [372978.238078] <0> ffff880c1e99ee40 0000000000000001 ffff8802afb27f38 ffffffff8108cbb1
Jul 17 01:15:07 byron kernel: [372978.238082] <0> 0000000000000001 0000000001a2a960 0000000001a2a960 ffff880c1e99ee40
Jul 17 01:15:07 byron kernel: [372978.238085] Call Trace:
Jul 17 01:15:07 byron kernel: [372978.238090] [<ffffffff8108bb01>] commit_creds+0x121/0x1e0
Jul 17 01:15:07 byron kernel: [372978.238094] [<ffffffff8108cbb1>] set_current_groups+0x41/0x70
Jul 17 01:15:07 byron kernel: [372978.238098] [<ffffffff8108cdd9>] sys_setgroups+0x109/0x130
Jul 17 01:15:07 byron kernel: [372978.238106] [<ffffffff810131b2>] system_call_fastpath+0x16/0x1b
Jul 17 01:15:07 byron kernel: [372978.238108] Code: 8b 04 25 c0 cb 00 00 48 3b b8 50 04 00 00 74 23 48 3b b8 48 04 00 00 74 16 48 83 ef 80 48 c7 c6 f0 b4 08 81 e8 04 e8 03 00 c9 c3 <0f> 0b eb fe 0f 0b eb fe 0f 0b eb fe 66 0f 1f 44 00 00 55 48 89
Jul 17 01:15:07 byron kernel: [372978.238129] RIP [<ffffffff8108b48e>] __put_cred+0x3e/0x50
Jul 17 01:15:07 byron kernel: [372978.238132] RSP <ffff8802afb27ed8>
Jul 17 01:15:07 byron kernel: [372978.245944] ---[ end trace 10ea347dd48afe27 ]---
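
For comparing this trace with the later ones in this report, a small script can pull out the process name, the faulting symbol, and the call trace. The following is only a rough sketch: the function name `summarize_oops` and the regular expressions are illustrative assumptions, not part of any existing tool, and they only handle plain `symbol+0xoff/0xlen` frames like the ones shown here.

```python
import re

# Faulting symbol, taken from either "RIP: 0010:[<addr>] [<addr>] sym+0x.." or "RIP [<addr>] sym+0x..".
RIP_RE = re.compile(r"RIP\b.*\]\s+(\w+)\+0x[0-9a-f]+/0x[0-9a-f]+")
# One call-trace frame, e.g. "[<ffffffff8108bb01>] commit_creds+0x121/0x1e0".
FRAME_RE = re.compile(r"\[<[0-9a-f]+>\]\s+(\w+)\+0x[0-9a-f]+/0x[0-9a-f]+\s*$")
# Process line, e.g. "Pid: 25258, comm: munin-node Tainted: ...".
COMM_RE = re.compile(r"Pid: (\d+), comm: (\S+)")

# Excerpt copied from the oops above (syslog prefixes trimmed).
SAMPLE = """\
[372978.238027] Pid: 25258, comm: munin-node Tainted: G D 2.6.32-23-server #37-Ubuntu X8DTU
[372978.238030] RIP: 0010:[<ffffffff8108b48e>] [<ffffffff8108b48e>] __put_cred+0x3e/0x50
[372978.238085] Call Trace:
[372978.238090] [<ffffffff8108bb01>] commit_creds+0x121/0x1e0
[372978.238094] [<ffffffff8108cbb1>] set_current_groups+0x41/0x70
[372978.238098] [<ffffffff8108cdd9>] sys_setgroups+0x109/0x130
[372978.238106] [<ffffffff810131b2>] system_call_fastpath+0x16/0x1b
[372978.238108] Code: 8b 04 25 c0 cb 00 00 48 3b b8 50 04 00 00 74 23
[372978.238129] RIP [<ffffffff8108b48e>] __put_cred+0x3e/0x50
"""


def summarize_oops(text):
    """Return the process name, faulting symbol and call trace of one oops."""
    summary = {"comm": None, "rip": None, "trace": []}
    in_trace = False
    for line in text.splitlines():
        match = COMM_RE.search(line)
        if match:
            summary["comm"] = match.group(2)
        if "Call Trace:" in line:
            in_trace = True
            continue
        if in_trace:
            match = FRAME_RE.search(line)
            if match:
                summary["trace"].append(match.group(1))
            else:
                # The first non-frame line (here "Code: ...") ends the trace.
                in_trace = False
        match = RIP_RE.search(line)
        if match:
            summary["rip"] = match.group(1)
    return summary


if __name__ == "__main__":
    print(summarize_oops(SAMPLE))
```

Run against the traces in this report, a summary like this makes it easier to see that each one faults in `__put_cred` and is reached through `commit_creds`.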

Kernel version: Linux byron 2.6.32-23-server #37-Ubuntu SMP Fri Jun 11 09:11:11 UTC 2010 x86_64 GNU/Linux

The same bug appears to be known to Red Hat, who propose a fix:

https://bugzilla.redhat.com/show_bug.cgi?id=591015
---
AlsaDevices: Error: command ['ls', '-l', '/dev/snd/'] failed with exit code 2: ls: cannot access /dev/snd/: No such file or directory
AplayDevices: Error: [Errno 2] No such file or directory
Architecture: amd64
ArecordDevices: Error: [Errno 2] No such file or directory
DistroRelease: Ubuntu 10.04
Frequency: Once every few weeks.
InstallationMedia: Ubuntu-Server 10.04 LTS "Lucid Lynx" - Release amd64 (20100427)
MachineType: Supermicro X8DTU
Package: linux (not installed)
PciMultimedia:

ProcCmdLine: BOOT_IMAGE=/boot/vmlinuz-2.6.32-24-server root=UUID=22353945-73e0-4598-8621-615ed9e44f6e ro console=tty0 console=ttyS1,115200n8
ProcEnviron:
 PATH=(custom, no user)
 LANG=en_US.UTF-8
 SHELL=/bin/bash
ProcVersionSignature: Ubuntu 2.6.32-24.41-server 2.6.32.15+drm33.5
Regression: No
Reproducible: No
Tags: lucid kconfig needs-upstream-testing
Uname: Linux 2.6.32-24-server x86_64
UserGroups:

dmi.bios.date: 03/01/2010
dmi.bios.vendor: American Megatrends Inc.
dmi.bios.version: 1.1a
dmi.board.asset.tag: To Be Filled By O.E.M.
dmi.board.name: X8DTU
dmi.board.vendor: Supermicro
dmi.board.version: 1234567890
dmi.chassis.asset.tag: To Be Filled By O.E.M.
dmi.chassis.type: 17
dmi.chassis.vendor: Supermicro
dmi.chassis.version: 1234567890
dmi.modalias: dmi:bvnAmericanMegatrendsInc.:bvr1.1a:bd03/01/2010:svnSupermicro:pnX8DTU:pvr1234567890:rvnSupermicro:rnX8DTU:rvr1234567890:cvnSupermicro:ct17:cvr1234567890:
dmi.product.name: X8DTU
dmi.product.version: 1234567890
dmi.sys.vendor: Supermicro

Revision history for this message
Jeremy Foshee (jeremyfoshee) wrote :

Hi Anthony,

Please be sure to confirm this issue exists with the latest development release of Ubuntu. ISO CD images are available from http://cdimage.ubuntu.com/daily/current/ . If the issue remains, please run the following command from a Terminal (Applications->Accessories->Terminal). It will automatically gather and attach updated debug information to this report.

apport-collect -p linux 606892

Also, if you could test the latest upstream kernel available that would be great. It will allow additional upstream developers to examine the issue. Refer to https://wiki.ubuntu.com/KernelMainlineBuilds . Once you've tested the upstream kernel, please remove the 'needs-upstream-testing' tag. This can be done by clicking on the yellow pencil icon next to the tag located at the bottom of the bug description and deleting the 'needs-upstream-testing' text. Please let us know your results.

Thanks in advance.

    [This is an automated message. Apologies if it has reached you inappropriately; please just reply to this message indicating so.]

tags: added: needs-kernel-logs
tags: added: needs-upstream-testing
tags: added: kj-triage
Changed in linux (Ubuntu):
status: New → Incomplete
Revision history for this message
Anthony Uk / dataway GmbH (dataway) wrote :

Dear Jeremy

Unfortunately I do not know how to replicate this bug, so I could not really test properly. The server is also in production, so it would be a breach of contract to run a development kernel on it.

If the bug does reappear I will be sure to run apport-collect. However, does that command still work properly after a hard reboot?

Revision history for this message
Brian J. Murrell (brian-interlinx) wrote :

I've been seeing this sort of thing on 2.6.32-24 also. Filed an upstream bug about it.

Revision history for this message
Anthony Uk / dataway GmbH (dataway) wrote :

I upgraded the kernel to the latest version:

uname -a
Linux byron 2.6.32-24-server #41-Ubuntu SMP Thu Aug 19 02:47:08 UTC 2010 x86_64 GNU/Linux

and within a few hours the issue has reappeared. I attach a log of the console output. Again the kernel locked up, unable even to respond to Magic SysRq.

I notice that, as last time, the process munin-node is implicated. I will disable munin-node for now, but clearly a user-mode process should not be able to lock up the kernel.

[20265.245221] ------------[ cut here ]------------
[20265.260159] kernel BUG at /build/buildd/linux-2.6.32/kernel/cred.c:168!
[20265.276697] invalid opcode: 0000 [#1] SMP
[20265.290483] last sysfs file: /sys/devices/system/cpu/cpu15/cache/index2/shared_cpu_map
[20265.317858] CPU 9
[20265.330789] Modules linked in: ip6table_filter ip6_tables ipt_MASQUERADE iptable_nat nf_nat nf_conntrack_ipv4 nf_defrag_ipv4 xt_state nf_conntrack ipt_REJECT xt_tcpudp iptable_filter ip_tables x_tables nfsd kvm_intel lockd nfs_acl kvm auth_rpcgss sunrpc exportfs 8021q garp fbcon tileblit bridge font bitblit stp softcursor vga16fb vgastate ipmi_si psmouse ipmi_devintf serio_raw ioatdma ipmi_msghandler joydev usbhid hid igb 3w_9xxx dca
[20265.372725] Pid: 30950, comm: munin-node Not tainted 2.6.32-24-server #41-Ubuntu X8DTU
[20265.372727] RIP: 0010:[<ffffffff8108b48e>] [<ffffffff8108b48e>] __put_cred+0x3e/0x50
[20265.372736] RSP: 0018:ffff8809a3537ed8 EFLAGS: 00010202
[20265.372738] RAX: 0000000000000001 RBX: ffff880622f68fc0 RCX: 00000000ffffffff
[20265.372740] RDX: 00000000ffffffff RSI: ffff880622f68fc0 RDI: ffff880622f68fc0
[20265.372742] RBP: ffff8809a3537ed8 R08: 0000000000000000 R09: ffff880c225d78e0
[20265.372745] R10: 00000000000000c0 R11: 0000000000000000 R12: ffff880622f683c0
[20265.372747] R13: ffff880c21d496f0 R14: 0000000000000001 R15: 0000000000000001
[20265.372750] FS: 00007f5012e9a700(0000) GS:ffff880016ea0000(0000) knlGS:0000000000000000
[20265.372752] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[20265.372755] CR2: 00007f5011f99240 CR3: 00000009a3b31000 CR4: 00000000000026e0
[20265.372757] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000

Revision history for this message
Anthony Uk / dataway GmbH (dataway) wrote :

I notice that there is an upstream patch for this precise issue which I suggest be put into the Ubuntu kernel.

http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commitdiff_plain;h=de09a9771a5346029f4d11e4ac886be7f9bfdd75

I am therefore removing the needs-upstream-testing flag in the hope that this issue can now be dealt with by Ubuntu.

tags: removed: needs-kernel-logs needs-upstream-testing
Revision history for this message
Anthony Uk / dataway GmbH (dataway) wrote : BootDmesg.txt

apport information

tags: added: apport-collected
description: updated
Revision history for this message
Anthony Uk / dataway GmbH (dataway) wrote : CurrentDmesg.txt

apport information

Revision history for this message
Anthony Uk / dataway GmbH (dataway) wrote : Lspci.txt

apport information

Revision history for this message
Anthony Uk / dataway GmbH (dataway) wrote : Lsusb.txt

apport information

Revision history for this message
Anthony Uk / dataway GmbH (dataway) wrote : ProcCpuinfo.txt

apport information

Revision history for this message
Anthony Uk / dataway GmbH (dataway) wrote : ProcInterrupts.txt

apport information

Revision history for this message
Anthony Uk / dataway GmbH (dataway) wrote : ProcModules.txt

apport information

Revision history for this message
Anthony Uk / dataway GmbH (dataway) wrote : UdevDb.txt

apport information

Revision history for this message
Anthony Uk / dataway GmbH (dataway) wrote : UdevLog.txt

apport information

Revision history for this message
Andy Hauser (andy-ubuntu-bugzilla) wrote :

Same here: it happened once on one of seven compute nodes, after more than a week of computing with that particular kernel. That certainly shows it's hard to reproduce.

Revision history for this message
petter wahlman (petter-wahlman) wrote :

I am affected by this bug on a daily basis with the most recent Ubuntu kernel.
Running sudo seems to trigger the problem.

2.6.32-26-server #48-Ubuntu SMP:

[ 294.112155] ------------[ cut here ]------------
[ 294.112159] kernel BUG at /build/buildd/linux-2.6.32/kernel/cred.c:168!
[ 294.112161] invalid opcode: 0000 [#1] SMP
[ 294.112163] last sysfs file: /sys/devices/pci0000:00/0000:00:1c.1/0000:03:00.0/net/wlan0/operstate
[ 294.112165] CPU 0
[ 294.112166] Modules linked in: binfmt_misc ppdev parport_pc ipt_MASQUERADE iptable_nat nf_nat nf_conntrack_ipv4 nf_defrag_ipv4 xt_state nf_conntrack ipt_REJECT xt_tcpudp iptable_filter ip_tables x_tables bridge stp kvm_intel kvm nls_utf8 cifs nfsd exportfs nfs lockd nfs_acl auth_rpcgss fbcon tileblit font bitblit softcursor vga16fb vgastate snd_hda_codec_conexant thinkpad_acpi sunrpc arc4 snd_hda_intel snd_hda_codec snd_hwdep snd_pcm_oss snd_mixer_oss snd_pcm snd_seq_dummy pcmcia snd_seq_oss snd_seq_midi iwlagn snd_rawmidi radeon iwlcore ttm snd_seq_midi_event drm_kms_helper snd_seq mac80211 snd_timer snd_seq_device uvcvideo yenta_socket ricoh_mmc videodev v4l1_compat v4l2_compat_ioctl32 rsrc_nonstatic pcmcia_core snd psmouse serio_raw cfg80211 sdhci_pci sdhci led_class nvram tpm_tis tpm tpm_bios drm i2c_algo_bit video output soundcore snd_page_alloc intel_agp lp parport ohci1394 dm_raid45 usb_storage xor ieee1394 e1000e ahci
[ 294.112216] Pid: 15091, comm: sudo Not tainted 2.6.32-26-server #48-Ubuntu 40613VG
[ 294.112218] RIP: 0010:[<ffffffff8108a64e>] [<ffffffff8108a64e>] __put_cred+0x3e/0x50
[ 294.112225] RSP: 0018:ffff8800bdaa9f18 EFLAGS: 00010202
[ 294.112226] RAX: 0000000000000001 RBX: ffff880102bedb40 RCX: 00000000ffffffff
[ 294.112228] RDX: 00000000ffffffff RSI: ffff880102bedb40 RDI: ffff880102bedb40
[ 294.112229] RBP: ffff8800bdaa9f18 R08: 0000000000000000 R09: 0000000000000000
[ 294.112231] R10: 0000000000000000 R11: 0000000000000246 R12: ffff880102bed0c0
[ 294.112233] R13: ffff8801183a96e0 R14: 0000000000000000 R15: 0000000000000001
[ 294.112235] FS: 00002b34d4f18c00(0000) GS:ffff880005600000(0000) knlGS:0000000000000000
[ 294.112236] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 294.112238] CR2: 00002b34d49a5355 CR3: 00000000ba0cc000 CR4: 00000000000426f0
[ 294.112240] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 294.112241] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[ 294.112243] Process sudo (pid: 15091, threadinfo ffff8800bdaa8000, task ffff8801183a96e0)
[ 294.112244] Stack:
[ 294.112245] ffff8800bdaa9f48 ffffffff8108acc1 ffff8800bdaa9f38 ffff880102bed0c0
[ 294.112248] <0> 0000000000000000 ffff880102bedb40 ffff8800bdaa9f78 ffffffff8107cd3a
[ 294.112251] <0> 0000000000000000 000000000041e992 00000000016bfd00 0000000000000001
[ 294.112254] Call Trace:
[ 294.112257] [<ffffffff8108acc1>] commit_creds+0x121/0x1e0
[ 294.112260] [<ffffffff8107cd3a>] sys_setuid+0xda/0x100
[ 294.112264] [<ffffffff810121b2>] system_call_fastpath+0x16/0x1b
[ 294.112265] Code: 8b 04 25 c0 cb 00 00 48 3b b8 48 04 00 00 74 23 48 3b b8 40 04 00 00 74 16 48 83 ef 80 48 c7 c6 b0 a6 08 81 e8 74 e9 03 00 c9 c3 <0f> 0b eb fe 0f 0b eb fe...


Revision history for this message
petter wahlman (petter-wahlman) wrote :

while true; do
   sudo ls;
done

immediately causes an Oops.

Regards,

-p.

Changed in linux:
status: Unknown → Confirmed
Revision history for this message
Jamie Jamison (jamie-jamison) wrote :

I'm seeing the same thing that Petter Wahlman reports. Here's the output from one of my servers:

Jan 19 16:23:13 kernel: [640271.778354] ------------[ cut here ]------------
Jan 19 16:23:13 kernel: [640271.783516] kernel BUG at /build/buildd/linux-2.6.32/kernel/cred.c:168!
Jan 19 16:23:13 kernel: [640271.788785] invalid opcode: 0000 [#1] SMP
Jan 19 16:23:13 kernel: [640271.794625] last sysfs file: /sys/devices/pci0000:00/0000:00:1c.4/0000:02:00.1/irq
Jan 19 16:23:13 kernel: [640271.806738] CPU 2
Jan 19 16:23:13 kernel: [640271.813323] Modules linked in: nfp kvm_intel kvm bridge stp fbcon lp tileblit font bitblit softcursor vga16fb vgastate bnx2 power_meter parport dell_wmi shpchp dcdbas xfs exportfs mptsas mptscsih mptbase scsi_transport_sas
Jan 19 16:23:13 kernel: [640271.839128] Pid: 22152, comm: sudo Not tainted 2.6.32-27-server #49-Ubuntu PowerEdge R310
Jan 19 16:23:13 kernel: [640271.857099] RIP: 0010:[<ffffffff8108a7ae>] [<ffffffff8108a7ae>] __put_cred+0x3e/0x50
Jan 19 16:23:13 kernel: [640271.877475] RSP: 0018:ffff8800b187df08 EFLAGS: 00010202
Jan 19 16:23:13 kernel: [640271.888184] RAX: 0000000000000001 RBX: ffff88010300f980 RCX: 00000000ffffffff
Jan 19 16:23:13 kernel: [640271.911071] RDX: 00000000ffffffff RSI: ffff88010300f980 RDI: ffff88010300f980
Jan 19 16:23:13 kernel: [640271.935858] RBP: ffff8800b187df08 R08: 0000000000000000 R09: ffff8800b1b60040
Jan 19 16:23:13 kernel: [640271.962432] R10: 00007ffc5d188f98 R11: 0000000000000246 R12: ffff88010300f5c0
Jan 19 16:23:13 kernel: [640271.989093] R13: ffff880232e044a0 R14: ffff88010300f980 R15: 0000000000000001
Jan 19 16:23:13 kernel: [640272.016052] FS: 00007ffc5d7b0700(0000) GS:ffff880008e40000(0000) knlGS:0000000000000000
Jan 19 16:23:13 kernel: [640272.043282] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Jan 19 16:23:13 kernel: [640272.057416] CR2: 00007ffc5d710000 CR3: 00000000a334a000 CR4: 00000000000026e0
Jan 19 16:23:13 kernel: [640272.085660] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
Jan 19 16:23:13 kernel: [640272.114818] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Jan 19 16:23:13 kernel: [640272.144811] Process sudo (pid: 22152, threadinfo ffff8800b187c000, task ffff880232e044a0)
Jan 19 16:23:13 kernel: [640272.176015] Stack:
Jan 19 16:23:13 kernel: [640272.191270] ffff8800b187df38 ffffffff8108ae21 ffff8800b187df38 ffff88010300f5c0
Jan 19 16:23:13 kernel: [640272.206914] <0> 0000000000000000 00000000ffffffff ffff8800b187df78 ffffffff8107c9e6
Jan 19 16:23:13 kernel: [640272.237830] <0> ffff8800b187df78 0000000000000004 0000000000000004 00007fff476ba6a0
Jan 19 16:23:13 kernel: [640272.283543] Call Trace:
Jan 19 16:23:13 kernel: [640272.298684] [<ffffffff8108ae21>] commit_creds+0x121/0x1e0
Jan 19 16:23:13 kernel: [640272.313701] [<ffffffff8107c9e6>] sys_setregid+0xe6/0x140
Jan 19 16:23:13 kernel: [640272.328061] [<ffffffff810121b2>] system_call_fastpath+0x16/0x1b
Jan 19 16:23:13 kernel: [640272.342596] Code: 8b 04 25 c0 cb 00 00 48 3b b8 48 04 00 00 74 23 48 3b b8 40 04 00 00 74 16 48 83 ef 80 48 c7 c6 10 a8 08 81 e8 a4 e8 03 00 c9 c3 <0f> 0b eb fe 0f 0b eb fe 0f 0...


Revision history for this message
Jamie Jamison (jamie-jamison) wrote :

I can also replicate this by running

while true; do
  sudo ls
done
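
For anyone trying to confirm this on a disposable test machine, a bounded version of the loop above may be more convenient than `while true`, since it stops after a fixed number of runs and reports how far it got. This is only a sketch: the helper name `hammer` and the iteration count are made up for illustration, and the default command is deliberately harmless. On an affected kernel, passing `sudo ls` is expected to behave like the shell loop, so it should not be run on a production host.

```python
import subprocess
import sys


def hammer(cmd, iterations):
    """Run cmd up to `iterations` times; return how many runs exited with status 0."""
    completed = 0
    for _ in range(iterations):
        result = subprocess.run(
            cmd, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL
        )
        if result.returncode != 0:
            # Stop at the first failing run so the count shows where it broke.
            break
        completed += 1
    return completed


if __name__ == "__main__":
    # Harmless default; pass the real command (e.g. "sudo ls") as arguments instead.
    cmd = sys.argv[1:] or [sys.executable, "-c", "pass"]
    print(hammer(cmd, 50), "runs completed")
```

If the machine locks up as described in this report, the script will of course never print its count, so running it with a serial console attached is the more useful setup.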

Changed in linux:
importance: Unknown → High
Changed in linux:
status: Confirmed → Expired