19a2:0710 Kernel BUG at ffffffffa01021e7 [verbose debug info unavailable]; RIP: 0010:[<ffffffffa01021e7>] [<ffffffffa01021e7>] be_xmit+0x937/0x990 [be2net]

Bug #1202334 reported by Chris Read on 2013-07-17
This bug affects 1 person
Affects: linux (Ubuntu)
Importance: High
Assigned to: Unassigned

Bug Description

I can reliably reproduce the following kernel panic by generating a reasonable network load:

[ 180.808508] ------------[ cut here ]------------
[ 180.813733] Kernel BUG at ffffffffa01021e7 [verbose debug info unavailable]
[ 180.821576] invalid opcode: 0000 [#1] SMP
[ 180.826322] Modules linked in: 8021q(F) garp(F) stp(F) mrp(F) llc(F) bnx2x libcrc32c(F) sfc mdio mtd be2net intel_powerclamp coretemp kvm_intel kvm crc32_pclmul(F) ghash_clmulni_intel(F) aesni_intel(F) aes_x86_64(F) lrw(F) gf128mul(F) glue_helper(F) ablk_helper(F) cryptd(F) gpio_ich i7core_edac edac_core microcode(F) lpc_ich serio_raw(F) mac_hid squashfs(F) aufs igb i2c_algo_bit dca bcache(F) ptp(F) ahci(F) libahci(F) hpsa pps_core(F)
[ 180.871949] CPU: 22 PID: 24227 Comm: beam.smp Tainted: GF I 3.10.0-3-generic #12-Ubuntu
[ 180.881832] Hardware name: HP ProLiant DL180 G6 , BIOS O20 09/01/2011
[ 180.889188] task: ffff880be98cddc0 ti: ffff880be4e06000 task.ti: ffff880be4e06000
[ 180.897615] RIP: 0010:[<ffffffffa01021e7>] [<ffffffffa01021e7>] be_xmit+0x937/0x990 [be2net]
[ 180.907261] RSP: 0000:ffff88183fd437a8 EFLAGS: 00010286
[ 180.913255] RAX: ffff8817f1d0d8d0 RBX: 0000000000000042 RCX: 0000000000000042
[ 180.921289] RDX: ffff8817fa221400 RSI: 0000000000e60000 RDI: 0000000000000001
[ 180.929326] RBP: ffff88183fd43818 R08: 00000000000000ff R09: 0000000000000000
[ 180.937361] R10: ffffffff815bd7e7 R11: ffffffff81c25260 R12: 0000000000000001
[ 180.945394] R13: 00000017fa2214ee R14: ffff8817fa0dd200 R15: ffff8817f1d0d8c0
[ 180.953456] FS: 00007fdf73254700(0000) GS:ffff88183fd40000(0000) knlGS:0000000000000000
[ 180.962614] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 180.969118] CR2: 00007fdf37fff000 CR3: 0000000bfb2fb000 CR4: 00000000000007e0
[ 180.977180] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 180.985243] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[ 180.993303] Stack:
[ 180.995626] ffff8817fb980000 00ff00100002d200 0000000200020100 ffff8817fceca098
[ 181.004239] ffff8817f1d00000 ffff881800000003 ffffffff815cd03f ffff8817fceca000
[ 181.012852] 0000ffff81cf8810 0000000000000042 ffff8817f1d00000 0000000000000012
[ 181.021464] Call Trace:
[ 181.024265] <IRQ>
[ 181.026463] [<ffffffff815cd03f>] ? dev_queue_xmit_nit+0x1af/0x220
[ 181.033799] [<ffffffff815cf8d8>] dev_hard_start_xmit+0x318/0x560
[ 181.040697] [<ffffffff815ed086>] sch_direct_xmit+0xe6/0x1c0
[ 181.047106] [<ffffffff815cfd1e>] dev_queue_xmit+0x1fe/0x4c0
[ 181.053518] [<ffffffffa014dc4e>] vlan_dev_hard_start_xmit+0x8e/0x120 [8021q]
[ 181.061580] [<ffffffff815cf8d8>] dev_hard_start_xmit+0x318/0x560
[ 181.068476] [<ffffffff815cfe63>] dev_queue_xmit+0x343/0x4c0
[ 181.074889] [<ffffffff815bf8ae>] ? __alloc_skb+0x7e/0x2b0
[ 181.081105] [<ffffffff81605de9>] ip_finish_output+0x2c9/0x380
[ 181.087709] [<ffffffff81607188>] ip_output+0x58/0x90
[ 181.093437] [<ffffffff816068f5>] ip_local_out+0x25/0x30
[ 181.099458] [<ffffffff81606c4a>] ip_queue_xmit+0x14a/0x3f0
[ 181.105773] [<ffffffff8161e18d>] tcp_transmit_skb+0x44d/0x8b0
[ 181.112378] [<ffffffff81620dc4>] tcp_send_ack+0xa4/0xf0
[ 181.118399] [<ffffffff8161417e>] __tcp_ack_snd_check+0x5e/0xa0
[ 181.125102] [<ffffffff8161af0a>] tcp_rcv_established+0x3aa/0x8a0
[ 181.131999] [<ffffffff81625794>] tcp_v4_do_rcv+0x1b4/0x470
[ 181.138312] [<ffffffff81626db6>] tcp_v4_rcv+0x676/0x7d0
[ 181.144337] [<ffffffff816018d8>] ip_local_deliver_finish+0xe8/0x230
[ 181.151525] [<ffffffff81601bb7>] ip_local_deliver+0x47/0x80
[ 181.157935] [<ffffffff81601548>] ip_rcv_finish+0x78/0x320
[ 181.164150] [<ffffffff81601e09>] ip_rcv+0x219/0x360
[ 181.169782] [<ffffffff815cdce6>] __netif_receive_skb_core+0x656/0x820
[ 181.177165] [<ffffffff815cdec8>] __netif_receive_skb+0x18/0x60
[ 181.183866] [<ffffffff815cdf33>] netif_receive_skb+0x23/0x90
[ 181.190372] [<ffffffff815ce03c>] napi_gro_complete+0x9c/0xd0
[ 181.196880] [<ffffffff815ce0dd>] napi_gro_flush+0x6d/0x90
[ 181.204426] [<ffffffff815ce11e>] napi_complete+0x1e/0x50
[ 181.210545] [<ffffffffa0105703>] be_poll+0x7a3/0x830 [be2net]
[ 181.217149] [<ffffffff815ce26c>] net_rx_action+0x11c/0x240
[ 181.223466] [<ffffffff8105f6f7>] __do_softirq+0xf7/0x240
[ 181.229584] [<ffffffff8105f9ed>] irq_exit+0xcd/0xe0
[ 181.235217] [<ffffffff816df056>] do_IRQ+0x56/0xc0
[ 181.240655] [<ffffffff816d4d2d>] common_interrupt+0x6d/0x6d
[ 181.247063] <EOI>
[ 181.249264] Code: 08 e9 76 fc ff ff 4c 8b 1d 47 82 b1 e1 e9 f1 fd ff ff 48 8b 15 3b 82 b1 e1 e9 53 fe ff ff 4c 89 f7 e8 6e e8 4b e1 e9 41 f7 ff ff <0f> 0b 80 7e 06 11 0f 94 c1 eb bf be 74 07 00 00 48 c7 c7 d0 14
[ 181.274576] RIP [<ffffffffa01021e7>] be_xmit+0x937/0x990 [be2net]
[ 181.281628] RSP <ffff88183fd437a8>
[ 181.285666] ---[ end trace 5940288cde6a394e ]---
[ 181.290960] Kernel panic - not syncing: Fatal exception in interrupt
[ 181.298670] ------------[ cut here ]------------
[ 181.303913] WARNING: at /build/buildd/linux-3.10.0/arch/x86/kernel/smp.c:123 native_smp_send_reschedule+0x5a/0x60()
[ 181.315693] Modules linked in: 8021q(F) garp(F) stp(F) mrp(F) llc(F) bnx2x libcrc32c(F) sfc mdio mtd be2net intel_powerclamp coretemp kvm_intel kvm crc32_pclmul(F) ghash_clmulni_intel(F) aesni_intel(F) aes_x86_64(F) lrw(F) gf128mul(F) glue_helper(F) ablk_helper(F) cryptd(F) gpio_ich i7core_edac edac_core microcode(F) lpc_ich serio_raw(F) mac_hid squashfs(F) aufs igb i2c_algo_bit dca bcache(F) ptp(F) ahci(F) libahci(F) hpsa pps_core(F)
[ 181.361338] CPU: 22 PID: 24227 Comm: beam.smp Tainted: GF D I 3.10.0-3-generic #12-Ubuntu
[ 181.371270] Hardware name: HP ProLiant DL180 G6 , BIOS O20 09/01/2011
[ 181.378648] 0000000000000009 ffff88183fd43290 ffffffff816cdd3f ffff88183fd432c8
[ 181.387251] ffffffff81056c91 0000000000000000 ffff88183fd54480 00000000ffff8aef
[ 181.395852] ffff880c0fc14480 0000000000000016 ffff88183fd432d8 ffffffff81056d6a
[ 181.404452] Call Trace:
[ 181.407258] <IRQ> [<ffffffff816cdd3f>] dump_stack+0x19/0x1b
[ 181.413874] [<ffffffff81056c91>] warn_slowpath_common+0x61/0x80
[ 181.420668] [<ffffffff81056d6a>] warn_slowpath_null+0x1a/0x20
[ 181.427268] [<ffffffff81035e4a>] native_smp_send_reschedule+0x5a/0x60
[ 181.434649] [<ffffffff8109804b>] trigger_load_balance+0x16b/0x200
[ 181.441640] [<ffffffff8108b982>] scheduler_tick+0x102/0x150
[ 181.448048] [<ffffffff81068826>] update_process_times+0x66/0x80
[ 181.454846] [<ffffffff810b2725>] tick_sched_handle.isra.15+0x25/0x60
[ 181.462129] [<ffffffff810b2941>] tick_sched_timer+0x41/0x60
[ 181.468537] [<ffffffff810804c9>] __run_hrtimer+0x79/0x1d0
[ 181.474748] [<ffffffff810b2900>] ? tick_sched_do_timer+0x50/0x50
[ 181.481640] [<ffffffff81080ca7>] hrtimer_interrupt+0xf7/0x240
[ 181.488242] [<ffffffff816df129>] smp_apic_timer_interrupt+0x69/0x9c
[ 181.495427] [<ffffffff816ddf9d>] apic_timer_interrupt+0x6d/0x80
[ 181.502223] [<ffffffff816c7a1a>] ? panic+0x17a/0x1be
[ 181.507950] [<ffffffff816d5c83>] oops_end+0xe3/0xf0
[ 181.513582] [<ffffffff81015ddb>] die+0x4b/0x70
[ 181.518724] [<ffffffff816d5470>] do_trap+0x60/0x170
[ 181.524353] [<ffffffff810133d8>] do_invalid_op+0xa8/0xe0
[ 181.530468] [<ffffffffa01021e7>] ? be_xmit+0x937/0x990 [be2net]
[ 181.537265] [<ffffffff8108f365>] ? __vtime_account_system+0x35/0x40
[ 181.544450] [<ffffffff815bd7e7>] ? kfree_skbmem+0x37/0x90
[ 181.550661] [<ffffffff816de55e>] invalid_op+0x1e/0x30
[ 181.556483] [<ffffffff815bd7e7>] ? kfree_skbmem+0x37/0x90
[ 181.562695] [<ffffffffa01021e7>] ? be_xmit+0x937/0x990 [be2net]
[ 181.569482] [<ffffffffa0102043>] ? be_xmit+0x793/0x990 [be2net]
[ 181.576278] [<ffffffff815cd03f>] ? dev_queue_xmit_nit+0x1af/0x220
[ 181.583267] [<ffffffff815cf8d8>] dev_hard_start_xmit+0x318/0x560
[ 181.590159] [<ffffffff815ed086>] sch_direct_xmit+0xe6/0x1c0
[ 181.596564] [<ffffffff815cfd1e>] dev_queue_xmit+0x1fe/0x4c0
[ 181.602969] [<ffffffffa014dc4e>] vlan_dev_hard_start_xmit+0x8e/0x120 [8021q]
[ 181.611028] [<ffffffff815cf8d8>] dev_hard_start_xmit+0x318/0x560
[ 181.617921] [<ffffffff815cfe63>] dev_queue_xmit+0x343/0x4c0
[ 181.624327] [<ffffffff815bf8ae>] ? __alloc_skb+0x7e/0x2b0
[ 181.630531] [<ffffffff81605de9>] ip_finish_output+0x2c9/0x380
[ 181.637132] [<ffffffff81607188>] ip_output+0x58/0x90
[ 181.642857] [<ffffffff816068f5>] ip_local_out+0x25/0x30
[ 181.648874] [<ffffffff81606c4a>] ip_queue_xmit+0x14a/0x3f0
[ 181.655182] [<ffffffff8161e18d>] tcp_transmit_skb+0x44d/0x8b0
[ 181.661782] [<ffffffff81620dc4>] tcp_send_ack+0xa4/0xf0
[ 181.667799] [<ffffffff8161417e>] __tcp_ack_snd_check+0x5e/0xa0
[ 181.674497] [<ffffffff8161af0a>] tcp_rcv_established+0x3aa/0x8a0
[ 181.681390] [<ffffffff81625794>] tcp_v4_do_rcv+0x1b4/0x470
[ 181.687700] [<ffffffff81626db6>] tcp_v4_rcv+0x676/0x7d0
[ 181.693710] [<ffffffff816018d8>] ip_local_deliver_finish+0xe8/0x230
[ 181.700886] [<ffffffff81601bb7>] ip_local_deliver+0x47/0x80
[ 181.707293] [<ffffffff81601548>] ip_rcv_finish+0x78/0x320
[ 181.713496] [<ffffffff81601e09>] ip_rcv+0x219/0x360
[ 181.719125] [<ffffffff815cdce6>] __netif_receive_skb_core+0x656/0x820
[ 181.726504] [<ffffffff815cdec8>] __netif_receive_skb+0x18/0x60
[ 181.733202] [<ffffffff815cdf33>] netif_receive_skb+0x23/0x90
[ 181.739697] [<ffffffff815ce03c>] napi_gro_complete+0x9c/0xd0
[ 181.746199] [<ffffffff815ce0dd>] napi_gro_flush+0x6d/0x90
[ 181.752411] [<ffffffff815ce11e>] napi_complete+0x1e/0x50
[ 181.758525] [<ffffffffa0105703>] be_poll+0x7a3/0x830 [be2net]
[ 181.765126] [<ffffffff815ce26c>] net_rx_action+0x11c/0x240
[ 181.771435] [<ffffffff8105f6f7>] __do_softirq+0xf7/0x240
[ 181.777542] [<ffffffff8105f9ed>] irq_exit+0xcd/0xe0
[ 181.783171] [<ffffffff816df056>] do_IRQ+0x56/0xc0
[ 181.788605] [<ffffffff816d4d2d>] common_interrupt+0x6d/0x6d
[ 181.795010] <EOI>
[ 181.797208] ---[ end trace 5940288cde6a394f ]---
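
For anyone triaging similar traces, here is a minimal Python sketch that pulls the key fields out of the oops above. The two input strings are copied from the log; the helper names (`parse_rip`, `faulting_bytes`) and the regular expression are illustrative assumptions, not part of any kernel tooling. The bytes the kernel marks with `<..>` in the `Code:` line are `0f 0b`, which is the x86 `ud2` instruction. That is what `BUG()` compiles to on x86, and it matches the "invalid opcode" message in the trace.

```python
import re

# Both lines are copied verbatim from the oops in this report.
RIP_LINE = "[ 181.274576] RIP [<ffffffffa01021e7>] be_xmit+0x937/0x990 [be2net]"
CODE_LINE = (
    "[ 181.249264] Code: 08 e9 76 fc ff ff 4c 8b 1d 47 82 b1 e1 e9 f1 fd ff ff "
    "48 8b 15 3b 82 b1 e1 e9 53 fe ff ff 4c 89 f7 e8 6e e8 4b e1 e9 41 f7 ff ff "
    "<0f> 0b 80 7e 06 11 0f 94 c1 eb bf be 74 07 00 00 48 c7 c7 d0 14"
)

# Hypothetical pattern for "RIP [<addr>] symbol+0xoff/0xsize [module]".
RIP_RE = re.compile(
    r"RIP\s+\[<(?P<addr>[0-9a-f]+)>\]\s+"
    r"(?P<sym>[\w.]+)\+0x(?P<off>[0-9a-f]+)/0x(?P<size>[0-9a-f]+)"
    r"(?:\s+\[(?P<mod>\w+)\])?"
)


def parse_rip(line):
    """Return the faulting address, symbol, offset, size and module, or None."""
    m = RIP_RE.search(line)
    if m is None:
        return None
    return {
        "address": int(m.group("addr"), 16),
        "symbol": m.group("sym"),
        "offset": int(m.group("off"), 16),
        "size": int(m.group("size"), 16),
        "module": m.group("mod"),
    }


def faulting_bytes(line, count=2):
    """Return `count` bytes starting at the byte marked <..> in a Code: line."""
    tokens = line.split("Code:", 1)[1].split()
    for i, tok in enumerate(tokens):
        if tok.startswith("<") and tok.endswith(">"):
            raw = [tok.strip("<>")] + tokens[i + 1:i + count]
            return bytes(int(b, 16) for b in raw)
    return None


if __name__ == "__main__":
    rip = parse_rip(RIP_LINE)
    print(rip["module"], rip["symbol"], hex(rip["offset"]), hex(rip["size"]))
    print(faulting_bytes(CODE_LINE).hex())
```

The offset and size only locate the trap inside `be_xmit`; mapping it to a source line still needs the debug symbols for the exact be2net build.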

ProblemType: Bug
DistroRelease: Ubuntu 13.10
Package: linux-image-3.10.0-3-generic 3.10.0-3.12
ProcVersionSignature: Ubuntu 3.10.0-3.12-generic 3.10.1
Uname: Linux 3.10.0-3-generic x86_64
AlsaDevices:
 total 0
 crw-rw---- 1 root audio 116, 1 Jul 17 12:20 seq
 crw-rw---- 1 root audio 116, 33 Jul 17 12:20 timer
AplayDevices: Error: [Errno 2] No such file or directory: 'aplay'
ApportVersion: 2.10.2-0ubuntu4
Architecture: amd64
ArecordDevices: Error: [Errno 2] No such file or directory: 'arecord'
AudioDevicesInUse: Error: command ['fuser', '-v', '/dev/snd/seq', '/dev/snd/timer'] failed with exit code 1:
CRDA: Error: command ['iw', 'reg', 'get'] failed with exit code 1: nl80211 not found.
Date: Wed Jul 17 12:31:06 2013
IwConfig: Error: [Errno 2] No such file or directory: 'iwconfig'
MachineType: HP ProLiant DL180 G6
MarkForUpload: True
PciMultimedia:

ProcEnviron:
 TERM=xterm
 PATH=(custom, no user)
 XDG_RUNTIME_DIR=<set>
 LANG=en_US.UTF-8
 SHELL=/bin/bash
ProcFB:

ProcKernelCmdLine: initrd=netboot/initrd.img-3.10.0-3-generic boot=live debug quickreboot toram netboot modprobe.blacklist=be2net,e1000e,sfc,nouveau fetch=http://10.32.220.10/live/saucy.squashfs console=tty0 console=ttyS1,115200n81 label=precise clocksource=hpet live-netdev=eth0 ethdevice=eth0 ethdevice-timeout=120 BOOT_IMAGE=netboot/vmlinuz-3.10.0-3-generic
RelatedPackageVersions:
 linux-restricted-modules-3.10.0-3-generic N/A
 linux-backports-modules-3.10.0-3-generic N/A
 linux-firmware 1.112
RfKill: Error: [Errno 2] No such file or directory: 'rfkill'
SourcePackage: linux
UpgradeStatus: No upgrade log present (probably fresh install)
dmi.bios.date: 09/01/2011
dmi.bios.vendor: HP
dmi.bios.version: O20
dmi.chassis.type: 23
dmi.chassis.vendor: HP
dmi.modalias: dmi:bvnHP:bvrO20:bd09/01/2011:svnHP:pnProLiantDL180G6:pvr:cvnHP:ct23:cvr:
dmi.product.name: ProLiant DL180 G6
dmi.sys.vendor: HP

Chris Read (chris-read) wrote :

This change was made by a bot.

Changed in linux (Ubuntu):
status: New → Confirmed

Did this issue start happening after an update/upgrade? Was there a kernel version where you were not having this particular problem? This will help determine if the problem you are seeing is the result of the introduction of a regression, and when this regression was introduced. If this is a regression, we can perform a kernel bisect to identify the commit that introduced the problem.

Changed in linux (Ubuntu):
importance: Undecided → High
tags: added: kernel-da-key
Chris Read (chris-read) wrote :

Yes, this started happening after an upgrade.

Currently it all works well with kernel 3.5.0-26-generic in precise. This is the first time we're trying out our configuration on saucy.

I'll try to get our system downgraded to 3.10.0-2 for testing...

Chris Read (chris-read) wrote :

I was not able to try 3.10.0-2.

Joseph Salisbury (jsalisbury) wrote :

Would it be possible for you to test the latest upstream kernel? Refer to https://wiki.ubuntu.com/KernelMainlineBuilds . Please test the latest v3.11 kernel[0].

If this bug is fixed in the mainline kernel, please add the following tag 'kernel-fixed-upstream'.

If the mainline kernel does not fix this bug, please add the tag: 'kernel-bug-exists-upstream'.

If you are unable to test the mainline kernel, for example it will not boot, please add the tag: 'kernel-unable-to-test-upstream'.
Once testing of the upstream kernel is complete, please mark this bug as "Confirmed".

Thanks in advance.

[0] http://kernel.ubuntu.com/~kernel-ppa/mainline/v3.11-rc1-saucy/

Changed in linux (Ubuntu):
status: Confirmed → Incomplete
Chris Read (chris-read) on 2013-07-19
tags: added: kernel-unable-to-test-upstream
Chris Read (chris-read) wrote :

Just tried with kernel 3.10.0-5-generic #14-Ubuntu - same problem.

Chris Read (chris-read) wrote :

Just managed to test with kernel 3.11.0-031100rc2-generic #201307211535 - all is behaving as expected with no crash.

tags: added: kernel-fixed-upstream
removed: kernel-unable-to-test-upstream
Changed in linux (Ubuntu):
status: Incomplete → Confirmed
Chris Read (chris-read) wrote :

I just tested 3.10.0-031000-generic #201306301935. This upstream build also crashes, so I guess the root cause is something in the 3.10 mainline...

summary: - Kernel panic under network load
+ Kernel BUG at ffffffffa01021e7 [verbose debug info unavailable]; RIP:
+ 0010:[<ffffffffa01021e7>] [<ffffffffa01021e7>] be_xmit+0x937/0x990
+ [be2net]
tags: added: regression-release
summary: - Kernel BUG at ffffffffa01021e7 [verbose debug info unavailable]; RIP:
- 0010:[<ffffffffa01021e7>] [<ffffffffa01021e7>] be_xmit+0x937/0x990
- [be2net]
+ 19a2:0710 Kernel BUG at ffffffffa01021e7 [verbose debug info
+ unavailable]; RIP: 0010:[<ffffffffa01021e7>] [<ffffffffa01021e7>]
+ be_xmit+0x937/0x990 [be2net]
tags: added: kernel-fixed-upstream-v3.11-rc1
removed: kernel-fixed-upstream
tags: added: needs-reverse-bisect
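
The test results reported in the comments above can be summarised to show the window a reverse bisect would have to cover. This is only a rough sketch: the version labels and outcomes are copied from the thread, while the coarse `(major, minor, rc)` sort key (with `99` standing for a final release) and the function name `reverse_bisect_range` are assumptions made for illustration. Ubuntu kernels also carry distro patches, so this narrows the range by upstream version only.

```python
# (label, (major, minor, rc), outcome); rc=99 is an assumed marker for a final release.
RESULTS = [
    ("3.5.0-26-generic (precise)", (3, 5, 99), "ok"),
    ("3.10.0-031000-generic #201306301935 (mainline)", (3, 10, 99), "crash"),
    ("3.10.0-3-generic #12-Ubuntu", (3, 10, 99), "crash"),
    ("3.10.0-5-generic #14-Ubuntu", (3, 10, 99), "crash"),
    ("3.11.0-031100rc2-generic #201307211535 (mainline)", (3, 11, 2), "ok"),
]


def reverse_bisect_range(results):
    """Return (newest crashing kernel, oldest working kernel newer than it)."""
    bad = [r for r in results if r[2] == "crash"]
    last_bad = max(bad, key=lambda r: r[1])
    fixed = [r for r in results if r[2] == "ok" and r[1] > last_bad[1]]
    first_good = min(fixed, key=lambda r: r[1])
    return last_bad, first_good


if __name__ == "__main__":
    last_bad, first_good = reverse_bisect_range(RESULTS)
    print("last known bad :", last_bad[0])
    print("first known good:", first_good[0])
```

With the results so far, the fix lies somewhere between upstream 3.10 and 3.11-rc2, which is the range a reverse bisect would search.
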
Joseph Salisbury (jsalisbury) wrote :

Per comment #8, this should be fixed in Saucy, since it has since been rebased onto upstream 3.11. Can you apply the latest updates and see whether this bug still exists?

Changed in linux (Ubuntu):
status: Confirmed → Fix Committed
Chris Read (chris-read) wrote :

Just tested with kernel 3.11.0-2-generic - no crash. Looking good...
