Comment 57 for bug 245779

Warren V (verbanista) wrote : Re: [Bug 245779] Re: Server 8.04 LTS: soft lockup - CPU#1 stuck for 11s! [bond1:3795] - bond - bond0

At one point I tracked down a kernel.org post about the root cause of the
problem, but I can't remember the details. I do recall that it was due to a
mistake by one of the kernel devs.

-W

On Wed, Jul 22, 2009 at 2:36 PM, Ryan Lovett <email address hidden> wrote:

> On Wed, Jul 22, 2009 at 06:23:01PM -0000, Warren V wrote:
> > Upgrade your kernel to 2.6.28. CentOS is now on 2.6.28-128; I noted that
> > the problem went away around 2.6.28-92.
> >
> > Ubuntu is stuck with whatever is currently out.
>
> Do you know which patch addressed the issue? If so, the Ubuntu kernel devs
> might be able to backport it to the LTS release.
>
> Ryan
>
> --
> Server 8.04 LTS: soft lockup - CPU#1 stuck for 11s! [bond1:3795] - bond -
> bond0
> https://bugs.launchpad.net/bugs/245779
>
> Status in “linux” package in Ubuntu: Incomplete
> Status in “linux” package in Debian: Fix Released
>
> Bug description:
> Hi!
> Ubuntu Server 8.04 LTS with all patches and the latest kernel
> Hardware: HP DL360 G4 Xeon
> Bonding with :
> - bond0 2x1Gb Intel (802.3ad / 4)
> - bond1 8x1Gb Intel (802.3ad / 4)
> Nagios (only nrpe and plugin)
> Heartbeat2 (without CRM)
> Vlan
>
> Today it crashed (after two weeks of uptime since the kernel upgrade) with this output:
>
> 6640927 firewall 11:46:54 kernel: [431168.944816] BUG: soft lockup - CPU#1
> stuck for 11s! [bond1:3795]
> 6640928 firewall 11:46:54 kernel: [431168.944849]
> 6640929 firewall 11:46:54 kernel: [431168.944853] Pid: 3795, comm: bond1
> Not tainted (2.6.24-19-server #1)
> 6640930 firewall 11:46:54 kernel: [431168.944856] EIP:
> 0060:[ipv6:_spin_lock+0xa/0x10] EFLAGS: 00000286 CPU: 1
> 6640931 firewall 11:46:54 kernel: [431168.944865] EIP is at
> _spin_lock+0xa/0x10
> 6640932 firewall 11:46:54 kernel: [431168.944867] EAX: f749f334 EBX:
> f749f25c ECX: 00000001 EDX: f749f25c
> 6640933 firewall 11:46:54 kernel: [431168.944870] ESI: 00000000 EDI:
> f7ca1000 EBP: f6c35c80 ESP: f6835cc0
> 6640934 firewall 11:46:54 kernel: [431168.944872] DS: 007b ES: 007b FS:
> 00d8 GS: 0000 SS: 0068
> 6640935 firewall 11:46:54 kernel: [431168.944875] CR0: 8005003b CR2:
> b7bfd0a0 CR3: 35908000 CR4: 000006b0
> 6640936 firewall 11:46:54 kernel: [431168.944878] DR0: 00000000 DR1:
> 00000000 DR2: 00000000 DR3: 00000000
> 6640937 firewall 11:46:54 kernel: [431168.944880] DR6: ffff0ff0 DR7:
> 00000400
> 6640938 firewall 11:46:54 kernel: [431168.944887] [<f8b67606>]
> ad_rx_machine+0x26/0x690 [bonding]
> 6640939 firewall 11:46:54 kernel: [431168.944899]
> [nf_nat:_read_lock_bh+0x8/0x50] _read_lock_bh+0x8/0x20
> 6640940 firewall 11:46:54 kernel: [431168.944920] [arp_process+0x8b/0x5f0]
> arp_process+0x8b/0x5f0
> 6640941 firewall 11:46:54 kernel: [431168.944930] [<f8b67e6a>]
> bond_3ad_lacpdu_recv+0x1fa/0x240 [bonding]
> 6640942 firewall 11:46:54 kernel: [431168.944946]
> [ip_local_deliver_finish+0xf9/0x210] ip_local_deliver_finish+0xf9/0x210
> 6640943 firewall 11:46:54 kernel: [431168.944955]
> [ip_rcv_finish+0xff/0x370] ip_rcv_finish+0xff/0x370
> 6640944 firewall 11:46:54 kernel: [431168.944960]
> [sock_def_write_space+0x12/0xa0] sock_def_write_space+0x12/0xa0
> 6640945 firewall 11:46:54 kernel: [431168.944968] [<f8967a4b>]
> e1000_alloc_rx_buffers+0xab/0x3a0 [e1000]
> 6640946 firewall 11:46:54 kernel: [431168.944982] [arp_rcv+0x0/0x140]
> arp_rcv+0x0/0x140
> 6640947 firewall 11:46:54 kernel: [431168.944994]
> [e1000:__netdev_alloc_skb+0x22/0x2a80] __netdev_alloc_skb+0x22/0x50
> 6640948 firewall 11:46:54 kernel: [431168.945000] [<f8b67c70>]
> bond_3ad_lacpdu_recv+0x0/0x240 [bonding]
> 6640949 firewall 11:46:54 kernel: [431168.945011]
> [tg3:netif_receive_skb+0x379/0x720] netif_receive_skb+0x379/0x440
> 6640950 firewall 11:46:54 kernel: [431168.945024] [<f8968474>]
> e1000_clean_rx_irq+0x174/0x500 [e1000]
> 6640951 firewall 11:46:54 kernel: [431168.945037] [<f8968378>]
> e1000_clean_rx_irq+0x78/0x500 [e1000]
> 6640952 firewall 11:46:54 kernel: [431168.945059] [<f8968300>]
> e1000_clean_rx_irq+0x0/0x500 [e1000]
> 6640953 firewall 11:46:54 kernel: [431168.945071] [<f896569e>]
> e1000_clean+0x5e/0x250 [e1000]
> 6640954 firewall 11:46:54 kernel: [431168.945085]
> [net_rx_action+0x12d/0x210] net_rx_action+0x12d/0x210
> 6640955 firewall 11:46:54 kernel: [431168.945099] [__do_softirq+0x82/0x110]
> __do_softirq+0x82/0x110
> 6640956 firewall 11:46:54 kernel: [431168.945109] [do_softirq+0x55/0x60]
> do_softirq+0x55/0x60
> 6640957 firewall 11:46:54 kernel: [431168.945113] [irq_exit+0x6d/0x80]
> irq_exit+0x6d/0x80
> 6640958 firewall 11:46:54 kernel: [431168.945117] [do_IRQ+0x40/0x70]
> do_IRQ+0x40/0x70
> 6640959 firewall 11:46:54 kernel: [431168.945121]
> [find_busiest_group+0x1bd/0x760] find_busiest_group+0x1bd/0x760
> 6640960 firewall 11:46:54 kernel: [431168.945130]
> [common_interrupt+0x23/0x28] common_interrupt+0x23/0x28
> 6640961 firewall 11:46:54 kernel: [431168.945142] [<f897007b>]
> e1000_init_hw+0x34b/0xb50 [e1000]
> 6640962 firewall 11:46:54 kernel: [431168.945156]
> [ipv6:_spin_lock+0x3/0x10] _spin_lock+0x3/0x10
> 6640963 firewall 11:46:54 kernel: [431168.945163] [<f8b67606>]
> ad_rx_machine+0x26/0x690 [bonding]
> 6640964 firewall 11:46:54 kernel: [431168.945179]
> [lock_timer_base+0x27/0x60] lock_timer_base+0x27/0x60
> 6640965 firewall 11:46:54 kernel: [431168.945183]
> [delayed_work_timer_fn+0x0/0x20] delayed_work_timer_fn+0x0/0x20
> 6640966 firewall 11:46:54 kernel: [431168.945194] [<f8b68290>]
> bond_3ad_state_machine_handler+0xf0/0x9b0 [bonding]
> 6640967 firewall 11:46:54 kernel: [431168.945206]
> [queue_delayed_work_on+0x7c/0xb0] queue_delayed_work_on+0x7c/0xb0
> 6640968 firewall 11:46:54 kernel: [431168.945214]
> [usbcore:queue_delayed_work+0x51/0x70] queue_delayed_work+0x51/0x70
> 6640969 firewall 11:46:54 kernel: [431168.945221] [<f8b681a0>]
> bond_3ad_state_machine_handler+0x0/0x9b0 [bonding]
> 6640970 firewall 11:46:54 kernel: [431168.945229]
> [run_workqueue+0xbf/0x160] run_workqueue+0xbf/0x160
> 6640971 firewall 11:46:54 kernel: [431168.945240] [worker_thread+0x0/0xe0]
> worker_thread+0x0/0xe0
> 6640972 firewall 11:46:54 kernel: [431168.945245] [worker_thread+0x84/0xe0]
> worker_thread+0x84/0xe0
> 6640973 firewall 11:46:54 kernel: [431168.945251] [<c0145fc0>]
> autoremove_wake_function+0x0/0x40
> 6640974 firewall 11:46:54 kernel: [431168.945260] [worker_thread+0x0/0xe0]
> worker_thread+0x0/0xe0
> 6640975 firewall 11:46:54 kernel: [431168.945265] [kthread+0x42/0x70]
> kthread+0x42/0x70
> 6640976 firewall 11:46:54 kernel: [431168.945269] [kthread+0x0/0x70]
> kthread+0x0/0x70
> 6640977 firewall 11:46:54 kernel: [431168.945274]
> [kernel_thread_helper+0x7/0x10] kernel_thread_helper+0x7/0x10
> 6640978 firewall 11:46:54 kernel: [431168.945284] =======================
>
> Can you help me?
>
> Thanks very much
>
> ---
> Sim
>