.27 CPU#1 stuck for 113s! [events/1:10]

Bug #262437 reported by Carl Karsten
18
This bug affects 1 person
Affects Status Importance Assigned to Milestone
linux (Ubuntu)
Fix Released
Medium
Chris Coulson

Bug Description

Binary package hint: linux-image-2.6.27-1-generic

carl@dv67:~$ uname -a
Linux dv67 2.6.27-1-generic #1 SMP Sat Aug 23 23:20:09 UTC 2008 i686 GNU/Linux

having various issues booting. here is one:

kernel /boot/vmlinuz-2.6.27-1-generic root=UUID=270934eb-b4c5-4f17-87e4-8ebea9d98ed3 ro vga=6 notsc

full dmesg and dmidecode attached - here is what might be interesting:

[ 3.309004] hub 2-0:1.0: USB hub found
[ 3.309004] hub 2-0:1.0: 2 ports detected
[ 124.028559] BUG: soft lockup - CPU#1 stuck for 113s! [events/1:10]
[ 124.028559] Modules linked in: ohci1394(+) ata_generic(+) ieee1394 forcedeth(+) libata scsi_mod ohci_hcd(+) ehci_hcd(+) dock usbcore thermal processor fan uvesafb cn fuse fbcon tileblit font bitblit softcursor
[ 124.028559]
[ 124.028559] Pid: 10, comm: events/1 Not tainted (2.6.27-1-generic #1)
[ 124.028559] EIP: 0060:[<c0329e48>] EFLAGS: 00000202 CPU: 1
[ 124.028559] EIP is at rt_check_expire+0x88/0x1f0
[ 124.028559] EAX: f707c000 EBX: f6b84408 ECX: 00001999 EDX: 00000000
[ 124.028559] ESI: 000124f8 EDI: 00001102 EBP: f707df64 ESP: f707df44
[ 124.028559] DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068
[ 124.028559] CR0: 8005003b CR2: b7de7000 CR3: 365ef000 CR4: 00000690
[ 124.028559] DR0: 00000000 DR1: 00000000 DR2: 00000000 DR3: 00000000
[ 124.028559] DR6: ffff0ff0 DR7: 00000400
[ 124.028559] [<c0329fb0>] ? rt_worker_func+0x0/0x20
[ 124.028559] [<c0329fb0>] ? rt_worker_func+0x0/0x20
[ 124.028559] [<c0329fbd>] rt_worker_func+0xd/0x20
[ 124.028559] [<c01432b5>] run_workqueue+0x95/0x160
[ 124.028559] [<c01472f6>] ? prepare_to_wait+0x46/0x70
[ 124.028559] [<c0143558>] worker_thread+0x88/0xf0
[ 124.028559] [<c0147120>] ? autoremove_wake_function+0x0/0x50
[ 124.028559] [<c01434d0>] ? worker_thread+0x0/0xf0
[ 124.028559] [<c0146db1>] kthread+0x41/0x80
[ 124.028559] [<c0146d70>] ? kthread+0x0/0x80
[ 124.032559] [<c0105287>] kernel_thread_helper+0x7/0x10
[ 124.032559] =======================
[ 124.032559] BUG: soft lockup - CPU#0 stuck for 113s! [swapper:0]
[ 124.032559] Modules linked in: ohci1394(+) ata_generic(+) ieee1394 forcedeth(+) libata scsi_mod ohci_hcd(+) ehci_hcd(+) dock usbcore thermal processor fan uvesafb cn fuse fbcon tileblit font bitblit softcursor
[ 124.032559]
[ 124.032559] Pid: 0, comm: swapper Not tainted (2.6.27-1-generic #1)
[ 124.032559] EIP: 0060:[<c011ab75>] EFLAGS: 00000246 CPU: 0
[ 124.032559] EIP is at native_safe_halt+0x5/0x10
[ 124.032559] EAX: c04c2000 EBX: 00d70596 ECX: f7154800 EDX: 00001002
[ 124.032559] ESI: f7154bdc EDI: f715481c EBP: c04c3f78 ESP: c04c3f78
[ 124.032559] DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068
[ 124.032559] CR0: 8005003b CR2: 0894ade0 CR3: 370e1000 CR4: 00000690
[ 124.032559] DR0: 00000000 DR1: 00000000 DR2: 00000000 DR3: 00000000
[ 124.032559] DR6: ffff0ff0 DR7: 00000400
[ 124.032559] [<f88734ac>] acpi_idle_do_entry+0x37/0x5c [processor]
[ 124.032559] [<f8873568>] acpi_idle_enter_c1+0x97/0xb8 [processor]
[ 124.032559] [<c02dc4eb>] cpuidle_idle_call+0x7b/0xd0
[ 124.032559] [<c010288d>] cpu_idle+0x7d/0x130
[ 124.032559] [<c0384f13>] rest_init+0x53/0x60
[ 124.032559] =======================
[ 124.271966] ACPI: PCI Interrupt Link [LUS0] enabled at IRQ 18
[ 124.271966] ohci_hcd 0000:00:02.0: PCI INT A -> Link[LUS0] -> GSI 18 (level, low) -> IRQ 18
[ 124.271966] ohci_hcd 0000:00:02.0: setting latency timer to 64
[ 124.271966] ohci_hcd 0000:00:02.0: OHCI Host Controller
[ 124.271966] ohci_hcd 0000:00:02.0: new USB bus registered, assigned bus number 3
[ 124.271966] ohci_hcd 0000:00:02.0: irq 18, io mem 0xf6486000
[ 124.335966] usb usb3: configuration #1 chosen from 1 choice
[ 124.335966] hub 3-0:1.0: USB hub found
[ 124.335966] hub 3-0:1.0: 7 ports detected
[ 124.459966] ACPI: PCI Interrupt Link [Z018] enabled at IRQ 18

Revision history for this message
Carl Karsten (carlfk) wrote :
Revision history for this message
Carl Karsten (carlfk) wrote :
Revision history for this message
Chris Coulson (chrisccoulson) wrote :

Thank you for taking the time to report this bug and helping to make Ubuntu better.

Please could you run the command "sudo lspci -vvnn > lspci-vvnn.log" and attach the resulting file "lspci-vvnn.log" to this bug report.

Thanks in advance.

Changed in linux:
assignee: nobody → chrisccoulson
status: New → Incomplete
Revision history for this message
Carl Karsten (carlfk) wrote :
Revision history for this message
Leann Ogasawara (leannogasawara) wrote :

Hi Carl,

Can you comment if this existed with the 2.6.26 kernel or is this specifically a regression from 2.6.26 to 2.6.27. You should be able to test teh 2.6.26 kernel with the Alpha4 Intrepid release - http://www.ubuntu.com/testing . Thanks.

Revision history for this message
Stefan Bader (smb) wrote :

Is there any NFS mount involved (since the first block seems to occur in a network function and there has been a change upstream for that)?

Otherwise, does "noacpitimer" change something here?

Revision history for this message
Marques Johansson (marques) wrote :

I did not have this CPU stuck problem with 2.6.26-5, but I am seeing it with 2.6.27-1 and 2.6.27-2.

I have tried this kernel on other systems and haven't seen this problem (an older AMD 2200XP+ or on an 3ghz Intel P4-HT).

I'm attaching the dmesg output from 2.6.27-2 and 2.6.26-5 from my AMDx2-64 that is having this CPU stuck problem.

Revision history for this message
Marques Johansson (marques) wrote :

I did not have this CPU stuck problem with 2.6.26-5, but I am seeing it with 2.6.27-1 and 2.6.27-2.

I have tried this kernel on other systems and haven't seen this problem (an older AMD 2200XP+ or on an 3ghz Intel P4-HT).

I'm attaching the dmesg output from 2.6.27-2 and 2.6.26-5 from my AMDx2-64 that is having this CPU stuck problem.

Revision history for this message
Marques Johansson (marques) wrote :
Revision history for this message
Chris Coulson (chrisccoulson) wrote :

Marques - Could you please report your issue in to a separate bug report, as it is a different problem to the one decribed here.

Thanks

Revision history for this message
Marques Johansson (marques) wrote :

I already tried posting at https://bugs.launchpad.net/ubuntu/+source/linux/+bug/254668 (ACPI hangs on boot). I think my problem is the one described here in this bug. "swapper" is getting hung for me as the original poster describes in his dmesg log.

I booted with noapictimer and did not get the CPU stuck errors / delays.

I'm attaching my dmesg log from the noapictimer boot-up. Let me know if you still think this is unrelated.

Revision history for this message
Stefan Bader (smb) wrote :

@Marques,

your problem might be related to a bug found in the hpet code. That should actually work with the kernel at http://people.ubuntu.com/~smb/bug254668/linux-image-2.6.27-3-generic_2.6.27-3.3smb2_i386.deb

There were also other problems in related places and I am currently building a kernel with all of them in. But maybe you have the time to check whether this single patch already helps in your case.

Revision history for this message
Marques Johansson (marques) wrote :
Download full text (3.9 KiB)

I'm running the amd64 arch, so that kernel won't work here.

I can boot with "noapictimer", and also with "nohz=off highres=off apci=debug debug". I'm attaching the dmesg from the latter boot options. (highres=off did not work by itself, not did nohz=off )

Here are the contents of some other files I've heard discussed in these threads (during the same boot as the dmesg, sorry I couldn't wait to capture these files on a CPU stuck boot):

displague@split:~$ cat /sys/devices/system/clocksource/clocksource0/*
hpet acpi_pm jiffies tsc
hpet

displague@split:~$ cat /proc/timer_list
Timer List Version: v0.3
HRTIMER_MAX_CLOCK_BASES: 2
now at 474426458770 nsecs

cpu: 0
 clock 0:
  .index: 0
  .resolution: 4000250 nsecs
  .get_time: ktime_get_real
  .offset: 0 nsecs
active timers:
 clock 1:
  .index: 1
  .resolution: 4000250 nsecs
  .get_time: ktime_get
  .offset: 0 nsecs
active timers:
 #0: <ffff88001943bd48>, hrtimer_wakeup, S:01, futex_wait, firefox/6746
 # expires at 474458683930 nsecs [in 32225160 nsecs]
 #1: <ffff88001943bd48>, it_real_fn, S:01, do_setitimer, syslogd/4925
 # expires at 487622761050 nsecs [in 13196302280 nsecs]
 #2: <ffff88001943bd48>, hrtimer_wakeup, S:01, futex_wait, firefox/6883
 # expires at 518020472610 nsecs [in 43594013840 nsecs]
 #3: <ffff88001943bd48>, hrtimer_wakeup, S:01, do_nanosleep, atd/5528
 # expires at 3653344476130 nsecs [in 3178918017360 nsecs]
  .expires_next : 9223372036854775807 nsecs
  .hres_active : 0
  .nr_events : 0
  .nohz_mode : 0
  .idle_tick : 0 nsecs
  .tick_stopped : 0
  .idle_jiffies : 0
  .idle_calls : 0
  .idle_sleeps : 0
  .idle_entrytime : 474421270370 nsecs
  .idle_waketime : 0 nsecs
  .idle_exittime : 0 nsecs
  .idle_sleeptime : 437221753573 nsecs
  .last_jiffies : 0
  .next_jiffies : 0
  .idle_expires : 0 nsecs
jiffies: 4295010841

cpu: 1
 clock 0:
  .index: 0
  .resolution: 4000250 nsecs
  .get_time: ktime_get_real
  .offset: 0 nsecs
active timers:
 clock 1:
  .index: 1
  .resolution: 4000250 nsecs
  .get_time: ktime_get
  .offset: 0 nsecs
active timers:
 #0: <ffff88001943bd48>, hrtimer_wakeup, S:01, futex_wait, gvfsd-trash/6912
 # expires at 474692183810 nsecs [in 265725040 nsecs]
 #1: <ffff88001943bd48>, hrtimer_wakeup, S:01, futex_wait, firefox/6874
 # expires at 492439331930 nsecs [in 18012873160 nsecs]
 #2: <ffff88001943bd48>, hrtimer_wakeup, S:01, do_nanosleep, cron/5558
 # expires at 519830787730 nsecs [in 45404328960 nsecs]
  .expires_next : 9223372036854775807 nsecs
  .hres_active : 0
  .nr_events : 0
  .nohz_mode : 0
  .idle_tick : 0 nsecs
  .tick_stopped : 0
  .idle_jiffies : 0
  .idle_calls : 0
  .idle_sleeps : 0
  .idle_entrytime : 474426547050 nsecs
  .idle_waketime : 0 nsecs
  .idle_exittime : 0 nsecs
  .idle_sleeptime : 395247480831 nsecs
  .last_jiffies : 0
  .next_jiffies : 0
  .idle_expires : 0 nsecs
jiffies: 4295010841

Tick Device: mode: 0
Clock Event Device: hpet
 max_delta_ns: 85899346200
 min_delta_ns: 1920
 mult: 107374182
 shift: 32
 mode: 2
 next_event: 0 nsecs
 set_next_event: hpet_legacy_next_even...

Read more...

Revision history for this message
Stefan Bader (smb) wrote :

I just put a amd64 bit version of the smb5 kernel online.

Revision history for this message
Marques Johansson (marques) wrote :

Your smb5 kernel build boots in a normal amount of time and seems to be functioning well. I'm attaching the dmesg output.

Revision history for this message
Stefan Bader (smb) wrote :

I am marking this bug as fix committed since Intrepid has been rebased to 2.6.27-rc6 which contains the fixes.

Changed in linux:
importance: Undecided → Medium
milestone: none → intrepid-alpha-6
status: Incomplete → Fix Committed
Revision history for this message
C de-Avillez (hggdh2) wrote :

I cannot experience this anymore with 2.6.27-3.

Revision history for this message
Leann Ogasawara (leannogasawara) wrote :

Setting this to "Fix Released" based on the decision to stick with the 2.6.27 kernel for Intrepid - https://lists.ubuntu.com/archives/kernel-team/2008-September/003107.html

Changed in linux:
status: Fix Committed → Fix Released
Revision history for this message
mlissner (mlissner-michaeljaylissner) wrote :

I appear to be encountering this bug on 2.6.27-7. Pretty frequently, certain applications hang, seemingly at random. I've been trying to keep my eye on them in the System Monitor, and whenever it happens, the waiting channel reports "futex_wait".

I have been searching a bit for a fix for this error, which began when I updated to Intrepid. So far, I've come up with little.

I also am encountering problems with hrtimer_nanosleep, which causes gksu to fail. Not sure if that's related.

To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.