Ubuntu

.27 CPU#1 stuck for 113s! [events/1:10]

Reported by Carl Karsten on 2008-08-28
18
This bug affects 1 person
Affects Status Importance Assigned to Milestone
linux (Ubuntu)
Medium
Chris Coulson

Bug Description

Binary package hint: linux-image-2.6.27-1-generic

carl@dv67:~$ uname -a
Linux dv67 2.6.27-1-generic #1 SMP Sat Aug 23 23:20:09 UTC 2008 i686 GNU/Linux

having various issues booting. here is one:

kernel /boot/vmlinuz-2.6.27-1-generic root=UUID=270934eb-b4c5-4f17-87e4-8ebea9d98ed3 ro vga=6 notsc

full dmesg and dmidecode attached - here is what might be interesting:

[ 3.309004] hub 2-0:1.0: USB hub found
[ 3.309004] hub 2-0:1.0: 2 ports detected
[ 124.028559] BUG: soft lockup - CPU#1 stuck for 113s! [events/1:10]
[ 124.028559] Modules linked in: ohci1394(+) ata_generic(+) ieee1394 forcedeth(+) libata scsi_mod ohci_hcd(+) ehci_hcd(+) dock usbcore thermal processor fan uvesafb cn fuse fbcon tileblit font bitblit softcursor
[ 124.028559]
[ 124.028559] Pid: 10, comm: events/1 Not tainted (2.6.27-1-generic #1)
[ 124.028559] EIP: 0060:[<c0329e48>] EFLAGS: 00000202 CPU: 1
[ 124.028559] EIP is at rt_check_expire+0x88/0x1f0
[ 124.028559] EAX: f707c000 EBX: f6b84408 ECX: 00001999 EDX: 00000000
[ 124.028559] ESI: 000124f8 EDI: 00001102 EBP: f707df64 ESP: f707df44
[ 124.028559] DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068
[ 124.028559] CR0: 8005003b CR2: b7de7000 CR3: 365ef000 CR4: 00000690
[ 124.028559] DR0: 00000000 DR1: 00000000 DR2: 00000000 DR3: 00000000
[ 124.028559] DR6: ffff0ff0 DR7: 00000400
[ 124.028559] [<c0329fb0>] ? rt_worker_func+0x0/0x20
[ 124.028559] [<c0329fb0>] ? rt_worker_func+0x0/0x20
[ 124.028559] [<c0329fbd>] rt_worker_func+0xd/0x20
[ 124.028559] [<c01432b5>] run_workqueue+0x95/0x160
[ 124.028559] [<c01472f6>] ? prepare_to_wait+0x46/0x70
[ 124.028559] [<c0143558>] worker_thread+0x88/0xf0
[ 124.028559] [<c0147120>] ? autoremove_wake_function+0x0/0x50
[ 124.028559] [<c01434d0>] ? worker_thread+0x0/0xf0
[ 124.028559] [<c0146db1>] kthread+0x41/0x80
[ 124.028559] [<c0146d70>] ? kthread+0x0/0x80
[ 124.032559] [<c0105287>] kernel_thread_helper+0x7/0x10
[ 124.032559] =======================
[ 124.032559] BUG: soft lockup - CPU#0 stuck for 113s! [swapper:0]
[ 124.032559] Modules linked in: ohci1394(+) ata_generic(+) ieee1394 forcedeth(+) libata scsi_mod ohci_hcd(+) ehci_hcd(+) dock usbcore thermal processor fan uvesafb cn fuse fbcon tileblit font bitblit softcursor
[ 124.032559]
[ 124.032559] Pid: 0, comm: swapper Not tainted (2.6.27-1-generic #1)
[ 124.032559] EIP: 0060:[<c011ab75>] EFLAGS: 00000246 CPU: 0
[ 124.032559] EIP is at native_safe_halt+0x5/0x10
[ 124.032559] EAX: c04c2000 EBX: 00d70596 ECX: f7154800 EDX: 00001002
[ 124.032559] ESI: f7154bdc EDI: f715481c EBP: c04c3f78 ESP: c04c3f78
[ 124.032559] DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068
[ 124.032559] CR0: 8005003b CR2: 0894ade0 CR3: 370e1000 CR4: 00000690
[ 124.032559] DR0: 00000000 DR1: 00000000 DR2: 00000000 DR3: 00000000
[ 124.032559] DR6: ffff0ff0 DR7: 00000400
[ 124.032559] [<f88734ac>] acpi_idle_do_entry+0x37/0x5c [processor]
[ 124.032559] [<f8873568>] acpi_idle_enter_c1+0x97/0xb8 [processor]
[ 124.032559] [<c02dc4eb>] cpuidle_idle_call+0x7b/0xd0
[ 124.032559] [<c010288d>] cpu_idle+0x7d/0x130
[ 124.032559] [<c0384f13>] rest_init+0x53/0x60
[ 124.032559] =======================
[ 124.271966] ACPI: PCI Interrupt Link [LUS0] enabled at IRQ 18
[ 124.271966] ohci_hcd 0000:00:02.0: PCI INT A -> Link[LUS0] -> GSI 18 (level, low) -> IRQ 18
[ 124.271966] ohci_hcd 0000:00:02.0: setting latency timer to 64
[ 124.271966] ohci_hcd 0000:00:02.0: OHCI Host Controller
[ 124.271966] ohci_hcd 0000:00:02.0: new USB bus registered, assigned bus number 3
[ 124.271966] ohci_hcd 0000:00:02.0: irq 18, io mem 0xf6486000
[ 124.335966] usb usb3: configuration #1 chosen from 1 choice
[ 124.335966] hub 3-0:1.0: USB hub found
[ 124.335966] hub 3-0:1.0: 7 ports detected
[ 124.459966] ACPI: PCI Interrupt Link [Z018] enabled at IRQ 18

Carl Karsten (carlfk) wrote :
Carl Karsten (carlfk) wrote :
Chris Coulson (chrisccoulson) wrote :

Thank you for taking the time to report this bug and helping to make Ubuntu better.

Please could you run the command "sudo lspci -vvnn > lspci-vvnn.log" and attach the resulting file "lspci-vvnn.log" to this bug report.

Thanks in advance.

Changed in linux:
assignee: nobody → chrisccoulson
status: New → Incomplete
Carl Karsten (carlfk) wrote :

Hi Carl,

Can you comment if this existed with the 2.6.26 kernel or is this specifically a regression from 2.6.26 to 2.6.27. You should be able to test teh 2.6.26 kernel with the Alpha4 Intrepid release - http://www.ubuntu.com/testing . Thanks.

Stefan Bader (smb) wrote :

Is there any NFS mount involved (since the first block seems to occur in a network function and there has been a change upstream for that)?

Otherwise, does "noacpitimer" change something here?

Marques Johansson (marques) wrote :

I did not have this CPU stuck problem with 2.6.26-5, but I am seeing it with 2.6.27-1 and 2.6.27-2.

I have tried this kernel on other systems and haven't seen this problem (an older AMD 2200XP+ or on an 3ghz Intel P4-HT).

I'm attaching the dmesg output from 2.6.27-2 and 2.6.26-5 from my AMDx2-64 that is having this CPU stuck problem.

Marques Johansson (marques) wrote :

I did not have this CPU stuck problem with 2.6.26-5, but I am seeing it with 2.6.27-1 and 2.6.27-2.

I have tried this kernel on other systems and haven't seen this problem (an older AMD 2200XP+ or on an 3ghz Intel P4-HT).

I'm attaching the dmesg output from 2.6.27-2 and 2.6.26-5 from my AMDx2-64 that is having this CPU stuck problem.

Marques Johansson (marques) wrote :
Chris Coulson (chrisccoulson) wrote :

Marques - Could you please report your issue in to a separate bug report, as it is a different problem to the one decribed here.

Thanks

Marques Johansson (marques) wrote :

I already tried posting at https://bugs.launchpad.net/ubuntu/+source/linux/+bug/254668 (ACPI hangs on boot). I think my problem is the one described here in this bug. "swapper" is getting hung for me as the original poster describes in his dmesg log.

I booted with noapictimer and did not get the CPU stuck errors / delays.

I'm attaching my dmesg log from the noapictimer boot-up. Let me know if you still think this is unrelated.

Stefan Bader (smb) wrote :

@Marques,

your problem might be related to a bug found in the hpet code. That should actually work with the kernel at http://people.ubuntu.com/~smb/bug254668/linux-image-2.6.27-3-generic_2.6.27-3.3smb2_i386.deb

There were also other problems in related places and I am currently building a kernel with all of them in. But maybe you have the time to check whether this single patch already helps in your case.

Marques Johansson (marques) wrote :
Download full text (3.9 KiB)

I'm running the amd64 arch, so that kernel won't work here.

I can boot with "noapictimer", and also with "nohz=off highres=off apci=debug debug". I'm attaching the dmesg from the latter boot options. (highres=off did not work by itself, not did nohz=off )

Here are the contents of some other files I've heard discussed in these threads (during the same boot as the dmesg, sorry I couldn't wait to capture these files on a CPU stuck boot):

displague@split:~$ cat /sys/devices/system/clocksource/clocksource0/*
hpet acpi_pm jiffies tsc
hpet

displague@split:~$ cat /proc/timer_list
Timer List Version: v0.3
HRTIMER_MAX_CLOCK_BASES: 2
now at 474426458770 nsecs

cpu: 0
 clock 0:
  .index: 0
  .resolution: 4000250 nsecs
  .get_time: ktime_get_real
  .offset: 0 nsecs
active timers:
 clock 1:
  .index: 1
  .resolution: 4000250 nsecs
  .get_time: ktime_get
  .offset: 0 nsecs
active timers:
 #0: <ffff88001943bd48>, hrtimer_wakeup, S:01, futex_wait, firefox/6746
 # expires at 474458683930 nsecs [in 32225160 nsecs]
 #1: <ffff88001943bd48>, it_real_fn, S:01, do_setitimer, syslogd/4925
 # expires at 487622761050 nsecs [in 13196302280 nsecs]
 #2: <ffff88001943bd48>, hrtimer_wakeup, S:01, futex_wait, firefox/6883
 # expires at 518020472610 nsecs [in 43594013840 nsecs]
 #3: <ffff88001943bd48>, hrtimer_wakeup, S:01, do_nanosleep, atd/5528
 # expires at 3653344476130 nsecs [in 3178918017360 nsecs]
  .expires_next : 9223372036854775807 nsecs
  .hres_active : 0
  .nr_events : 0
  .nohz_mode : 0
  .idle_tick : 0 nsecs
  .tick_stopped : 0
  .idle_jiffies : 0
  .idle_calls : 0
  .idle_sleeps : 0
  .idle_entrytime : 474421270370 nsecs
  .idle_waketime : 0 nsecs
  .idle_exittime : 0 nsecs
  .idle_sleeptime : 437221753573 nsecs
  .last_jiffies : 0
  .next_jiffies : 0
  .idle_expires : 0 nsecs
jiffies: 4295010841

cpu: 1
 clock 0:
  .index: 0
  .resolution: 4000250 nsecs
  .get_time: ktime_get_real
  .offset: 0 nsecs
active timers:
 clock 1:
  .index: 1
  .resolution: 4000250 nsecs
  .get_time: ktime_get
  .offset: 0 nsecs
active timers:
 #0: <ffff88001943bd48>, hrtimer_wakeup, S:01, futex_wait, gvfsd-trash/6912
 # expires at 474692183810 nsecs [in 265725040 nsecs]
 #1: <ffff88001943bd48>, hrtimer_wakeup, S:01, futex_wait, firefox/6874
 # expires at 492439331930 nsecs [in 18012873160 nsecs]
 #2: <ffff88001943bd48>, hrtimer_wakeup, S:01, do_nanosleep, cron/5558
 # expires at 519830787730 nsecs [in 45404328960 nsecs]
  .expires_next : 9223372036854775807 nsecs
  .hres_active : 0
  .nr_events : 0
  .nohz_mode : 0
  .idle_tick : 0 nsecs
  .tick_stopped : 0
  .idle_jiffies : 0
  .idle_calls : 0
  .idle_sleeps : 0
  .idle_entrytime : 474426547050 nsecs
  .idle_waketime : 0 nsecs
  .idle_exittime : 0 nsecs
  .idle_sleeptime : 395247480831 nsecs
  .last_jiffies : 0
  .next_jiffies : 0
  .idle_expires : 0 nsecs
jiffies: 4295010841

Tick Device: mode: 0
Clock Event Device: hpet
 max_delta_ns: 85899346200
 min_delta_ns: 1920
 mult: 107374182
 shift: 32
 mode: 2
 next_event: 0 nsecs
 set_next_event: hpet_legacy_next_even...

Read more...

Stefan Bader (smb) wrote :

I just put a amd64 bit version of the smb5 kernel online.

Marques Johansson (marques) wrote :

Your smb5 kernel build boots in a normal amount of time and seems to be functioning well. I'm attaching the dmesg output.

Stefan Bader (smb) wrote :

I am marking this bug as fix committed since Intrepid has been rebased to 2.6.27-rc6 which contains the fixes.

Changed in linux:
importance: Undecided → Medium
milestone: none → intrepid-alpha-6
status: Incomplete → Fix Committed
C de-Avillez (hggdh2) wrote :

I cannot experience this anymore with 2.6.27-3.

Setting this to "Fix Released" based on the decision to stick with the 2.6.27 kernel for Intrepid - https://lists.ubuntu.com/archives/kernel-team/2008-September/003107.html

Changed in linux:
status: Fix Committed → Fix Released

I appear to be encountering this bug on 2.6.27-7. Pretty frequently, certain applications hang, seemingly at random. I've been trying to keep my eye on them in the System Monitor, and whenever it happens, the waiting channel reports "futex_wait".

I have been searching a bit for a fix for this error, which began when I updated to Intrepid. So far, I've come up with little.

I also am encountering problems with hrtimer_nanosleep, which causes gksu to fail. Not sure if that's related.

To post a comment you must log in.