32-bit ubuntu 7.10 hangs on boot waiting for rcu completion

Bug #319476 reported by Eli Collins
Affects                       Status     Importance  Assigned to  Milestone
linux (Ubuntu)                Invalid    Undecided   Unassigned
linux-source-2.6.22 (Ubuntu)  Won't Fix  Undecided   Unassigned

Bug Description

Binary package hint: linux-image-2.6.22-14-server

I'm occasionally seeing hangs when booting 32-bit Ubuntu 7.10 in a VM. I've reproduced this with the 2.6.22-14-server and 2.6.22-16-server kernels (I haven't tried other 7.10 kernels), and I also see the hang with a 2.6.24-16 Hardy kernel in the same VM. A child of the init process is waiting for an RCU completion that never happens. Per the following thread, this has also been seen on native hardware.

http://lkml.org/lkml/2008/8/11/79

The thread mentions that "such freezes frequently occur due to the plain lack of timer interrupts." That seems odd, since these kernels are compiled with CONFIG_NO_HZ=y (i.e. no timer interrupts _should_ be delivered when idle). Perhaps some code, possibly in the module or networking paths, is missing a call to rcu_pending? I still see the hang with the NIC disabled.
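
To make the suspicion above concrete, here is a toy Python model of how a missing tick could stall a grace period. It is only a sketch under stated assumptions: `ToyRcu`, `timer_tick` and `simulate` are hypothetical names invented for illustration, and the model assumes that a CPU reports a quiescent state only from its timer tick. The real kernel's RCU and its handling of tickless idle CPUs are considerably more involved and are not modelled here.

```python
from typing import Callable, List, Sequence, Set


class ToyRcu:
    """Toy grace-period tracker. Hypothetical model, NOT the kernel's code."""

    def __init__(self, ncpus: int) -> None:
        self.ncpus = ncpus
        self.pending: Set[int] = set()  # CPUs that still owe a quiescent state
        self.callbacks: List[Callable[[], None]] = []

    def call_rcu(self, callback: Callable[[], None]) -> None:
        # Queue the callback and start a grace period that waits on every CPU.
        self.callbacks.append(callback)
        self.pending = set(range(self.ncpus))

    def timer_tick(self, cpu: int) -> None:
        # Assumption of this model: only a tick records a quiescent state.
        self.pending.discard(cpu)
        if not self.pending and self.callbacks:
            ready, self.callbacks = self.callbacks, []
            for callback in ready:
                callback()


def simulate(ticking_cpus: Sequence[int], ncpus: int = 2, rounds: int = 5) -> bool:
    """Return True if the queued callback ran within the simulated rounds."""
    rcu = ToyRcu(ncpus)
    completed: List[bool] = []
    rcu.call_rcu(lambda: completed.append(True))
    for _ in range(rounds):
        for cpu in ticking_cpus:
            rcu.timer_tick(cpu)
    return bool(completed)
```

In this model `simulate([0, 1])` completes, while `simulate([0])` never runs the callback because CPU 1 never ticks. That has the same shape as the hang described here, but it does not show where the kernel itself goes wrong.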

Here's a look at two instances of the bug from crash...

crash> bt
PID: 0 TASK: c03b7340 CPU: 0 COMMAND: "swapper"
 #0 [c03e7f38] schedule at c02f9dfe
 #1 [c03e7fb4] cpu_idle at c0102442
crash> ps
    PID   PPID  CPU  TASK      ST  %MEM    VSZ   RSS  COMM
>     0      0    0  c03b7340  RU   0.0      0     0  [swapper]
>     0      1    1  df84ea60  RU   0.0      0     0  [swapper]
      1      0    1  df84e000  IN   0.1   1748   584  init
      2      0    1  df84e530  IN   0.0      0     0  [kthreadd]
      3      2    0  df84ef90  IN   0.0      0     0  [migration/0]
      4      2    0  df84f4c0  IN   0.0      0     0  [ksoftirqd/0]
      5      2    0  df84f9f0  IN   0.0      0     0  [watchdog/0]
      6      2    1  df85e000  IN   0.0      0     0  [migration/1]
      7      2    1  df85e530  IN   0.0      0     0  [ksoftirqd/1]
      8      2    1  df85ea60  IN   0.0      0     0  [watchdog/1]
      9      2    0  df85ef90  IN   0.0      0     0  [events/0]
     10      2    1  df85f4c0  IN   0.0      0     0  [events/1]
     11      2    1  df85f9f0  IN   0.0      0     0  [khelper]
     31      2    0  df8e4000  IN   0.0      0     0  [kblockd/0]
     32      2    1  df8e4530  IN   0.0      0     0  [kblockd/1]
     33      2    0  df8e4a60  IN   0.0      0     0  [kacpid]
     34      2    1  df8e4f90  IN   0.0      0     0  [kacpi_notify]
    197      2    1  df8e54c0  IN   0.0      0     0  [kseriod]
    224      2    0  df8e59f0  IN   0.0      0     0  [pdflush]
    225      2    1  df930000  IN   0.0      0     0  [pdflush]
    226      2    0  df930530  IN   0.0      0     0  [kswapd0]
    278      2    0  df930a60  IN   0.0      0     0  [aio/0]
    279      2    1  df930f90  IN   0.0      0     0  [aio/1]
   2295      2    0  dff52530  IN   0.0      0     0  [ata/0]
   2296      2    1  dff52000  IN   0.0      0     0  [ata/1]
   2297      2    1  dfca94c0  IN   0.0      0     0  [ata_aux]
   2301      2    1  dfa68530  IN   0.0      0     0  [scsi_eh_0]
   2304      2    1  dfa68000  IN   0.0      0     0  [scsi_eh_1]
   2395      2    1  dff52f90  IN   0.0      0     0  [scsi_eh_2]
   2790      2    0  dff539f0  IN   0.0      0     0  [kjournald]
   2866      1    1  f7da8f90  IN   0.1   1756   560  rc
   2973      1    0  dfca8530  IN   0.1   2328   744  udevd
   4061      2    1  dfa699f0  IN   0.0      0     0  [kpsmoused]
   4334      2    1  dfb48530  IN   0.0      0     0  [kjournald]
   4419   2866    1  f79b6a60  IN   0.1   1756   552  S40networking
   4428   4419    1  f79c8000  IN   0.0   1692   488  ifup
   4439   4428    0  f79ea530  IN   0.0   1756   488  sh
   4440   4439    1  f79e6a60  UN   0.0      0     0  dhclient3
crash> bt 4440
PID: 4440 TASK: f79e6a60 CPU: 1 COMMAND: "dhclient3"
 #0 [f79f5e04] schedule at c02f9dfe
 #1 [f79f5e80] wait_for_completion at c02fa4ee
 #2 [f79f5ea4] synchronize_rcu at c0139875
 #3 [f79f5ec4] packet_release at f9b65f7c
 #4 [f79f5ef8] sock_release at c027d567
 #5 [f79f5f08] sock_close at c027d9f9
 #6 [f79f5f14] __fput at c0182fa7
 #7 [f79f5f34] filp_close at c01803d2
 #8 [f79f5f48] put_files_struct at c0129f3a
 #9 [f79f5f60] do_exit at c012b19c
#10 [f79f5fa4] do_group_exit at c012b871
#11 [f79f5fb4] sysenter_entry at c0104183
    EAX: 000000fc EBX: 00000001 ECX: 00000001 EDX: 080d7220
    DS: 007b ESI: b7f9d294 ES: 007b EDI: b7f9d294
    SS: 007b ESP: bfbfee0c EBP: bfbfee38
    CS: 0073 EIP: ffffe410 ERR: 000000fc EFLAGS: 00000246
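
In a process list like the one above, the task of interest is the one in the UN (uninterruptible) state. A minimal Python sketch for picking such tasks out of saved `ps` output follows. `uninterruptible_tasks` is a hypothetical helper, not part of crash, and it assumes the nine-column layout shown above (PID, PPID, CPU, TASK, ST, %MEM, VSZ, RSS, COMM).

```python
from typing import List, Tuple


def uninterruptible_tasks(ps_output: str) -> List[Tuple[int, str, str]]:
    """Return (pid, task_address, command) for every task whose ST column is UN."""
    tasks = []
    for line in ps_output.splitlines():
        # Active tasks are flagged with a leading ">"; drop it before splitting.
        fields = line.strip().lstrip(">").split()
        # Skip the header, blank lines, and anything without all nine columns.
        if len(fields) < 9 or not fields[0].isdigit():
            continue
        pid, _ppid, _cpu, task, state = fields[:5]
        if state == "UN":
            tasks.append((int(pid), task, " ".join(fields[8:])))
    return tasks


# A few rows copied from the first process list above.
sample = """\
   PID PPID CPU TASK ST %MEM VSZ RSS COMM
> 0 0 0 c03b7340 RU 0.0 0 0 [swapper]
   4439 4428 0 f79ea530 IN 0.0 1756 488 sh
   4440 4439 1 f79e6a60 UN 0.0 0 0 dhclient3
"""
print(uninterruptible_tasks(sample))  # [(4440, 'f79e6a60', 'dhclient3')]
```

Each PID it returns can then be passed to `bt`, as done above with `bt 4440`.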

crash> ps
    PID   PPID  CPU  TASK      ST  %MEM    VSZ   RSS  COMM
>     0      0    0  c03b7340  RU   0.0      0     0  [swapper]
>     0      1    1  df902a60  RU   0.0      0     0  [swapper]
>     0      1    2  df902f90  RU   0.0      0     0  [swapper]
>     0      1    3  df9034c0  RU   0.0      0     0  [swapper]
      1      0    0  df902000  IN   0.0   1120   356  init
      2      0    0  df902530  IN   0.0      0     0  [kthreadd]
      3      2    0  df9039f0  IN   0.0      0     0  [migration/0]
      4      2    0  df860000  IN   0.0      0     0  [ksoftirqd/0]
      5      2    0  df860530  IN   0.0      0     0  [watchdog/0]
      6      2    1  df860a60  IN   0.0      0     0  [migration/1]
      7      2    1  df860f90  IN   0.0      0     0  [ksoftirqd/1]
      8      2    1  df8614c0  IN   0.0      0     0  [watchdog/1]
      9      2    2  df8619f0  IN   0.0      0     0  [migration/2]
     10      2    2  df872000  IN   0.0      0     0  [ksoftirqd/2]
     11      2    2  df872530  IN   0.0      0     0  [watchdog/2]
     12      2    3  df872a60  IN   0.0      0     0  [migration/3]
     13      2    3  df872f90  IN   0.0      0     0  [ksoftirqd/3]
     14      2    3  df8734c0  IN   0.0      0     0  [watchdog/3]
     15      2    0  df8739f0  IN   0.0      0     0  [events/0]
     16      2    1  df884000  IN   0.0      0     0  [events/1]
     17      2    2  df884530  IN   0.0      0     0  [events/2]
     18      2    3  df884a60  IN   0.0      0     0  [events/3]
     19      2    2  df884f90  IN   0.0      0     0  [khelper]
     41      2    0  df8854c0  IN   0.0      0     0  [kblockd/0]
     42      2    1  df8859f0  IN   0.0      0     0  [kblockd/1]
     43      2    2  c2f0e000  IN   0.0      0     0  [kblockd/2]
     44      2    3  c2f0e530  IN   0.0      0     0  [kblockd/3]
     45      2    0  c2f0ea60  IN   0.0      0     0  [kacpid]
     46      2    3  c2f0ef90  IN   0.0      0     0  [kacpi_notify]
    106      2    0  c2f0f4c0  IN   0.0      0     0  [kseriod]
    141      2    0  c2f0f9f0  IN   0.0      0     0  [pdflush]
    142      2    0  df950000  IN   0.0      0     0  [pdflush]
    143      2    0  df950530  IN   0.0      0     0  [kswapd0]
    195      2    0  df950a60  IN   0.0      0     0  [aio/0]
    196      2    1  df950f90  IN   0.0      0     0  [aio/1]
    197      2    2  df9514c0  IN   0.0      0     0  [aio/2]
    198      2    3  df9519f0  IN   0.0      0     0  [aio/3]
   1075      1    2  dfedaa60  IN   0.0   1120   204  init
   1119   1075    0  dfce2f90  UN   0.0   1608   568  modprobe
crash> bt 1119
PID: 1119 TASK: dfce2f90 CPU: 0 COMMAND: "modprobe"
 #0 [df9b1e08] schedule at c02f9dfe
 #1 [df9b1e84] wait_for_completion at c02fa4ee
 #2 [df9b1ea8] synchronize_rcu at c0139875
 #3 [df9b1ec8] sys_init_module at c014a61e
 #4 [df9b1fb4] system_call at c01041fb
    EAX: 00000080 EBX: b7e51000 ECX: 00001ef8 EDX: 08051bf0
    DS: 007b ESI: 08051bf0 ES: 007b EDI: 08052ea0
    SS: 007b ESP: bf923b30 EBP: bf923bb8
    CS: 0073 EIP: b7f16b8e ERR: 00000080 EFLAGS: 00000246
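
Both backtraces have the same shape: synchronize_rcu sits in wait_for_completion, which suggests it queued some work and is waiting to be woken when that work runs. The Python sketch below mimics that pattern, with a `threading.Event` standing in for the completion. It is an illustration under assumptions, not kernel code: `synchronize_rcu_model`, the `call_rcu` argument as a plain Python callable, and the `timeout` parameter are all hypothetical. The wait shown in the traces has no timeout, which is why the tasks stay in UN.

```python
import threading
from typing import Callable


def synchronize_rcu_model(
    call_rcu: Callable[[Callable[[], None]], None], timeout: float
) -> bool:
    """Queue a wake-up callback, then block until it runs or `timeout` expires.

    Returns True if the callback ran, False if the wait timed out.
    """
    done = threading.Event()   # stands in for the completion
    call_rcu(done.set)         # ask for a wake-up once the grace period ends
    return done.wait(timeout)  # the traces above are stuck at this step


def callback_never_runs(callback: Callable[[], None]) -> None:
    """Model of the hang: the callback is queued but nothing ever invokes it."""


def callback_runs_later(callback: Callable[[], None]) -> None:
    """Model of the normal case: the callback fires shortly afterwards."""
    threading.Timer(0.01, callback).start()


print(synchronize_rcu_model(callback_runs_later, timeout=2.0))   # True
print(synchronize_rcu_model(callback_never_runs, timeout=0.05))  # False
```

The second call only returns because the model has a timeout; without one, the caller would wait indefinitely, as dhclient3 and modprobe do in the two instances above.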

Tags: kj-expired
Eli Collins (elicollins) wrote :
Sergio Zanchetta (primes2h) wrote :

The 18-month support period for Gutsy Gibbon 7.10 has reached its end of life (http://www.ubuntu.com/news/ubuntu-7.10-eol). As a result, we are closing the linux-source-2.6.22 kernel task. It would be helpful if you could test the new Jaunty Jackalope 9.04 release (http://www.ubuntu.com/getubuntu/releasenotes/904overview) and confirm whether this issue remains. If the issue still exists with the Jaunty release, please update this report by changing the Status of the "linux (Ubuntu)" task from "Incomplete" to "New". Also, please be sure to run the command below, which will automatically gather and attach updated debug information to this report. Thanks in advance.

apport-collect -p linux-image-2.6.28-11-generic 319476

Changed in linux-source-2.6.22 (Ubuntu):
status: New → Won't Fix
Changed in linux (Ubuntu):
status: New → Incomplete
Jeremy Foshee (jeremyfoshee) wrote :

This bug report was marked as Incomplete and has not had any updated comments for quite some time. As a result, this bug is being closed. Please reopen it if this is still an issue in the current Ubuntu release (http://www.ubuntu.com/getubuntu/download). Also, please be sure to provide any requested information that may have been missing. To reopen the bug, click on the current status under the Status column and change the status back to "New". Thanks.

[This is an automated message. Apologies if it has reached you inappropriately; please just reply to this message indicating so.]

tags: added: kj-expired
Changed in linux (Ubuntu):
status: Incomplete → Invalid