Kernel race condition if nfs mounts present on real or virtual nodes [kernel BUG at lib/radix-tree.c]

Bug #58170 reported by Herbert Straub
Affects: linux-source-2.6.15 (Ubuntu)
Importance: Medium
Assigned to: Unassigned

Bug Description

Binary package hint: linux-image-2.6-686

Situation: all VMware virtual machines are stored on an NFS-mounted filesystem. One VMware machine becomes unusable and consumes all the CPU time. A kill -KILL pid does not terminate the vmware process. An ls -lh on the NFS mount hangs.

I'm using Dapper Drake, details:
2.6.15-26-686 #1 SMP PREEMPT Thu Aug 3 03:13:28 UTC 2006 i686 GNU/Linux
dpkg -l nfs-common
Desired=Unknown/Install/Remove/Purge/Hold
| Status=Not/Installed/Config-files/Unpacked/Failed-config/Half-installed
|/ Err?=(none)/Hold/Reinst-required/X=both-problems (Status,Err: uppercase=bad)
||/ Name Version Description
+++-==============-==============-============================================
ii nfs-common 1.0.7-3ubuntu2 NFS support files common to client and serve

I found the following sequence in /var/log/kern.log:

kernel: [17758077.728000] ------------[ cut here ]------------
kernel: [17758077.728000] kernel BUG at lib/radix-tree.c:372!
kernel: [17758077.728000] invalid operand: 0000 [#1]
kernel: [17758077.728000] PREEMPT SMP
kernel: [17758077.728000] Modules linked in: vmnet parport_pc vmmon nfs lockd sunrpc ipv6 md_mod dm_mod lp parport ide_disk tsdev serio_raw psmouse i2c_piix4 i2c_core floppy cfi_probe gen_probe pcspkr scb2_flash mtdcore chipreg tg3 sworks_agp agpgart map_funcs evdev ext3 jbd ide_generic ohci_hcd usbcore ide_cd cdrom serverworks generic cciss scsi_mod thermal processor fan capability commoncap vga16fb vgastate fbcon tileblit font bitblit softcursor
kernel: [17758077.728000] CPU: 0
kernel: [17758077.728000] EIP: 0060:[radix_tree_tag_set+147/160] Tainted: P VLI
kernel: [17758077.728000] EFLAGS: 00013046 (2.6.15-26-686)
kernel: [17758077.728000] EIP is at radix_tree_tag_set+0x93/0xa0
kernel: [17758077.728000] eax: 00000000 ebx: 00000000 ecx: f5ba81a0 edx: 00000006
kernel: [17758077.728000] esi: 00000000 edi: 00000004 ebp: 00000003 esp: ee27fd10
kernel: [17758077.728000] ds: 007b es: 007b ss: 0068
kernel: [17758077.728000] Process vmware-vmx (pid: 25602, threadinfo=ee27e000 task=ed392a90)
kernel: [17758077.728000] Stack: 00000008 c2a5da84 ef6ac730 00000000 ef6ac740 c015766c ef6ac734 00005b86
kernel: [17758077.728000] 00000001 00003202 e2e90c54 e2e90c60 ef5e84c0 ee27fd90 f8cf835d c2a5da84
kernel: [17758077.728000] 00000050 f8cf726d ef6ac630 f78e6410 00000000 e2e90b00 00000000 ef5e84c0
kernel: [17758077.728000] Call Trace:
kernel: [17758077.728000] [test_set_page_writeback+284/320] test_set_page_writeback+0x11c/0x140
kernel: [17758077.728000] [pg0+948556637/1069167616] nfs_flush_one+0xdd/0x190 [nfs]
kernel: [17758077.728000] [pg0+948552301/1069167616] nfs_find_request+0x3d/0x50 [nfs]
kernel: [17758077.728000] [pg0+948556909/1069167616] nfs_flush_list+0x5d/0xc0 [nfs]
kernel: [17758077.728000] [pg0+948559695/1069167616] nfs_flush_inode+0x8f/0xd0 [nfs]
kernel: [17758077.728000] [pg0+948551700/1069167616] nfs_writepages+0xb4/0x140 [nfs]
kernel: [17758077.728000] [__filemap_fdatawrite_range+107/128] __filemap_fdatawrite_range+0x6b/0x80
kernel: [17758077.728000] [filemap_fdatawrite+48/64] filemap_fdatawrite+0x30/0x40
kernel: [17758077.728000] [pg0+948518038/1069167616] nfs_sync_mapping+0x46/0x90 [nfs]
kernel: [17758077.728000] [pg0+948522552/1069167616] nfs_revalidate_mapping+0xa8/0xe0 [nfs]
kernel: [17758077.728000] [__up+28/32] __up+0x1c/0x20
kernel: [17758077.728000] [pg0+948511850/1069167616] nfs_file_write+0x9a/0x130 [nfs]
kernel: [17758077.728000] [do_sync_write+201/304] do_sync_write+0xc9/0x130
kernel: [17758077.728000] [update_atime+138/160] update_atime+0x8a/0xa0
kernel: [17758077.728000] [autoremove_wake_function+0/96] autoremove_wake_function+0x0/0x60
kernel: [17758077.728000] [vfs_write+214/432] vfs_write+0xd6/0x1b0
kernel: [17758077.728000] [sys_pwrite64+128/144] sys_pwrite64+0x80/0x90
kernel: [17758077.728000] [sysenter_past_esp+84/117] sysenter_past_esp+0x54/0x75
kernel: [17758077.728000] Code: 24 d3 ea 83 e2 3f 8d 8c 03 04 01 00 00 0f a3 11 19 c0 85 c0 75 03 0f ab 11 8b 5c 93 04 85 db 74 0a 45 83 ee 06 39 ef 75 cf eb a3 <0f> 0b 74 01 d4 a7 32 c0 eb ec 8d 76 00 55 57 56 53 83 ec 40 8b
kernel: [17758077.728000] <4>rtc: lost some interrupts at 2048Hz.
kernel: [17758078.152000] note: vmware-vmx[25602] exited with preempt_count 1

A Google search for "kernel BUG at lib/radix-tree.c:372" shows a problem with an NFS file truncation race condition in combination with 2.6.15.
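To check whether other machines have logged the same oops, a log scan along the following lines may help. This is a minimal sketch, not part of the original report: the function name find_kernel_bugs and the regular expression are assumptions based only on the "kernel BUG at <file>:<line>!" line shown in the log above.

```python
import re

# Matches lines such as "kernel BUG at lib/radix-tree.c:372!".
# The pattern is an assumption derived from the log excerpt in this report.
BUG_RE = re.compile(r"kernel BUG at (?P<file>[\w./-]+):(?P<line>\d+)!")


def find_kernel_bugs(lines):
    """Return a list of (source_file, line_number) for each BUG line found."""
    hits = []
    for line in lines:
        match = BUG_RE.search(line)
        if match:
            hits.append((match.group("file"), int(match.group("line"))))
    return hits


# Example use against the log file named in this report:
#   with open("/var/log/kern.log", errors="replace") as log:
#       print(find_kernel_bugs(log))
```

Reading the log usually needs root or membership in the adm group; the sketch only reports where a BUG was hit and does not interpret the call trace.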

Herbert Straub (herbert) wrote :

I installed Edgy (beta) with kernel 2.6.17 and the error is gone. With Dapper the error can be reproduced every time (just generate file I/O in the virtual host, for example by copying a folder to another location). With 2.6.17 I cannot reproduce the error. I see many changes in the NFS part of the kernel source from 2.6.15 to 2.6.17.
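The "copy a folder to another location" load described above could be generated with something like the sketch below. It is only an approximation of that workload: the helper name copy_repeatedly, the copy-N directory names and the number of rounds are all hypothetical, and whether it triggers the BUG depends on running it inside a guest whose disk lives on the affected NFS mount.

```python
import shutil
from pathlib import Path


def copy_repeatedly(src, dst_root, rounds=3):
    """Copy the directory tree `src` into dst_root/copy-0, copy-1, ...

    Returns the list of destination directories that were created.
    """
    src = Path(src)
    dst_root = Path(dst_root)
    dst_root.mkdir(parents=True, exist_ok=True)
    copies = []
    for n in range(rounds):
        dst = dst_root / f"copy-{n}"
        if dst.exists():
            # Remove a stale copy from an earlier run so copytree can recreate it.
            shutil.rmtree(dst)
        shutil.copytree(src, dst)
        copies.append(dst)
    return copies
```

Run it on a test system only, since on an unpatched 2.6.15 kernel the expected outcome is a hung process.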

Takashi Takekawa (takekawa) wrote : Re: kernel BUG at lib/radix-tree.c:372!

I've also encountered this bug on a native machine with kernel 2.6.15-28-amd64-server.
I found web pages that contain the solution to this problem.

http://lkml.org/lkml/2006/3/1/381
http://lkml.org/lkml/2006/3/2/4

Takashi Takekawa (takekawa) wrote : nfsv4 client bug on linux-source-2.6.15

NFSv4 client machines with linux-image-2.6.15 sometimes hang.
This has been fixed in later versions of Linux,
but linux-source in Ubuntu 6.06 has not yet been fixed.

I attach a patch file for this problem.

The following page describes this patch:
http://lkml.org/lkml/2006/3/2/4

Gareth Fitzworthington (mapping-gp-deactivatedaccount) wrote : Re: kernel BUG at lib/radix-tree.c:372! on a vmware node

This bug has had no activity for a considerable period. This is a check to see if there is still interest in investigating this bug report.
I suspect this will be assigned "Won't Fix", but I'll leave it for the kernel Team to make that assessment.

Changed in linux-source-2.6.15:
status: New → Incomplete
Herbert Straub (herbert) wrote :

I have no further interest in investigating this. I upgraded to a newer kernel version.

Takashi Takekawa (takekawa) wrote : nfsv4

I'm still using Ubuntu 6.06, so I hope it will be fixed.
The kernel always freezes when it mounts NFSv4.
Since I'm using a patched kernel and it works well, please commit the patch I attached earlier.

Changed in linux-source-2.6.15:
status: Incomplete → Confirmed
Gareth Fitzworthington (mapping-gp-deactivatedaccount) wrote : Re: kernel BUG at lib/radix-tree.c:372! on a vmware node

Takashi,
Just to clarify: this bug always occurs when mounting NFSv4 on release 6.06 (kernel 2.6.15) AND is not necessarily related to whether the node is real or virtual (VMware).
Thanks.

Gareth Fitzworthington (mapping-gp-deactivatedaccount) wrote :

Sorry Takashi,
I re-read my last comment and it's not very clear.
I'm asking for your view on this statement.
Thanks.

Takashi Takekawa (takekawa) wrote : Re: kernel BUG at lib/radix-tree.c:372!

Gareth.

I think your statement is clear enough.
Machines on 6.06 (kernel 2.6.15) hang some time after mounting an NFSv4 file system.
This bug is not related to whether the node is real or virtual.

Thanks.

Gareth Fitzworthington (mapping-gp-deactivatedaccount) wrote : Re: kernel BUG at lib/radix-tree.c:372! on a vmware node

Takashi,
Have you ever seen this occur with NFSv3, or is it strictly related to NFSv4?
The links you submitted above seem to indicate that this issue occurs with any nfs mount (regardless of version).

Thanks.

Takashi Takekawa (takekawa) wrote :

Sorry, I don't remember whether this bug occurs with NFSv3.
It may be independent of the NFS version.

Thanks.

Gareth Fitzworthington (mapping-gp-deactivatedaccount) wrote :

Note to Kernel Team:
Takashi points to lkml.org patch here:
http://lkml.org/lkml/2006/3/2/4
I can find no reference to any similar bug listed with kernel.org or Debian (Open or closed).
I did find the following:
http://www.uwsg.iu.edu/hypermail/linux/kernel/0603.0/0378.html
which has some debug information on this and a summary.

This bug appears to have affected very few systems.
It also appears to have been solved by 2.6.17.

Changed in linux-source-2.6.15:
assignee: nobody → ubuntu-kernel-team
Mark Schouten (mark-prevented) wrote :

I have a machine that is affected by this bug. How can Dapper users (note that Dapper is supported for several more years on the server) work around this bug without switching to a hand-built kernel? Our server's setup is quite standard, so I would expect this to be seen by more people than just me and Takashi.

Can this be fixed, somehow?

Changed in linux-source-2.6.15:
assignee: ubuntu-kernel-team → colin-king
importance: Undecided → Medium
Colin Ian King (colin-king) wrote :

Hi. I've built a few versions that contain the appropriate patch. Please let me know if this fixes the issue, and if so, I can submit the patch as an SRU.

Thanks

Colin

Colin Ian King (colin-king) wrote :

By the way, if any other build is required, let me know and I will upload it.

Takashi Takekawa (takekawa) wrote : fixed linux-image

linux-image-2.6.15-52-amd64-server_2.6.15-52.67cking1_amd64.deb
works well.

Colin Ian King (colin-king) wrote :

Takashi Takekawa: Thanks for the verification.

Colin Ian King (colin-king) wrote :

SRU justification:

Impact: Kernel BUG in lib/radix-tree.c:372.

There is no serialisation between NFS asynchronous writebacks
and truncation at the page level due to the fact that nfs_sync_inode()
cannot lock the pages that it is about to write out.

This means that it is possible to be flushing out data (and calling something
like set_page_writeback()) while the page cache is busy evicting the page.

Fix: Backport of upstream cherry pick cd52ed35535ef443f08bf5cd3331d350272885b8

Testcases:

1. The Dapper kernel always freezes when it mounts NFSv4 filesystems
and
2. VMware virtual machines are on an NFS-mounted filesystem. One VMware
machine becomes unusable and consumes all the CPU time. A kill -KILL pid
doesn't terminate the vmware process. An ls -lh on the NFS mount hangs.
and
3. Generate file I/O in a virtual host, for example by copying a folder to
another location, and the kernel BUG in lib/radix-tree.c occurs.
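
The race described under Impact can be illustrated with a toy user-space model. This is a sketch under stated assumptions, not kernel code and not the upstream patch cd52ed35535ef443f08bf5cd3331d350272885b8: ToyPageCache, flush_unserialised and flush_serialised are hypothetical names, a dict stands in for the radix tree, and the interleaving is forced through a callback rather than left to real concurrency.

```python
import threading


class ToyPageCache:
    """Toy stand-in for one file's page cache (a dict instead of a radix tree)."""

    def __init__(self, npages):
        self.pages = {i: {"dirty": True, "writeback": False} for i in range(npages)}
        self.lock = threading.Lock()

    def tag_writeback(self, index):
        # Tagging a page that has already been evicted is the situation the
        # kernel BUG guards against; the toy model raises instead.
        if index not in self.pages:
            raise RuntimeError(f"BUG: page {index} is no longer in the cache")
        self.pages[index]["writeback"] = True

    def truncate(self, new_npages):
        # Evict every page at or beyond the new end of file.
        with self.lock:
            for index in [i for i in self.pages if i >= new_npages]:
                del self.pages[index]


def flush_unserialised(cache, between=None):
    """Collect dirty pages, then tag them with no check against eviction."""
    batch = sorted(i for i, p in cache.pages.items() if p["dirty"])
    if between:
        between()  # a truncation sneaks in between collecting and tagging
    for index in batch:
        cache.tag_writeback(index)
    return batch


def flush_serialised(cache, between=None):
    """Same flush, but re-check each page under the lock shared with truncate."""
    batch = sorted(i for i, p in cache.pages.items() if p["dirty"])
    if between:
        between()
    tagged = []
    with cache.lock:
        for index in batch:
            if index in cache.pages:  # skip pages evicted in the meantime
                cache.tag_writeback(index)
                tagged.append(index)
    return tagged
```

Calling flush_unserialised(cache, between=lambda: cache.truncate(2)) on a four-page cache raises, while flush_serialised with the same arguments tags only the two surviving pages. The real fix works at the kernel page level; the lock here only shows why some form of serialisation between writeback and truncation is needed.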

Changed in linux-source-2.6.15:
status: Confirmed → Fix Committed
Przemek K. (azrael)
Changed in linux-source-2.6.15 (Ubuntu):
status: Fix Committed → Fix Released
Martin Pitt (pitti)
Changed in linux-source-2.6.15 (Ubuntu):
status: Fix Released → Fix Committed
Changed in linux-source-2.6.15 (Ubuntu):
status: Fix Committed → Fix Released
assignee: Colin King (colin-king) → nobody