nfs4 causes enormous load in ubuntu-server making it unusable

Bug #1006446 reported by perpetualrabbit
This bug affects 24 people
Affects              Status      Importance   Assigned to   Milestone
linux (Ubuntu)       Triaged     High         Unassigned
nfs-utils (Ubuntu)   Confirmed   High         Unassigned

Bug Description

Problem:
--------------------------------------------------------------
I just had to remove Ubuntu Server 12.04 and install Red Hat Enterprise Linux 6 instead. The intermittent slowness was
completely unacceptable for the users, whose workstations have /home mounted with nfs4 from this server.
The mail server also accesses /home, because the /home/$USER/Maildir directories are there.
Using nfs4, the kernel nfs threads caused enormous load.

The users had frozen desktops (greyed out windows) and mail slowed or arrived days later as a result.

With RHEL6, all nfs4 problems are completely gone. I used the exact same /etc/exports file, the same
settings and mount options on the workstations, and the same number of nfs threads.
Both the Red Hat and Ubuntu systems are KVM virtual guests on a Red Hat 6 virtual host (one of 3, actually).
The storage backend is a very fast EqualLogic array, which exports iscsi targets to the virtual hosts.

I am sorry, but I have to conclude that the current nfs4 implementation of Ubuntu Server 12.04 is NOT fit for use.
A complete university department suffered for weeks while I tried to solve the problems with Ubuntu, but
in the end it was decided to install Red Hat instead, re-using the same iscsi targets for system, home and data.
A missed chance for Ubuntu...

Therefore I urge Canonical's people to classify this bug as critical.

Also I think quality assurance should have caught this bug before shipping.

Analysis
------------------------------------------------------------
LOAD:
The nfs threads cause the kernel to spend enormous amounts of 'sy' time, as measured in top.
I will attach a sample of top's output from a particularly _quiet_ time on the network. Load is 7.82.
At busier moments the load went through the roof, beyond 50 and further. This consumes
actual CPU cycles.
Each thread consumes up to 30% of a cpu core. I enabled 128 threads.
rx and tx block sizes are 32768 on the clients. Both server and clients used async, both on Red Hat and Ubuntu.
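(For reference, assuming the stock Ubuntu nfs-kernel-server init script is in use, a thread count like the 128 mentioned above is normally configured along these lines:

  # /etc/default/nfs-kernel-server -- persistent setting, read when the service is (re)started
  RPCNFSDCOUNT=128

  # or change the number of running kernel nfsd threads on the fly
  sudo rpc.nfsd 128
  # verify: the "th" line in the nfsd stats shows the current thread count
  grep ^th /proc/net/rpc/nfsd
)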

SYSTEM vs IO-WAIT:
The replacement Red Hat system can certainly be overloaded too, but it does not consume CPU cycles doing so. Top does report a high load, but the time is spent in the 'wa' state. This indicates it is simply waiting for its backend iscsi devices to complete writes. I tested this by letting all workstations simultaneously write multi-gigabyte files with dd to /home.
On Ubuntu, the nfs threads spend their time in 'sy', doing who-knows-what.

LOGS:
Nothing at all appears in the logs. But when I set bitwise debug options in the /proc/sys/sunrpc/*debug files,
lots of log entries appear. Those look like normal NFS protocol messages to me, though.
I also tried to see what was happening with wireshark, but the traffic looks like normal nfs4 traffic to me.
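(To reproduce the debug logging mentioned above: the sunrpc debug bitmasks can be toggled with rpcdebug, which ships with nfs-common, rather than writing the /proc files by hand. A rough sketch:

  sudo rpcdebug -m nfsd -s all     # set all nfsd debug flags; messages go to the kernel log
  sudo rpcdebug -m rpc -s all      # set all generic sunrpc debug flags
  cat /proc/sys/sunrpc/nfsd_debug  # the bitmask rpcdebug just wrote
  sudo rpcdebug -m nfsd -c all     # clear the flags again when done
  sudo rpcdebug -m rpc -c all
)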

SLOWNESS:
That is the thing. The Ubuntu nfs server is actually NOT slow at all. During my dd tests it wrote half a gigabyte per second to its iscsi backends. Its _throughput_ is better than that of the Red Hat server.
As far as I can tell, it falls down because it makes client-side processes that want to do IO wait on other writes. A simple 'ls' has to wait until a write has been completed, even though both server and client used async nfs. People's Firefoxes freeze all the time because Firefox needs to read and write a lot to its cache and other files in the .mozilla directory.
The dovecot imap server almost grinds to a halt trying to write all those little files in people's /home/$USER/Maildir's.
The problems go on and on. Basically, a complete network of workstations is almost unusable because of this.
Upfront tests were done of course, but they showed only the excellent throughput, not the appalling `waiting´ behaviour.
With Red Hat 6 there is no such problem.
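(A minimal way to reproduce the `waiting´ behaviour described above, from two shells on the same client; the path is only an example:

  # shell 1: keep a large sequential write going to the nfs4 mount
  dd if=/dev/zero of=/home/$USER/ddtest.bin bs=1M count=4096 conv=fdatasync

  # shell 2: while the dd runs, time a trivial metadata operation on the same mount
  time ls -l /home/$USER > /dev/null
  # on the affected Ubuntu server this ls blocks until the write completes;
  # on the RHEL6 server it returns after a few seconds at most
)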

SITUATION:
People use their own and each other's Linux workstations for science, doing large calculations and writing a lot of big and small files to nfs. The nfs server serves /export/home, and also raw data storage from /export/data, over nfs4. The clients mount those under /home and /data/misc respectively. There is also a read-only software mount for certain scientific packages.

CONFIG:
client fstab lines:
#### nfs entries ###
sw.lorentz.leidenuniv.nl:/sw /sw nfs4 hard,intr,ro,tcp,rsize=32768,wsize=32768,bg,acl,async
home.lorentz.leidenuniv.nl:/home /home nfs4 hard,intr,rw,tcp,rsize=32768,wsize=32768,bg,acl,async

server exports file:
/export 132.229.227.0/24(ro,sync,insecure,root_squash,no_subtree_check,nohide,fsid=0)\
              132.229.216.128/26(ro,sync,insecure,no_root_squash,no_subtree_check,nohide,fsid=0)\
                                132.229.226.3(ro,sync,insecure,no_root_squash,no_subtree_check,nohide,fsid=0)\
                                132.229.226.4(ro,sync,insecure,no_root_squash,no_subtree_check,nohide,fsid=0)
/export/home 132.229.227.0/24(rw,async,insecure,root_squash,no_subtree_check,nohide)\
                                132.229.216.128/26(rw,async,insecure,no_root_squash,no_subtree_check,nohide)\
                                132.229.226.3(rw,async,insecure,no_root_squash,no_subtree_check,nohide)\
                                132.229.226.4(rw,async,insecure,no_root_squash,no_subtree_check,nohide)\
                                132.229.214.41(rw,async,insecure,no_root_squash,no_subtree_check,nohide)
/export/data 132.229.227.0/24(rw,async,insecure,root_squash,no_subtree_check,nohide)\
                                132.229.216.128/26(rw,async,insecure,no_root_squash,no_subtree_check,nohide)\
                                132.229.226.3(rw,async,insecure,no_root_squash,no_subtree_check,nohide)\
                                132.229.226.4(rw,async,insecure,no_root_squash,no_subtree_check,nohide)
/export/sw 132.229.227.0/24(rw,async,insecure,root_squash,no_subtree_check,nohide)\
                                132.229.216.128/26(rw,async,insecure,no_root_squash,no_subtree_check,nohide)\
                                132.229.226.3(rw,async,insecure,no_root_squash,no_subtree_check,nohide)\
                                132.229.226.4(rw,async,insecure,no_root_squash,no_subtree_check,nohide)

root@gaia:~# lsb_release -rd
Description: Ubuntu 12.04 LTS
Release: 12.04

root@gaia2:~# dpkg -l | grep -E 'nfs|linux-image'
ii libnfsidmap2 0.25-1ubuntu2 NFS idmapping library
ii linux-image-3.2.0-18-generic 3.2.0-18.29 Linux kernel image for version 3.2.0 on 64 bit x86 SMP
ii linux-image-3.2.0-19-generic 3.2.0-19.31 Linux kernel image for version 3.2.0 on 64 bit x86 SMP
ii linux-image-3.2.0-20-generic 3.2.0-20.33 Linux kernel image for version 3.2.0 on 64 bit x86 SMP
ii linux-image-3.2.0-21-generic 3.2.0-21.34 Linux kernel image for version 3.2.0 on 64 bit x86 SMP
ii linux-image-3.2.0-23-generic 3.2.0-23.36 Linux kernel image for version 3.2.0 on 64 bit x86 SMP
ii linux-image-3.2.0-24-generic 3.2.0-24.39 Linux kernel image for version 3.2.0 on 64 bit x86 SMP
ii linux-image-server 3.2.0.24.26 Linux kernel image on Server Equipment.
ii nfs-common 1:1.2.5-3ubuntu3 NFS support files common to client and server
ii nfs-kernel-server 1:1.2.5-3ubuntu3 support for NFS kernel server
ii nfswatch 4.99.11-1 Program to monitor NFS traffic for the console

WHAT I EXPECTED TO HAPPEN
----------------------------------
A fast and responsive nfs service.

WHAT HAPPENED INSTEAD
-----------------------------
I got fast throughput, but also a service that was intermittently totally unresponsive.
---
AlsaVersion: Advanced Linux Sound Architecture Driver Version 1.0.24.
AplayDevices: Error: [Errno 2] No such file or directory
ApportVersion: 2.0.1-0ubuntu7
Architecture: amd64
ArecordDevices: Error: [Errno 2] No such file or directory
AudioDevicesInUse: Error: command ['fuser', '-v', '/dev/snd/by-path', '/dev/snd/controlC0', '/dev/snd/hwC0D0', '/dev/snd/pcmC0D0c', '/dev/snd/pcmC0D0p', '/dev/snd/timer'] failed with exit code 1:
CRDA: Error: [Errno 2] No such file or directory
Card0.Amixer.info: Error: [Errno 2] No such file or directory
Card0.Amixer.values: Error: [Errno 2] No such file or directory
CurrentDmesg: [ 17.280028] eth0: no IPv6 routers present
DistroRelease: Ubuntu 12.04
HibernationDevice: RESUME=UUID=aafa9be4-19ac-4d74-a853-a5532ddedf5d
InstallationMedia: Ubuntu-Server 12.04 LTS "Precise Pangolin" - Alpha amd64 (20120313)
IwConfig:
 lo no wireless extensions.

 eth0 no wireless extensions.
Lsusb: Bus 001 Device 001: ID 1d6b:0001 Linux Foundation 1.1 root hub
MachineType: Red Hat KVM
NonfreeKernelModules: nfsd nfs lockd fscache auth_rpcgss nfs_acl sunrpc ext2 snd_hda_intel snd_hda_codec snd_hwdep snd_pcm snd_timer snd psmouse serio_raw lp parport i2c_piix4 soundcore snd_page_alloc virtio_balloon mac_hid floppy
Package: linux (not installed)
ProcEnviron:
 LANGUAGE=en_US:en
 TERM=xterm
 PATH=(custom, no user)
 LANG=en_US.UTF-8
 SHELL=/bin/bash
ProcFB: 0 EFI VGA
ProcKernelCmdLine: BOOT_IMAGE=/vmlinuz-3.2.0-24-generic root=/dev/mapper/gaia-root ro
ProcVersionSignature: Ubuntu 3.2.0-24.39-generic 3.2.16
RelatedPackageVersions:
 linux-restricted-modules-3.2.0-24-generic N/A
 linux-backports-modules-3.2.0-24-generic N/A
 linux-firmware 1.79
RfKill: Error: [Errno 2] No such file or directory
Tags: precise
Uname: Linux 3.2.0-24-generic x86_64
UpgradeStatus: No upgrade log present (probably fresh install)
UserGroups:

dmi.bios.date: 01/01/2007
dmi.bios.vendor: Seabios
dmi.bios.version: 0.5.1
dmi.chassis.type: 1
dmi.chassis.vendor: Red Hat
dmi.modalias: dmi:bvnSeabios:bvr0.5.1:bd01/01/2007:svnRedHat:pnKVM:pvrRHEL6.2.0PC:cvnRedHat:ct1:cvr:
dmi.product.name: KVM
dmi.product.version: RHEL 6.2.0 PC
dmi.sys.vendor: Red Hat

Brad Figg (brad-figg)
affects: linux-meta (Ubuntu) → linux (Ubuntu)
Revision history for this message
perpetualrabbit (perpetualrabbit) wrote :

I forgot the `top´ example

roland@asselijn:~$ cat top_example
top - 16:08:57 up 13 days, 5:13, 3 users, load average: 7.82, 5.95, 5.28
Tasks: 290 total, 5 running, 285 sleeping, 0 stopped, 0 zombie
Cpu0 : 0.0%us, 30.2%sy, 0.0%ni, 69.8%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu1 : 0.0%us, 29.5%sy, 0.0%ni, 70.4%id, 0.0%wa, 0.0%hi, 0.1%si, 0.0%st
Cpu2 : 0.1%us, 29.0%sy, 0.0%ni, 70.9%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu3 : 0.0%us, 28.4%sy, 0.0%ni, 71.5%id, 0.0%wa, 0.0%hi, 0.1%si, 0.0%st
Cpu4 : 0.0%us, 33.2%sy, 0.0%ni, 6.0%id, 0.0%wa, 0.0%hi, 60.9%si, 0.0%st
Cpu5 : 0.0%us, 27.5%sy, 0.0%ni, 72.5%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu6 : 0.0%us, 29.9%sy, 0.0%ni, 70.0%id, 0.1%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu7 : 0.0%us, 28.6%sy, 0.0%ni, 71.3%id, 0.0%wa, 0.0%hi, 0.1%si, 0.0%st
Mem: 4049220k total, 3575608k used, 473612k free, 633452k buffers
Swap: 3997692k total, 38652k used, 3959040k free, 1847884k cached

  PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
31641 root 20 0 0 0 0 S 11.4 0.0 6:41.78 [nfsd]
31697 root 20 0 0 0 0 S 11.3 0.0 11:54.63 [nfsd]
31727 root 20 0 0 0 0 S 11.0 0.0 9:41.43 [nfsd]
31626 root 20 0 0 0 0 S 10.9 0.0 7:36.62 [nfsd]
31695 root 20 0 0 0 0 S 10.4 0.0 7:22.91 [nfsd]
31729 root 20 0 0 0 0 S 9.8 0.0 7:10.08 [nfsd]
31652 root 20 0 0 0 0 S 9.7 0.0 7:07.46 [nfsd]
31613 root 20 0 0 0 0 S 9.5 0.0 10:56.82 [nfsd]
31629 root 20 0 0 0 0 S 9.3 0.0 8:47.71 [nfsd]
31617 root 20 0 0 0 0 S 9.2 0.0 13:58.92 [nfsd] ...

Revision history for this message
Brad Figg (brad-figg) wrote : Missing required logs.

This bug is missing log files that will aid in diagnosing the problem. From a terminal window please run:

apport-collect 1006446

and then change the status of the bug to 'Confirmed'.

If, due to the nature of the issue you have encountered, you are unable to run this command, please add a comment stating that fact and change the bug status to 'Confirmed'.

This change has been made by an automated script, maintained by the Ubuntu Kernel Team.

Changed in linux (Ubuntu):
status: New → Incomplete
tags: added: precise
Revision history for this message
perpetualrabbit (perpetualrabbit) wrote :

As this is a server system, there is no Firefox or X window system on it.
apport-collect starts a text-mode browser, and in that I could authorize the collection of logs, it seems.

Changed in linux (Ubuntu):
status: Incomplete → Confirmed
Revision history for this message
perpetualrabbit (perpetualrabbit) wrote : AcpiTables.txt

apport information

tags: added: apport-collected
description: updated
Revision history for this message
perpetualrabbit (perpetualrabbit) wrote : AlsaDevices.txt

apport information

Revision history for this message
perpetualrabbit (perpetualrabbit) wrote : BootDmesg.txt

apport information

Revision history for this message
perpetualrabbit (perpetualrabbit) wrote : Card0.Codecs.codec.0.txt

apport information

Revision history for this message
perpetualrabbit (perpetualrabbit) wrote : Lspci.txt

apport information

Revision history for this message
perpetualrabbit (perpetualrabbit) wrote : PciMultimedia.txt

apport information

Revision history for this message
perpetualrabbit (perpetualrabbit) wrote : ProcCpuinfo.txt

apport information

Revision history for this message
perpetualrabbit (perpetualrabbit) wrote : ProcInterrupts.txt

apport information

Revision history for this message
perpetualrabbit (perpetualrabbit) wrote : ProcModules.txt

apport information

Revision history for this message
perpetualrabbit (perpetualrabbit) wrote : UdevDb.txt

apport information

Revision history for this message
perpetualrabbit (perpetualrabbit) wrote : UdevLog.txt

apport information

Revision history for this message
perpetualrabbit (perpetualrabbit) wrote : WifiSyslog.gz

apport information

Revision history for this message
Joseph Salisbury (jsalisbury) wrote :

Would it be possible for you to test the latest upstream kernel? Refer to https://wiki.ubuntu.com/KernelMainlineBuilds . Please test the latest v3.4 kernel [1] (not a kernel in the daily directory). Once you've tested the upstream kernel, please remove the 'needs-upstream-testing' tag (only that one tag, please leave the other tags). This can be done by clicking on the yellow pencil icon next to the tag located at the bottom of the bug description and deleting the 'needs-upstream-testing' text.

If this bug is fixed in the mainline kernel, please add the following tag 'kernel-fixed-upstream'.

If the mainline kernel does not fix this bug, please add the tag: 'kernel-bug-exists-upstream'.

If you are unable to test the mainline kernel, for example it will not boot, please add the tag: 'kernel-unable-to-test-upstream'.
Once testing of the upstream kernel is complete, please mark this bug as "Confirmed".

Thanks in advance.

http://kernel.ubuntu.com/~kernel-ppa/mainline/v3.4-quantal/

Changed in linux (Ubuntu):
importance: Undecided → High
status: Confirmed → Incomplete
tags: added: kernel-da-key
Revision history for this message
perpetualrabbit (perpetualrabbit) wrote :

You realize this server has been decommissioned, right? I can no longer do real-life tests on it.
What I can do is give it a disk (well, a virtual disk, it's iscsi) and export that to all workstations and test with that.
That is just what I did in pre-production testing. I now know I have to watch out for high load, and can tell you
the result.

Ok, I hope I'm doing this right. I downloaded these:

linux-headers-3.4.0-030400-generic_3.4.0-030400.201205210521_amd64.deb
linux-image-3.4.0-030400-generic_3.4.0-030400.201205210521_amd64.deb

I installed them with dpkg, and booted. Here is a uname -a:

Linux gaia2 3.4.0-030400-generic #201205210521 SMP Mon May 21 09:22:02 UTC 2012 x86_64 x86_64 x86_64 GNU/Linux

Now I have to make mounts and do testing, which will take a while.

Revision history for this message
perpetualrabbit (perpetualrabbit) wrote :

All right,

I am doing tests right now, but as far as I can see the 3.4.0-030400-generic kernel has the same behaviour as stock ubuntu 12.04.

With ubuntu server and this 3.4 kernel:
Test:
- all workstations write a separate 4GB file with `dd´ to /testhome, which is nfs4 mounted from an ubuntu server 12.04.
Result:
- all other processes on the same machine that want to read or write to the same mountpoint, and that were started after this dd began, just block.
   for instance:
      ls just waits until dd is finished.
      commandline completion never finishes and the shell just blocks.
- load on the server goes beyond 90, and nfs kernel threads consume lots of cpu. There is also a lot of io waiting,
  which is to be expected.
- Also, the nfs kernel threads seem to be unevenly loaded. There is a bunch (10-20) which use
about 20-30%, and it tapers off to 3-0.5%.

With RHEL6 server:
Test:
- same test, but to /data/misc, mounted nfs4 from redhat 6.3 server.
Result:
- processes started after dd began, wanting to read or write to the same mountpoint (/data/misc), block for a few seconds but then start.
   for instance:
      ls waits for 5-6 seconds and then just answers.
      commandline completion waits about the same time and then completes the command or whatever it completes.
- load on the server is also over 90, but the nfs kernel threads do not consume as much cpu. Lots of io waiting.
- nfs kernel threads are very evenly loaded. I see a long list of them using about 3% each. It goes up to maybe 5% and down
to about 2%, and some of the 128 available threads are unused. I think more than half of them are active during the test.

Keep in mind that this is a pretty heavy test (well, I think it is anyway). But the ubuntu server already had big problems when nfs usage was normal. Normal in our case means:
- some people editing a few text files
- a lot of people using web or mail, and therefore their firefox cache is on nfs and the mail server accesses the nfs-mounted /home/$USER/Maildir
- some people doing heavy calculations and writing either a lot of small files or a few big ones. Small means about 1MB, big means several GB.

Changed in linux (Ubuntu):
status: Incomplete → Confirmed
Revision history for this message
Christoph Bartoschek wrote :

We see a similar problem. We have a RAID < DRBD < LVM < EXT4 < NFS configuration that worked fine with Ubuntu 10.04. After upgrading to 12.04 LTS the performance dropped significantly. The clients block often, editing files is a pain and so on. We consider going back to Ubuntu 10.04.

Revision history for this message
Christoph Bartoschek wrote :

I forgot to mention that we see the problems with both NFS and NFS4. It does not matter which version we use.

Revision history for this message
perpetualrabbit (perpetualrabbit) wrote :

@Christoph:
Can you tell me more about your situation:
- How many clients?
- What is mounted and how?
- What kind of authentication (ldap, kerberos, AD, passwd/shadow,...)?
- Real nfs server or virtual?
- What resources (memory, # cpu's, network bandwidth, etc)?

What did you try to diagnose and solve this?

Revision history for this message
Christoph Bartoschek wrote :

- How many clients?

About 20

- What is mounted and how?

The home directories of the users via automount. We use default mount options but nail it to nfs 3.

- What kind of authentication (ldap, kerberos, AD, passwd/shadow,...)?

NIS

- Real nfs server or virtual?

Real

- What resources (memory, # cpu's, network bandwidth, etc)?

What do you mean by this?

- What did you try to diagnose and solve this?

So far we have measured the bandwidth at different levels of the storage hierarchy. Local access to the filesystem is fast enough: more than the 100 MBytes/s that saturate the 1 Gbit Ethernet.
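(For example, roughly how one might compare local disk throughput with raw network throughput; the path and host name are placeholders:

  # local write throughput on the server, bypassing NFS
  dd if=/dev/zero of=/path/to/exported/fs/testfile bs=1M count=2048 conv=fdatasync

  # raw network throughput between client and server
  iperf -s                          # on the server
  iperf -c nfsserver.example.com    # on the client
)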

What did you try to diagnose and solve this?

Revision history for this message
Christoph Bartoschek wrote :

I also see that the load goes up when it becomes really slow. I have 128 nfsd threads and 8 cores. The load is currently at 100.

Revision history for this message
perpetualrabbit (perpetualrabbit) wrote :

@christoph
With resources I mean: our virtual nfs server was assigned 4 virtual CPUs, 8GB ram and about 8Gbit/s of network bandwidth, depending on how much the other virtual guests are using the network.
And yours?

Revision history for this message
Christoph Bartoschek wrote :

Our servers have 2x4 cores, 48 GiB RAM and 1 Gbit Ethernet each. They are directly connected via 10 Gbit Ethernet.

Revision history for this message
Joseph Salisbury (jsalisbury) wrote :

This issue also appears to be an upstream bug, since you tested the latest upstream kernel. Would it be possible for you to open an upstream bug report at bugzilla.kernel.org [1]? That will allow the upstream developers to examine the issue, and may provide a quicker resolution to the bug.

If you are comfortable with opening a bug upstream, it would be great if you could report back the upstream bug number in this bug report. That will allow us to link this bug to the upstream report.

[1] https://wiki.ubuntu.com/Bugs/Upstream/kernel

Changed in linux (Ubuntu):
status: Confirmed → Triaged
Changed in nfs-utils (Ubuntu):
importance: Undecided → High
Revision history for this message
Joseph Salisbury (jsalisbury) wrote :

Adding in the nfs-utils package for review by the server team.

Revision history for this message
Christoph Bartoschek wrote :

I see the following suspicious result:

root@d1106i06:~# ps auxwww | grep migr
root 6 0.0 0.0 0 0 ? S Jun01 0:00 [migration/0]
root 8 0.0 0.0 0 0 ? S Jun01 0:00 [migration/1]
root 13 89.8 0.0 0 0 ? S Jun01 7000:17 [migration/2]
root 17 0.0 0.0 0 0 ? S Jun01 0:00 [migration/3]
root 21 0.0 0.0 0 0 ? S Jun01 0:00 [migration/4]
root 25 0.0 0.0 0 0 ? S Jun01 0:00 [migration/5]
root 29 0.0 0.0 0 0 ? S Jun01 0:00 [migration/6]
root 33 0.0 0.0 0 0 ? S Jun01 0:00 [migration/7]

Revision history for this message
Christoph Bartoschek wrote :

I also see that 12.04 uses cfq as its I/O scheduler; 10.04 used deadline. After switching the I/O scheduler to deadline, the load quickly went down from about 100 to 10. I have to measure whether the performance is acceptable now.
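(For anyone who wants to try the same switch: the scheduler can be changed per block device at runtime and made persistent via the kernel command line. A sketch, assuming the exported filesystem sits on /dev/sda; adjust the device name:

  cat /sys/block/sda/queue/scheduler                        # prints e.g. "noop deadline [cfq]"
  echo deadline | sudo tee /sys/block/sda/queue/scheduler   # switch the running scheduler

  # to make it persistent, add elevator=deadline to GRUB_CMDLINE_LINUX_DEFAULT
  # in /etc/default/grub, then:
  sudo update-grub
)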

Revision history for this message
perpetualrabbit (perpetualrabbit) wrote :

@Christoph Bartoschek
Any load over the number of cpu cores (X) is not acceptable.
You have X camels. Each camel can carry one sack of flour. If you want to move more than X sacks of flour, some camels will have to make second trips.

That is how `load´ works, right? Any load over X will make the performance collapse in some way. Throughput will level off or even decrease. You get time-outs or at least increased latency.

Since you have 8 cores, even a load of 10 is simply way too high. 2x4 cores, 48 GB and gigabit? If these are only nfs servers, and for only 20 clients, they are way too powerful. You could serve 500 to 1000 clients with that. You probably run other services too? Web, mail etc?

Of course there is no reason that the nfs daemons should cause so much cpu usage. 10.04 didn't, so why should that suddenly happen now?
Your load should normally be far below one, for an nfs server.

As far as I know, there haven't been major changes in Linux's NFS stack in the last 3-4 years, but there have been in the storage layer, such as IO scheduling. The fact that changing the scheduler makes the load go from 100 to 10 is telling.

Revision history for this message
Christoph Bartoschek wrote :

@perpetualrabbit On an IO-bound system, a high load is normally not a problem. The processes are waiting for IO but still count towards the load. Thus, if the hardware is capable of performing 100 IO operations in parallel, a load of 100 is not bad even if there is only one cpu.

These are only NFS servers. Our clients run EDA applications on large VLSI designs, so 20 clients easily saturate 1 Gbit Ethernet. However, at the moment we do not even saturate the network.

I ran some performance tests after switching to deadline. The experience is a little bit smoother now but the throughput is still bad.

Revision history for this message
Christoph Bartoschek wrote :

After running several days now with the deadline scheduler I can say that the responsiveness of the system is ok now. The throughput still needs improvement.

Revision history for this message
Launchpad Janitor (janitor) wrote :

Status changed to 'Confirmed' because the bug affects multiple users.

Changed in nfs-utils (Ubuntu):
status: New → Confirmed
Revision history for this message
Jeff Ebert (jeffrey-ebertland) wrote :

I am experiencing the same behavior in a small network with a 1 Gb Ethernet connection between client and server. Large files can be moved over Samba quickly. The same files over NFSv4 move very slowly and pause for long periods of time. I also see the high load due to nfsd threads.

This problem started with an upgrade to precise on both client and server. Before that, I had oneiric on the client and maverick on the server (and various other versions before that also worked well).

Revision history for this message
Jeff Ebert (jeffrey-ebertland) wrote :

'top' listing on server with single NFS client trying to copy a single file.

top - 00:01:50 up 22 min, 1 user, load average: 0.73, 0.50, 0.51
Tasks: 151 total, 4 running, 147 sleeping, 0 stopped, 0 zombie
Cpu(s): 0.9%us, 39.8%sy, 0.0%ni, 58.2%id, 0.0%wa, 1.1%hi, 0.0%si, 0.0%st
Mem: 3604260k total, 1346468k used, 2257792k free, 95932k buffers
Swap: 10747388k total, 0k used, 10747388k free, 658672k cached

  PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
 3213 root 20 0 0 0 0 R 26 0.0 0:13.84 nfsd
 3218 root 20 0 0 0 0 R 23 0.0 0:12.53 nfsd
 3215 root 20 0 0 0 0 S 23 0.0 0:06.97 nfsd
 3214 root 20 0 0 0 0 S 16 0.0 0:13.97 nfsd
 3216 root 20 0 0 0 0 R 14 0.0 0:07.36 nfsd

The server is a dual-core machine (AMD Phenom II X2) with 4 GiB of RAM.

Revision history for this message
Jeff Taylor (shdwdrgn) wrote :

I have been having a similar problem since upgrading my NFS server from natty to oneiric two weeks ago. When copying files over an nfs share to another oneiric machine, my transfers run smooth and quick. However, when transferring files to any machine with an older distribution (I have one on natty and another on lucid), transfers halt completely for about 30 seconds, then continue for a minute or two before halting again. I have tested with both nfs3 and nfs4 connections, and there is no difference in the transfers.

Note that when transferring a single file from oneiric->oneiric there are always a greater number of threads, but significantly lower cpu usage (compared to transferring to natty), and never any freezes during the transfer process.

NFS server info:
Oneiric 3.0.0-20-generic-pae
Dual-core Intel P4 3Ghz with 3GB ram

When transferring files to Natty client:
top - 11:04:39 up 9 days, 10:58, 2 users, load average: 0.49, 0.94, 1.25
Tasks: 199 total, 5 running, 152 sleeping, 0 stopped, 42 zombie
Cpu0 : 3.1%us, 42.3%sy, 0.0%ni, 54.5%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu1 : 13.2%us, 37.3%sy, 0.0%ni, 49.5%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Mem: 2957484k total, 2771924k used, 185560k free, 26556k buffers
Swap: 1293196k total, 78112k used, 1215084k free, 2205876k cached

  PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
 2761 root 20 0 0 0 0 R 28 0.0 0:11.34 nfsd
 2767 root 20 0 0 0 0 S 26 0.0 0:15.31 nfsd
 2765 root 20 0 0 0 0 R 21 0.0 0:16.76 nfsd
 2766 root 20 0 0 0 0 R 20 0.0 0:15.88 nfsd

When transferring files to Oneiric client:
top - 11:14:20 up 9 days, 11:07, 2 users, load average: 0.14, 0.30, 0.77
Tasks: 200 total, 1 running, 156 sleeping, 0 stopped, 43 zombie
Cpu0 : 0.7%us, 6.0%sy, 0.0%ni, 84.3%id, 4.7%wa, 0.0%hi, 4.3%si, 0.0%st
Cpu1 : 0.3%us, 4.0%sy, 0.0%ni, 94.7%id, 1.0%wa, 0.0%hi, 0.0%si, 0.0%st
Mem: 2957484k total, 2767220k used, 190264k free, 28244k buffers
Swap: 1293196k total, 78088k used, 1215108k free, 2200132k cached

  PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
 2762 root 20 0 0 0 0 S 3 0.0 0:19.86 nfsd
 2765 root 20 0 0 0 0 S 3 0.0 0:25.06 nfsd
 2760 root 20 0 0 0 0 S 1 0.0 0:10.95 nfsd
 2761 root 20 0 0 0 0 S 1 0.0 0:20.20 nfsd
 2764 root 20 0 0 0 0 S 1 0.0 0:16.99 nfsd
 2766 root 20 0 0 0 0 S 1 0.0 0:25.58 nfsd
 2763 root 20 0 0 0 0 S 0 0.0 0:04.28 nfsd
 2767 root 20 0 0 0 0 S 0 0.0 0:26.92 nfsd

Revision history for this message
Jeff Taylor (shdwdrgn) wrote :

I attempted to change the I/O scheduler on the server to deadline as in post #29 above. I even added "elevator=deadline" to grub and rebooted, then confirmed the shared drives did indeed default to deadline; however, there has been no noticeable change in the responsiveness.

Revision history for this message
juuso puuso (heemeli) wrote :

I see the exact same thing as perpetualrabbit: the client(s) experience nfs lockups for several tens of seconds, and on the server side ~5 nfs threads are killing the cpu.

This began on upgrading a client from 10.04 LTS to 12.04 LTS. The server is Debian Squeeze with a 3.2.0 backport kernel. The server has no issues with Fedora 17 clients or previous versions of Ubuntu, each having their /home dir on the server over NFS4.

It seems the problem arises with files of largish size. I first noticed this watching a 10-minute YouTube video, which I assume creates a temporary file somewhere under /home/$ME - the hang does not happen in "private browsing".

Note that I think this problem is specifically related to Ubuntu; I had no such problems when Fedora was at kernel 3.2.

Revision history for this message
ECOM Development (7a532kg9lsp0v-support) wrote :

We also see this problem in our network.

Our setup with NO problems:
SERVER1 NFS server running Ubuntu 11.10 (Kernel 3.0.0-12-server) (2 x Quadcore Xeon 2.4GHz 16GB Ram)
SERVER2 Webserver running Ubuntu 8.04 (Kernel 2.6.27.7) (Dualcore Xeon 3.00GHz 2GB Ram)
SERVER3 Webserver running Ubuntu 12.04 (Kernel 3.2.0-29-generic) (2 x Quadcode Xeon 2.5GHz 8GB Ram)
SERVER4 Webserver running Ubuntu 11.10 (Kernel 3.0.0-12-server) (2 x Quadcore Xeon 2.4GHz 16GB Ram)

All webservers get their content from the NFS server.

We updated the 8.04 server to 12.04 (Kernel 3.2.0-23-generic-pae) and the problems started.
We ran apache benchmark tests: when using SERVER4, no problems. Running the same test on SERVER2 or SERVER3, the load of both SERVER2 and SERVER3 goes sky high (loads of 200+ seen).

I watched nfsiostat on SERVER2 while running the apache benchmark on SERVER3, and the read/write figures were 0.00 but the op/s were somewhere around 1200.00.

Revision history for this message
Karsten Suehring (suehring) wrote :

+1

Updated from 10.10 to 12.04 with more than 70 clients. nfsd starts to consume loads of CPU when clients start to access files.

Revision history for this message
Karsten Suehring (suehring) wrote :

It seems that the issue is easy to reproduce: I just did it using two VirtualBox VMs with the following steps:

1. Install Ubuntu 12.04.1 Server x64 with default settings (also default settings of Virtual box)
2. Update all packages (apt-get update, apt-get dist-upgrade)
3. Clone VM - rename one to server, the other to client and set bridged network interfaces, add DHCP assigned ip addresses to /etc/hosts
4. install nfs-kernel-server package on server
5. create directory /export and add following line to /etc/exports:
/export client(rw,no_root_squash,async,subtree_check)
6. export with "exportfs -a"
7. install nfs-common package on client
8. mount nfs share: sudo mount -t nfs server:/export /mnt
9. write a file: sudo dd if=/dev/zero of=/mnt/t1 bs=1M count=10000
10. check load on server with 'top': 40% distributed over 4 threads

I hope this helps. I can make the VMs available if somebody is interested.

Unless this is fixed or a workaround is found, I will have to downgrade my servers again :-(

Revision history for this message
Karsten Suehring (suehring) wrote :

Another note: I also upgraded one of my 12.04.1 servers to 12.10 (do-release-upgrade). The issue was not resolved by the new release.

Revision history for this message
Marcelo Salhab Brogliato (msbrogli) wrote :

I'm having the same issues: high CPU and slowness. Going back to nfs3 solved it for me while I'm waiting for a fix.

Revision history for this message
David Abdurachmanov (david-abd) wrote :

I have noticed the same issue running Ubuntu 12.10 as an NFSv4 server. The load average on a two-core machine is 2.0, while CPU utilization is almost at 0%. It is extremely slow compared to a Fedora-based NFS server.

Revision history for this message
Benjamin Redling (benjamin-rampe-deactivatedaccount) wrote :

+1

* clean install of 12.04 server.
* LDAP AUTH_SYS
* a single nfs client has no noticeable problems; as soon as a second client starts, both clients generate high CPU load on their side (> 90%, Core 2 Duo) and on the server side (both nfsd at 14%, io-wait at most 0.5%, Xeon X5649)

Changing to the deadline scheduler helped after a delay of a few minutes (client-side CPU load due to nfs: <1%, server-side nfsd processes rarely peak above 6%). I can confirm comment #29.

Revision history for this message
Kari Haapanen (karsta62) wrote :

I have a 32-bit Ubuntu 12.04.2 LTS machine as an nfs server, which has been working fine all along with a Mint 13 64-bit client among many others.
I haven't changed anything on it but normal updates.

Now I installed a fresh 64-bit Ubuntu 12.10 (3.5.0-17 kernel) as a client, and I have this issue when using this client.
Load does not climb up on the server, but CPU usage does.

Is this really a server-only issue, since all I changed was my client?

Revision history for this message
Kari Haapanen (karsta62) wrote :

Ok, this is my workaround.
I got it working by mounting from the client's fstab with these options:

rw,rsize=32768,wsize=32768

Whether the sizes are optimal for performance I have not tested, but it seems pretty OK.

Revision history for this message
Kari Haapanen (karsta62) wrote :

Without changing anything anywhere, the nightmare began again.
It locks up for as long as ten minutes. Sometimes, even when Nautilus claims all bytes were transferred, it locks up anyway.

Revision history for this message
Thomas Anantharaman (thomas-anantharaman) wrote :

I have been struggling with the same problem for the last few months, and accidentally stumbled on a fix that seems to work :

1. On clients set rsize=8192,wsize=8192 in /etc/fstab

I suspect that when multiple clients are writing to the NFS4 server simultaneously and overloading the network, the TCP network congestion control only works well for rsize,wsize <= 8192 (the maximum value under Linux when the code was originally written). For single clients larger values DO provide a benefit in higher throughput, but evidently there is some kernel bug triggered when using larger rsize and wsize values under conditions of network overload.
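(For illustration, roughly what that change looks like in a client fstab line; the server name and export path are placeholders:

  # /etc/fstab on the client
  nfsserver.example.com:/home  /home  nfs4  hard,intr,rw,tcp,rsize=8192,wsize=8192,bg  0 0
  # unmount and mount again (or reboot the client) for the new sizes to take effect
)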

Revision history for this message
masakre (informatikoa1) wrote :

Is this bug solved? I am having similar issues with my Ubuntu Server 12.04.4 64-bit. I am using LDAP authentication with NFS to mount the users' home directories. The server uses 2 gigabit Ethernet interfaces with bonding, there are 20 users, and it has 2x4 processors with 12 GB of RAM (it is an Acer Gateway GR320 F1 server). I changed my server to 12.04 because we were using an old version (7.10) and was having some problems, but since the change the clients are too slow. I notice that the server's processor load is high and I think this bug could be affecting me.

Sorry about my english and thank you in advance :)
