Ubuntu
linux package

Logs flooded with "nfs4_schedule_state_manager: kthread_run: -12"

Bug #1423472 reported by Sergio Gelato on 2015-02-19

This bug affects 1 person

Affects           Status      Importance   Assigned to   Milestone
linux (Ubuntu)    Confirmed   Medium       Unassigned

Bug Description

An NFSv4 client running kernel 3.13.0-44-generic #73-Ubuntu (amd64) suddenly started spewing
nfs4_schedule_state_manager: kthread_run: -12
log messages at an average rate of 2.65 kHz. It did not stop until I rebooted it.

At the very least that message needs to be rate-limited. (Doesn't seem to be fixed upstream yet.)

As for the underlying problem, -12 is -ENOMEM. I'm afraid I have no idea why the kernel ran out of memory at that point. Will follow up if the problem ever recurs. This bug report is mainly about the lack of rate limiting.
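
For readers who want to check the mapping from -12 to ENOMEM themselves, here is a minimal userspace sketch. The helper name describe_kernel_error is hypothetical and not part of any kernel or libc API; it only relies on the standard strerror() call and the Linux errno numbering.

```c
#include <assert.h>
#include <errno.h>
#include <string.h>

/*
 * Hypothetical helper: kernel functions commonly report failure as a
 * negated errno value (here -12), so negate it again before looking
 * up the description.
 */
const char *describe_kernel_error(int ret)
{
    assert(ret <= 0);      /* only zero or negative codes are expected */
    return strerror(-ret); /* on Linux, errno 12 is ENOMEM */
}
```

Calling describe_kernel_error(-12) on a Linux system should return the libc's out-of-memory description; the exact wording depends on the C library.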

Brad Figg (brad-figg) wrote on 2015-02-19: Missing required logs.

This bug is missing log files that will aid in diagnosing the problem. From a terminal window please run:

apport-collect 1423472

and then change the status of the bug to 'Confirmed'.

If, due to the nature of the issue you have encountered, you are unable to run this command, please add a comment stating that fact and change the bug status to 'Confirmed'.

This change has been made by an automated script, maintained by the Ubuntu Kernel Team.

Changed in linux (Ubuntu):
status:	New → Incomplete
tags:	added: trusty

Sergio Gelato (sergio-gelato) wrote on 2015-02-19:

Logs too big for inclusion (the problem was log flooding). Also, they would be missed by apport-collect because /var had been filled by an earlier, not necessarily related, problem; the only full copy of the logs is on a remote syslog server which does not run Ubuntu.

Changed in linux (Ubuntu):
status:	Incomplete → Confirmed

Joseph Salisbury (jsalisbury) wrote on 2015-02-19:

Did this issue start happening after an update/upgrade? Was there a prior kernel version where you were not having this particular problem?

Would it be possible for you to test the latest upstream kernel? Refer to https://wiki.ubuntu.com/KernelMainlineBuilds . Please test the latest v3.19 kernel[0].

If this bug is fixed in the mainline kernel, please add the following tag 'kernel-fixed-upstream'.

If the mainline kernel does not fix this bug, please add the tag: 'kernel-bug-exists-upstream'.

If you are unable to test the mainline kernel, for example it will not boot, please add the tag: 'kernel-unable-to-test-upstream'.
Once testing of the upstream kernel is complete, please mark this bug as "Confirmed".

Thanks in advance.

[0] http://kernel.ubuntu.com/~kernel-ppa/mainline/v3.19-vivid/

Changed in linux (Ubuntu):
importance:	Undecided → Medium
status:	Confirmed → Incomplete

Sergio Gelato (sergio-gelato) wrote on 2015-02-19:

So far, this particular symptom has been seen exactly once. The host it was observed on was reinstalled from scratch with trusty a few weeks ago following a hard disk failure, so no dist-upgrade involved. I did some NFS-client-related tuning on this and many other machines this week so it's conceivable that this has caused new code paths to be exercised (although the changes were rather benign: a longer credential timeout in rpc.gssd, an explicit port number for nfs.nfs_callback_tcpport, a smaller value for auth_rpcgss.key_expire_timeo, and only the rpc.gssd change had actually taken effect on that host at the time of the incident).

I've looked at the source code for "kthread_run" (or rather the function behind that macro). The error is the result of a memory allocation failure. What caused the kernel to run out of memory (this machine has 32GB of RAM, by the way) last night is probably unknowable at this point, and need not have had anything to do with NFS. *This* bug report is only about the fact that the issuing of that particular error message (from fs/nfs/nfs4state.c:nfs4_schedule_state_manager()) is not rate-limited (neither in 3.13 nor in the linux-stable tree at git.kernel.org), which put an undesirable load on my syslog infrastructure. That should be easy to fix: it's what pr_warn_ratelimited() is for.
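
To illustrate what rate limiting would buy here, the following is a rough userspace sketch of interval/burst limiting, the general idea behind the kernel's ratelimited printk helpers. It is not the kernel implementation: struct ratelimit, ratelimit_allow() and simulate_flood() are hypothetical names, time is passed in as plain seconds, and the 5-second interval with a burst of 10 used in the note below is assumed to match the kernel defaults.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical userspace analogue of an interval/burst rate limiter. */
struct ratelimit {
    long interval; /* window length, in seconds */
    int  burst;    /* messages allowed per window */
    long begin;    /* start time of the current window */
    int  printed;  /* messages let through in the current window */
    int  missed;   /* messages suppressed in the current window */
    bool started;  /* false until the first message arrives */
};

/* Return true if a message arriving at time `now` may be logged. */
bool ratelimit_allow(struct ratelimit *rl, long now)
{
    assert(rl != NULL && rl->interval > 0 && rl->burst > 0);

    /* Open a new window on first use or once the interval has elapsed. */
    if (!rl->started || now - rl->begin >= rl->interval) {
        if (rl->missed > 0)
            fprintf(stderr, "%d messages suppressed\n", rl->missed);
        rl->begin = now;
        rl->printed = 0;
        rl->missed = 0;
        rl->started = true;
    }

    if (rl->printed < rl->burst) {
        rl->printed++;
        return true;
    }
    rl->missed++;
    return false;
}

/*
 * Simulate `per_second` messages arriving every second for `seconds`
 * seconds and return how many would actually reach the log.
 */
int simulate_flood(struct ratelimit *rl, long seconds, int per_second)
{
    int emitted = 0;

    for (long t = 0; t < seconds; t++)
        for (int i = 0; i < per_second; i++)
            if (ratelimit_allow(rl, t))
                emitted++;
    return emitted;
}
```

With interval = 5 and burst = 10, a 10-second flood at 2650 messages per second (roughly the rate observed here) would let 20 messages through instead of 26500, plus one "messages suppressed" summary line.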

I cannot reproduce the symptom at will, so I won't actually test the kernel from vivid: any negative result would be inconclusive. Since I know from reading the source code that the message is still not rate-limited upstream, I assume that kernel-bug-exists-upstream is the right choice.

tags:	added: kernel-bug-exists-upstream
Changed in linux (Ubuntu):
status:	Incomplete → Confirmed
