Kernel 3.2.0-33 Introduces Ridiculous / Impossible Load Averages

Bug #1084264 reported by Shaun Thomas
This bug affects 2 people
Affects: linux (Ubuntu)
Status: Fix Released
Importance: Medium
Assigned to: Unassigned

Bug Description

After upgrading to the 3.2.0-33 kernel, our load averages have been very, very odd. Here is a sample sar -q line:

15:05:02  runq-sz  plist-sz  ldavg-1  ldavg-5  ldavg-15  blocked
15:05:21        2       322    44.02    12.51     14.08        1

A run queue size of 2, with a load average of 44? So I wrote a script to track run queue and uninterruptible sleep, since both contribute to load average. Script is attached. Sample output during a run:

Stat Time Run+Sleep Load Avg
2012-11-28 15:47:53 2 71.54
2012-11-28 15:47:54 1 71.54
2012-11-28 15:47:55 1 71.54
2012-11-28 15:47:56 1 71.54
2012-11-28 15:47:57 1 71.54
2012-11-28 15:47:58 1 65.81
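
The attached script is not reproduced here. A minimal sketch of the same idea, assuming a procps-style `ps` and a Linux `/proc/loadavg`, might look like the following. The function names (`count_active`, `load_1min`, `watch_load`) are hypothetical and are not taken from the original attachment.

```shell
#!/bin/sh
# Hypothetical stand-in for the attached script (not the original).

# Count tasks in state R (runnable) or D (uninterruptible sleep).
# Reads one state string per line on stdin, as printed by `ps -eo state=`.
count_active() {
    awk '$1 ~ /^[RD]/ { n++ } END { print n + 0 }'
}

# Print the 1-minute load average: the first field of /proc/loadavg.
# The path is a parameter only so the function can be tried on a sample file.
load_1min() {
    awk '{ print $1; exit }' "${1:-/proc/loadavg}"
}

# Print one "date time run+sleep loadavg" line per second.
# The ps process itself is runnable while sampling, so the count is at least 1.
watch_load() {
    samples=${1:-60}
    echo "Stat Time Run+Sleep Load Avg"
    while [ "$samples" -gt 0 ]; do
        echo "$(date '+%Y-%m-%d %H:%M:%S') $(ps -eo state= | count_active) $(load_1min)"
        samples=$((samples - 1))
        sleep 1
    done
}
```

Calling `watch_load 300` would print five minutes of samples in the same layout as the output above.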

This behavior is not observed with kernel 3.2.0-31.

The problem can be reproduced with a basic pgbench test that simulates a busy system, but it only seems to occur during rapid process cycling. An easy simulation:

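# Cycle short pgbench runs (bash) to create rapid process turnover.
for x in {1..100}; do
  # 5-second run, 2 worker threads, 2 clients, against the "pgbench" database
  pgbench -T 5 -j 2 -c 2 pgbench
  sleep 1
done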

The test may need to run for several minutes to trigger a load spike. The output will look similar to this:

Stat Time Run+Sleep Load Avg
2012-11-28 16:01:17 3 10.37
2012-11-28 16:01:18 3 9.54
2012-11-28 16:01:19 3 9.54
2012-11-28 16:01:20 2 9.54
2012-11-28 16:01:21 5 9.54
2012-11-28 16:01:22 3 9.54
2012-11-28 16:01:23 1 47.94
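
To put a number on how implausible that last jump is, here is a rough sanity check. It assumes the textbook model of the Linux 1-minute load average: an exponentially damped average updated about every 5 seconds, so that next = prev * e + active * (1 - e) with e = exp(-5/60). The function name `required_active` is hypothetical, and the sketch ignores the kernel's fixed-point arithmetic and per-CPU accounting.

```shell
# required_active PREV NEXT: number of active (runnable + uninterruptible)
# tasks that a single update would need in order to move the 1-minute load
# average from PREV to NEXT, under the simplified model
#   next = prev * e + active * (1 - e),   e = exp(-5/60)
# Assumes a 5-second update interval and floating-point math.
required_active() {
    awk -v prev="$1" -v next_load="$2" 'BEGIN {
        e = exp(-5 / 60)
        printf "%.0f\n", (next_load - prev * e) / (1 - e)
    }'
}

# The jump from 9.54 to 47.94 seen in the sample output.
required_active 9.54 47.94
```

Under these assumptions, the update from 9.54 to 47.94 would need several hundred active tasks at the sampling instant, while the script never observed more than 5.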

System info:

Description: Ubuntu 12.04.1 LTS
Release: 12.04
Linux 3.2.0-33-generic #52-Ubuntu SMP Thu Oct 18 16:29:15 UTC 2012 x86_64
linux-image-3.2.0-33-generic:
  Installed: 3.2.0-33.52

Brad Figg (brad-figg) wrote : Missing required logs.

This bug is missing log files that will aid in diagnosing the problem. From a terminal window please run:

apport-collect 1084264

and then change the status of the bug to 'Confirmed'.

If, due to the nature of the issue you have encountered, you are unable to run this command, please add a comment stating that fact and change the bug status to 'Confirmed'.

This change has been made by an automated script, maintained by the Ubuntu Kernel Team.

Changed in linux (Ubuntu):
status: New → Incomplete
tags: added: precise
Shaun Thomas (0-sthomas) wrote :

The system is behind a firewall, cannot contact Ubuntu servers, and is operating from a repository clone. apport-cli seems to think the linux-image-3.2.0-33-generic package is not from Ubuntu. I'll collect any necessary information manually if requested.

Changed in linux (Ubuntu):
status: Incomplete → Confirmed
Joseph Salisbury (jsalisbury) wrote :

Would it be possible for you to test the latest upstream kernel? Refer to https://wiki.ubuntu.com/KernelMainlineBuilds . Please test the latest v3.7 kernel[0] (Not a kernel in the daily directory) and install both the linux-image and linux-image-extra .deb packages.

If this bug is fixed in the mainline kernel, please add the following tag 'kernel-fixed-upstream'.

If the mainline kernel does not fix this bug, please add the tag: 'kernel-bug-exists-upstream'.

If you are unable to test the mainline kernel, for example it will not boot, please add the tag: 'kernel-unable-to-test-upstream'.
Once testing of the upstream kernel is complete, please mark this bug as "Confirmed".

Thanks in advance.

[0] http://kernel.ubuntu.com/~kernel-ppa/mainline/v3.7-rc7-raring/

Changed in linux (Ubuntu):
importance: Undecided → Medium
status: Confirmed → Incomplete
Shaun Thomas (0-sthomas) wrote :

Unfortunately, neither DRBD 8.4.2 nor iomemory-vsl will compile against the 3.7-rc7 kernel. Both are vendor source-based modules that seem to rely on deprecated kernel API calls.

However, 3.4.20 from Quantal does not exhibit this behavior. In other words, installing 3.4.20 on 12.04 LTS fixes the problem.

Changed in linux (Ubuntu):
status: Incomplete → Confirmed
tags: added: kernel-fixed-upstream
Shaun Thomas (0-sthomas)
tags: removed: kernel-fixed-upstream
tags: added: kernel-fixed-upstream
Shaun Thomas (0-sthomas) wrote :

Upon further investigation, this seems to affect every Ubuntu 3.2 kernel. We've tested -24, -31, -33, and the upcoming -34. All exhibit impossible load swings. Only using 3.4 fixes this.

Shaun Thomas (0-sthomas) wrote :

Just tested 3.2.0-35. Wild load spikes still exist.

Stat Time Sleep Run Load Avg
2012-12-18 13:41:36 0 2 3.76
2012-12-18 13:41:37 1 4 3.76
2012-12-18 13:41:38 0 3 3.76
2012-12-18 13:41:39 0 1 49.58
2012-12-18 13:41:40 0 1 49.58

penalvch (penalvch) wrote :

Shaun Thomas, could you please gather the apport-collect data by following https://help.ubuntu.com/community/ReportingBugs#Filing_bugs_when_off-line ?

tags: added: needs-kernel-logs regression-updated
tags: added: regression-update
removed: regression-updated
tags: added: needs-bisect
tags: added: needs-reverse-bisect
removed: needs-bisect
Changed in linux (Ubuntu):
status: Confirmed → Incomplete
Shaun Thomas (0-sthomas) wrote :

I can no longer replicate this using the 3.2.0-54 kernel. I'm going to assume one of the intermediate versions fixed the issue.

Changed in linux (Ubuntu):
status: Incomplete → Fix Released