CPU time incorrect in 10.04

Bug #889012 reported by kng ops
This bug affects 2 people
Affects              Status     Importance  Assigned to  Milestone
linux-ec2 (Ubuntu)   Confirmed  Undecided   Unassigned

Bug Description

I want to report that the CPU time bug is still present in the following kernel:

2.6.32-317-ec2 on AWS instance m1.large

with CPU

model name : Intel(R) Xeon(R) CPU E5507 @ 2.27GHz

I logged the output of ps every minute and can now show the bug happening. As you can see, the CPU time advances normally (35790:43, then 35791:17) and then gets corrupted: the next minute it jumps to 17179869:11, and one minute later it reads 29322904:44.

www-data 8269 31.3 9.2 853516 725496 ? Sl Aug23 35789:36 /usr/local/lib/erlang/erts-5.8.4/bin/beam.smp -P ...
www-data 8269 31.3 9.2 866128 731640 ? Sl Aug23 35790:09 /usr/local/lib/erlang/erts-5.8.4/bin/beam.smp -P ...
www-data 8269 31.3 9.3 868704 738800 ? Sl Aug23 35790:43 /usr/local/lib/erlang/erts-5.8.4/bin/beam.smp -P ...
www-data 8269 31.3 9.4 872964 742660 ? Sl Aug23 35791:17 /usr/local/lib/erlang/erts-5.8.4/bin/beam.smp -P ...
www-data 8269 120093018 9.4 885060 748036 ? Sl Aug23 17179869:11 /usr/local/lib/erlang/erts-5.8.4/bin/beam.smp -P ...
www-data 8269 10652 9.3 868748 739144 ? Sl Aug23 29322904:44 /usr/local/lib/erlang/erts-5.8.4/bin/beam.smp -P ...
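
A jump like the one in the log above can be caught automatically instead of by reading the log. The following is only a minimal sketch, not the script used for this report: the function names, the 60-second sampling interval, the `slack` factor, and the default of 2 cores (what an m1.large is assumed to expose) are all illustrative assumptions. It converts the ps TIME column to seconds and flags any sample whose CPU time moved backwards or grew by more than the machine could have delivered in one interval.

```python
def ps_time_to_seconds(field):
    """Convert a ps TIME field to seconds.

    Accepts the forms that appear in this report:
    'MMM:SS' (ps aux), 'HH:MM:SS' and 'DD-HH:MM:SS' (ps -e).
    """
    days = 0
    if "-" in field:
        day_part, field = field.split("-", 1)
        days = int(day_part)
    parts = [int(p) for p in field.split(":")]
    if len(parts) == 2:
        hours = 0
        minutes, seconds = parts
    elif len(parts) == 3:
        hours, minutes, seconds = parts
    else:
        raise ValueError("unrecognised ps TIME field: %r" % field)
    return ((days * 24 + hours) * 60 + minutes) * 60 + seconds


def find_jumps(time_fields, interval_s=60, ncpus=2, slack=2):
    """Return the indices of samples with an implausible CPU time change.

    A process cannot accumulate more than interval_s * ncpus seconds of
    CPU time between two samples; slack (an assumed tolerance) allows
    for irregular sampling.
    """
    limit = interval_s * ncpus * slack
    seconds = [ps_time_to_seconds(f) for f in time_fields]
    bad = []
    for i in range(1, len(seconds)):
        delta = seconds[i] - seconds[i - 1]
        if delta < 0 or delta > limit:
            bad.append(i)
    return bad


# TIME column of the six samples logged above
samples = ["35789:36", "35790:09", "35790:43", "35791:17",
           "17179869:11", "29322904:44"]
print(find_jumps(samples))  # prints [4, 5]
```

The first four samples grow by 33 to 34 seconds per minute, which is consistent with the roughly 30% load we expect; only the last two are flagged.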

It took us around 3 months of running this process at around 30% load to reach this condition.

Because of this problem the machine currently reports a load of 0, when we know it should report between 20 and 40% depending on the time of day.

Please let me know if I need to open a specific ticket for this issue.

Today we're going to upgrade to 2.6.32-319-ec2

thanks,

Paolo Negri

Tags: aws ec2 kernel
kng ops (paolo-negri) wrote :

We keep being affected by this bug. We are now using 8-core machines with more load, which forces us to reboot servers every 30 days, soon to be every 20 days because of the increased load.

Last observation on kernel 2.6.32-340-ec2

Launchpad Janitor (janitor) wrote :

Status changed to 'Confirmed' because the bug affects multiple users.

Changed in linux-ec2 (Ubuntu):
status: New → Confirmed
The Gavitron (me-gavitron) wrote :

Here's some output from one of our production instances on kernel 2.6.32-312-ec2. This machine was rebooted Tue Aug 6 10:23:13 PDT 2013; it is currently Mon Aug 26 15:05:56 PDT 2013. The instance runs some simple 0mq workers written in PHP.

# uptime
 15:04:30 up 20 days, 4:29, 1 user, load average: 0.02, 0.03, 0.00

# uname -a
Linux job2.hootsuite.com 2.6.32-312-ec2 #24-Ubuntu SMP Fri Jan 7 18:30:50 UTC 2011 x86_64 GNU/Linux

# ps -e |sort -k 3,3|tail -19
 7051 ? 00:05:23 php
  801 ? 00:05:31 hsflowd
  612 ? 01:05:09 rsyslogd
14192 ? 1184011132-12:04:39 php
14276 ? 1184011132-12:04:39 php
 6219 ? 1184011132-12:04:39 daemon
 6242 ? 1184011132-12:04:39 daemon
 6273 ? 1184011132-12:04:39 daemon
14128 ? 1370665095-06:52:34 php
13944 ? 1421410859-03:29:16 php
14261 ? 2042750126-07:22:53 php
  675 ? 23-23:34:52 mongos
  539 ? 24-17:36:51 diamond
14031 ? 3032707861-22:16:02 php
14127 ? 3132481888-10:28:53 php
  676 ? 3133620092-23:21:57 mongos2
14142 ? 3343608664-22:22:30 php
14123 ? 3598631500-18:17:59 php
  PID TTY TIME CMD
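
A listing like the one above can be checked against a simple bound: a process cannot have used more CPU time than uptime multiplied by the number of cores. The sketch below is illustrative only. The function names are hypothetical, and `ncpus` must be supplied by the caller because the core count of this instance is not stated in the report.

```python
import re

# TIME as printed by `ps -e`: [DD-]HH:MM:SS
TIME_RE = re.compile(r"^(?:(\d+)-)?(\d+):(\d\d):(\d\d)$")


def cpu_seconds(field):
    """Convert a `ps -e` TIME field to seconds."""
    m = TIME_RE.match(field)
    if m is None:
        raise ValueError("unrecognised ps TIME field: %r" % field)
    days, hours, minutes, seconds = (int(g or 0) for g in m.groups())
    return ((days * 24 + hours) * 60 + minutes) * 60 + seconds


def impossible_entries(ps_lines, uptime_s, ncpus):
    """Return (pid, cmd) for rows whose CPU time exceeds uptime_s * ncpus."""
    ceiling = uptime_s * ncpus
    flagged = []
    for line in ps_lines:
        cols = line.split(None, 3)
        if len(cols) < 4 or not cols[0].isdigit():
            continue  # skip the header row and anything malformed
        pid, _tty, time_field, cmd = cols
        if cpu_seconds(time_field) > ceiling:
            flagged.append((int(pid), cmd))
    return flagged


# "up 20 days, 4:29" from the uptime output above
uptime_s = 20 * 86400 + 4 * 3600 + 29 * 60
rows = [
    "  612 ? 01:05:09 rsyslogd",
    "14192 ? 1184011132-12:04:39 php",
    "  675 ? 23-23:34:52 mongos",
    "  PID TTY TIME CMD",
]
# ncpus=2 is an assumption made only for this example
print(impossible_entries(rows, uptime_s, ncpus=2))
```

Entries showing more than a billion days of CPU time fail this check for any realistic core count, whereas the 23-day and 24-day entries only pass if the instance has at least two cores.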

Anand (anand-basu) wrote :

Hi guys,

I've changed the status to Fix Released by mistake. Can you please revert it to Confirmed?

Changed in linux-ec2 (Ubuntu):
status: Confirmed → Fix Released
Stefan Bader (smb)
Changed in linux-ec2 (Ubuntu):
status: Fix Released → Confirmed