Enable intel_pstate by default on Trusty EC2 AMIs

Bug #1477171 reported by Craig Watcham
This bug affects 1 person
Affects: linux (Ubuntu)
Status: In Progress
Importance: Medium
Assigned to: Unassigned

Bug Description

Please can the intel_pstate driver be enabled by default on the Ubuntu EC2 AMIs? On instances that support C-states and P-states [1], the acpi-cpufreq driver causes a notable performance impact and does not expose the higher clock frequencies the hardware supports. For example, on a c4.8xlarge the default frequency is limited to 2.9GHz when 3.2GHz is supported by the chip [2]. Some additional analysis is available here [3], and re-enabling it by default has been discussed [4], but this request is specifically for the EC2 AMIs.

[1] http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/processor_state_control.html
[2] http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/c4-instances.html
[3] http://www.deplication.net/2015/07/c-states-and-p-states-with-ubuntu-1404.html
[4] https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1188647

AMI: ami-47a23a30 (eu-west-1)

$ lsb_release -d
Description: Ubuntu 14.04.2 LTS

$ uname -a
Linux ip-172-31-8-218 3.13.0-57-generic #95-Ubuntu SMP Fri Jun 19 09:28:15 UTC 2015 x86_64 x86_64 x86_64 GNU/Linux

Default configuration (c4.8xl):
$ sudo cpupower frequency-info
analyzing CPU 0:
  driver: acpi-cpufreq
  CPUs which run at the same hardware frequency: 0
  CPUs which need to have their frequency coordinated by software: 0
  maximum transition latency: 10.0 us.
  hardware limits: 1.20 GHz - 2.90 GHz
  available frequency steps: 2.90 GHz, 2.90 GHz, 2.80 GHz, 2.70 GHz, 2.50 GHz, 2.40 GHz, 2.30 GHz, 2.20 GHz, 2.00 GHz, 1.90 GHz, 1.80 GHz, 1.70 GHz, 1.60 GHz, 1.40 GHz, 1.30 GHz, 1.20 GHz
  available cpufreq governors: conservative, ondemand, userspace, powersave, performance
  current policy: frequency should be within 1.20 GHz and 2.90 GHz.
                  The governor "ondemand" may decide which speed to use
                  within this range.
  current CPU frequency is 1.20 GHz (asserted by call to hardware).
  cpufreq stats: 2.90 GHz:70.40%, 2.90 GHz:0.00%, 2.80 GHz:0.00%, 2.70 GHz:0.00%, 2.50 GHz:0.00%, 2.40 GHz:0.00%, 2.30 GHz:0.00%, 2.20 GHz:0.00%, 2.00 GHz:0.00%, 1.90 GHz:0.00%, 1.80 GHz:0.00%, 1.70 GHz:0.00%, 1.60 GHz:0.09%, 1.40 GHz:0.00%, 1.30 GHz:0.00%, 1.20 GHz:29.51% (5)
  boost state support:
    Supported: yes
    Active: yes

With intel_pstate enabled (c4.8xl):
$ sudo cpupower frequency-info
analyzing CPU 0:
  driver: intel_pstate
  CPUs which run at the same hardware frequency: 0
  CPUs which need to have their frequency coordinated by software: 0
  maximum transition latency: 0.97 ms.
  hardware limits: 1.20 GHz - 3.50 GHz
  available cpufreq governors: performance, powersave
  current policy: frequency should be within 1.20 GHz and 3.50 GHz.
                  The governor "performance" may decide which speed to use
                  within this range.
  current CPU frequency is 3.20 GHz (asserted by call to hardware).
  boost state support:
    Supported: yes
    Active: yes
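
The two `cpupower frequency-info` listings above differ mainly in the driver and in the reported hardware limits. A minimal Python sketch for pulling those two fields out of captured output is shown here. The function name `summarize_cpupower` and the regular expressions are illustrative assumptions based only on the output format shown in this report; they are not part of cpupower.

```python
import re

def summarize_cpupower(text):
    """Extract the driver name and hardware limits (in GHz) from
    `cpupower frequency-info` output shaped like the listings in this report."""
    driver = re.search(r"^\s*driver:\s*(\S+)", text, re.MULTILINE)
    limits = re.search(
        r"hardware limits:\s*([\d.]+)\s*GHz\s*-\s*([\d.]+)\s*GHz", text
    )
    return {
        "driver": driver.group(1) if driver else None,
        "min_ghz": float(limits.group(1)) if limits else None,
        "max_ghz": float(limits.group(2)) if limits else None,
    }

# Excerpts copied from the two listings in the bug description.
acpi = """analyzing CPU 0:
  driver: acpi-cpufreq
  hardware limits: 1.20 GHz - 2.90 GHz
"""
pstate = """analyzing CPU 0:
  driver: intel_pstate
  hardware limits: 1.20 GHz - 3.50 GHz
"""

for sample in (acpi, pstate):
    print(summarize_cpupower(sample))
```

Output in other units (for example MHz on slower parts) would not match the pattern and would yield `None` for the limits, so this is only a starting point.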
---
AlsaDevices:
 total 0
 crw-rw---- 1 root audio 116, 1 Aug 12 17:19 seq
 crw-rw---- 1 root audio 116, 33 Aug 12 17:19 timer
AplayDevices: Error: [Errno 2] No such file or directory
ApportVersion: 2.14.1-0ubuntu3.11
Architecture: amd64
ArecordDevices: Error: [Errno 2] No such file or directory
AudioDevicesInUse: Error: command ['fuser', '-v', '/dev/snd/seq', '/dev/snd/timer'] failed with exit code 1:
CRDA: Error: [Errno 2] No such file or directory
CurrentDmesg: [39221.196411] init: plymouth-upstart-bridge main process ended, respawning
DistroRelease: Ubuntu 14.04
Ec2AMI: ami-47a23a30
Ec2AMIManifest: (unknown)
Ec2AvailabilityZone: eu-west-1b
Ec2InstanceType: c4.8xlarge
Ec2Kernel: unavailable
Ec2Ramdisk: unavailable
IwConfig: Error: [Errno 2] No such file or directory
Lsusb: Error: command ['lsusb'] failed with exit code 1: unable to initialize libusb: -99
MachineType: Xen HVM domU
Package: linux (not installed)
PciMultimedia:

ProcEnviron:
 TERM=xterm
 PATH=(custom, no user)
 XDG_RUNTIME_DIR=<set>
 LANG=en_US.UTF-8
 SHELL=/bin/bash
ProcFB:

ProcKernelCmdLine: BOOT_IMAGE=/boot/vmlinuz-3.13.0-61-generic root=UUID=877c5daf-9427-49e3-8656-437c3f30fc84 ro console=tty1 console=ttyS0
ProcVersionSignature: Ubuntu 3.13.0-61.100-generic 3.13.11-ckt22
RelatedPackageVersions:
 linux-restricted-modules-3.13.0-61-generic N/A
 linux-backports-modules-3.13.0-61-generic N/A
 linux-firmware N/A
RfKill: Error: [Errno 2] No such file or directory
Tags: trusty ec2-images
Uname: Linux 3.13.0-61-generic x86_64
UpgradeStatus: No upgrade log present (probably fresh install)
UserGroups: adm audio cdrom dialout dip floppy netdev plugdev sudo video
_MarkForUpload: True
dmi.bios.date: 05/06/2015
dmi.bios.vendor: Xen
dmi.bios.version: 4.2.amazon
dmi.chassis.type: 1
dmi.chassis.vendor: Xen
dmi.modalias: dmi:bvnXen:bvr4.2.amazon:bd05/06/2015:svnXen:pnHVMdomU:pvr4.2.amazon:cvnXen:ct1:cvr:
dmi.product.name: HVM domU
dmi.product.version: 4.2.amazon
dmi.sys.vendor: Xen

affects: ubuntu-on-ec2 → linux-meta (Ubuntu)
Robert C Jennings (rcj) wrote :

Colin,

I read that you were considering [1] enabling it during t+1, and I wanted your opinion on enabling it in Trusty on EC2 images at this time. Also, what testing were you doing to evaluate this previously? Thanks.

[1] https://lists.ubuntu.com/archives/kernel-team/2014-April/040795.html

Brad Figg (brad-figg)
affects: linux-meta (Ubuntu) → linux (Ubuntu)
Brad Figg (brad-figg) wrote : Missing required logs.

This bug is missing log files that will aid in diagnosing the problem. From a terminal window please run:

apport-collect 1477171

and then change the status of the bug to 'Confirmed'.

If, due to the nature of the issue you have encountered, you are unable to run this command, please add a comment stating that fact and change the bug status to 'Confirmed'.

This change has been made by an automated script, maintained by the Ubuntu Kernel Team.

Changed in linux (Ubuntu):
status: New → Incomplete
tags: added: trusty
Craig Watcham (craig-watcham) wrote : apport information

Attachments: BootDmesg.txt, Lspci.txt, ProcCpuinfo.txt, ProcInterrupts.txt, ProcModules.txt, UdevDb.txt, UdevLog.txt, WifiSyslog.txt

tags: added: apport-collected ec2-images
description: updated

Changed in linux (Ubuntu):
status: Incomplete → Confirmed
Changed in linux (Ubuntu):
importance: Undecided → Medium
tags: added: kernel-da-key
Changed in linux (Ubuntu):
status: Confirmed → In Progress
assignee: nobody → Colin Ian King (colin-king)
Colin Ian King (colin-king) wrote :

Enabling intel-pstate on the 3.13 kernel was deemed problematic for several reasons. However, I have backported all of the changes up to 4.2-rc8 to 3.13 and re-enabled the intel-pstate driver for testing. I have performed a boot smoke test with this kernel, and I think it needs some deeper testing to see if it resolves your issues.

The kernel packages can be found in:

http://kernel.ubuntu.com/~cking/intel-pstate

Please can you see if these help.

Colin Ian King (colin-king) wrote :

Ping, any updates on this?

Craig Watcham (craig-watcham) wrote :

We have tested the patched kernel on one network-intensive (high packet rate) workload with max_cstate=1 and the ondemand governor disabled, and it has lower network throughput than the default kernel had with intel_pstate=enable. There also appears to be a futex performance issue on the c4.8xlarge, which we are still investigating and which may be contributing to the problem.
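
To confirm which of the parameters mentioned in this thread a given boot actually used, the kernel command line can be inspected. The Python sketch below is illustrative only: `parse_cmdline` is a hypothetical helper, and the sample string is the ProcKernelCmdLine value from the description with `intel_pstate=enable` appended as an assumed example. On a live system the string would come from `/proc/cmdline`.

```python
def parse_cmdline(cmdline):
    """Split a kernel command line into a dict of parameters.
    Flags without '=' map to None; repeated keys keep the last value."""
    params = {}
    for token in cmdline.split():
        # Split on the first '=' only, so values like UUID=... stay intact.
        key, sep, value = token.partition("=")
        params[key] = value if sep else None
    return params

# ProcKernelCmdLine from the report, plus an assumed intel_pstate=enable.
sample = (
    "BOOT_IMAGE=/boot/vmlinuz-3.13.0-61-generic "
    "root=UUID=877c5daf-9427-49e3-8656-437c3f30fc84 ro "
    "console=tty1 console=ttyS0 intel_pstate=enable"
)
params = parse_cmdline(sample)
print(params.get("intel_pstate"))           # set in this sample
print(params.get("intel_idle.max_cstate"))  # absent in this sample
```

The simple whitespace split does not handle quoted values containing spaces, which is acceptable for the parameters discussed here.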

Doug Smythies (dsmythies) wrote :

Isn't there another issue within this bug report? Why doesn't the acpi-cpufreq frequency scaling driver support the higher available clock frequencies?

Actually, maybe it does. Observe processor 3 from the cpuinfo listing:

processor : 3
...
cpu MHz : 2901.000

It is in turbo mode.
Regardless of what other tools say, what does turbostat report for CPU frequencies when using the acpi-cpufreq frequency scaling driver?
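
The 2901 MHz reading is consistent with a convention commonly seen with acpi-cpufreq, where the turbo range is exposed as a single extra step 1 MHz above the nominal maximum. That would also explain why 2.90 GHz appears twice in the frequency steps listed in the description. A small Python sketch of that check follows, assuming frequencies in kHz as the sysfs file `scaling_available_frequencies` reports them; the helper name and the sample values are illustrative.

```python
def has_turbo_step(freqs_khz):
    """Return True if the highest step sits exactly 1 MHz above the next one,
    the marker commonly used to represent the turbo range."""
    steps = sorted(set(freqs_khz), reverse=True)
    return len(steps) >= 2 and steps[0] - steps[1] == 1000

# Assumed kHz values for the first steps in the report: cpupower prints
# 2.90 GHz twice, taken here to mean 2901 MHz followed by 2900 MHz.
print(has_turbo_step([2901000, 2900000, 2800000, 2700000]))
print(has_turbo_step([2900000, 2800000, 2700000]))
```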

Craig Watcham (craig-watcham) wrote :

Agreed that this is an issue with acpi-cpufreq.

turbostat is broken on Trusty:
$ sudo turbostat
/dev/cpu/0/msr offset 0x641 read failed

But using Brendan Gregg's msr-cloud-tools [1] it seems we are actually running at max frequency (with intel_idle.max_cstate=0 processor.max_cstate=0 but without intel_pstate):

$ sudo ./showboost
CPU MHz : 2901
Turbo MHz : 3200 (10 active)
Turbo Ratio : 110% (10 active)
CPU 0 summary every 5 seconds...

TIME C0_MCYC C0_ACYC UTIL RATIO MHz
09:04:00 6554116 7236499 0% 110% 3203
09:04:05 4712681 5202133 0% 110% 3202
09:04:10 4209132 4646582 0% 110% 3202
09:04:15 4560950 5038256 0% 110% 3204

While cpupower still shows us running at 2901:
$ sudo cpupower frequency-info
analyzing CPU 0:
  driver: acpi-cpufreq
  CPUs which run at the same hardware frequency: 0
  CPUs which need to have their frequency coordinated by software: 0
  maximum transition latency: 10.0 us.
  hardware limits: 1.20 GHz - 2.90 GHz
  available frequency steps: 2.90 GHz, 2.90 GHz, 2.80 GHz, 2.70 GHz, 2.50 GHz, 2.40 GHz, 2.30 GHz, 2.20 GHz, 2.00 GHz, 1.90 GHz, 1.80 GHz, 1.70 GHz, 1.60 GHz, 1.40 GHz, 1.30 GHz, 1.20 GHz
  available cpufreq governors: conservative, ondemand, userspace, powersave, performance
  current policy: frequency should be within 1.20 GHz and 2.90 GHz.
                  The governor "performance" may decide which speed to use
                  within this range.
  current CPU frequency is 2.90 GHz (asserted by call to hardware).
  cpufreq stats: 2.90 GHz:100.00%, 2.90 GHz:0.00%, 2.80 GHz:0.00%, 2.70 GHz:0.00%, 2.50 GHz:0.00%, 2.40 GHz:0.00%, 2.30 GHz:0.00%, 2.20 GHz:0.00%, 2.00 GHz:0.00%, 1.90 GHz:0.00%, 1.80 GHz:0.00%, 1.70 GHz:0.00%, 1.60 GHz:0.00%, 1.40 GHz:0.00%, 1.30 GHz:0.00%, 1.20 GHz:0.00%
  boost state support:
    Supported: yes
    Active: yes

[1] https://github.com/brendangregg/msr-cloud-tools
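
The MHz column in the showboost samples above can be reproduced from the other columns. Assuming C0_MCYC and C0_ACYC are deltas of the MPERF and APERF counters over each interval, their ratio scales the 2901 MHz base frequency. The function name below is illustrative, and truncating to whole MHz is an assumption chosen because it matches the table.

```python
def effective_mhz(base_mhz, mcyc, acyc):
    """Scale the base frequency by the APERF/MPERF delta ratio,
    truncated to whole MHz using integer arithmetic."""
    return (base_mhz * acyc) // mcyc

# (C0_MCYC, C0_ACYC) pairs copied from the showboost output in this comment.
samples = [
    (6554116, 7236499),
    (4712681, 5202133),
    (4209132, 4646582),
    (4560950, 5038256),
]
for mcyc, acyc in samples:
    print(effective_mhz(2901, mcyc, acyc))
```

For the four samples this yields 3203, 3202, 3202 and 3204, the same values showboost printed, which supports the reading that the cores were running above the 2.90 GHz limit cpupower reports.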

Changed in linux (Ubuntu):
assignee: Colin Ian King (colin-king) → nobody