[SRU] Ubuntu instances on GCE should use NOOP scheduler

Bug #1420544 reported by Ben Howard
This bug affects 1 person
Affects            Status        Importance  Assigned to  Milestone
linux (Ubuntu)     Confirmed     Medium      Unassigned
systemd (Ubuntu)   Won't Fix     Undecided   Unassigned
  Trusty           Fix Released  Medium      Unassigned

Bug Description

[IMPACT] By default, the Ubuntu kernel uses the deadline I/O scheduler. Google has found that cloud workloads running on Ubuntu perform better with NOOP as the default scheduler, and has requested that Google Compute Engine (GCE) instances use NOOP by default.

[FIX] Add udev rule for GCE devices to use NOOP by default.

[VALIDATION]
1. Boot Ubuntu instance on Google GCE
2. Confirm that the scheduler is deadline:
   $ cat /sys/block/sda/queue/scheduler
   noop [deadline] cfq
3. Install proposed udev package
4. Reboot
5. Confirm that the scheduler is now noop:
   $ cat /sys/block/sda/queue/scheduler
   [noop] deadline cfq
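
The check in steps 2 and 5 can also be scripted. The following is a minimal Python sketch, not part of the proposed fix. It assumes the sysfs path used in the steps above, and the function names active_scheduler and read_active_scheduler are made up for illustration. The kernel marks the scheduler in use with square brackets, so the parser only extracts the bracketed name.

```python
import re
from pathlib import Path


def active_scheduler(line: str) -> str:
    """Return the scheduler name shown in square brackets."""
    match = re.search(r"\[([^\]]+)\]", line)
    if match is None:
        raise ValueError(f"no active scheduler marked in: {line!r}")
    return match.group(1)


def read_active_scheduler(device: str = "sda") -> str:
    """Read the active scheduler for a block device from sysfs.

    The default device name "sda" is an assumption taken from the
    validation steps; pass another name for other disks.
    """
    path = Path("/sys/block") / device / "queue" / "scheduler"
    return active_scheduler(path.read_text())


if __name__ == "__main__":
    # The two sample lines from the validation steps.
    print(active_scheduler("noop [deadline] cfq"))
    print(active_scheduler("[noop] deadline cfq"))
```

On a GCE instance, calling read_active_scheduler("sda") before and after installing the proposed package should mirror the two manual checks.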

[RISK] This patch affects currently running instances: after a reboot they will use NOOP and should see better performance. However, there is a risk that some users will experience a performance hit.

[ORIGINAL REPORT]

Per Google's request, Ubuntu instances should use NOOP as the default scheduler.

Ben Howard (darkmuggle-deactivatedaccount) wrote :

Since we are going to start rolling in all the cloud image secret sauce via udev, I've added a new rule set to start with.

This change is confirmed to work.

Ubuntu Foundations Team Bug Bot (crichton) wrote :

Thank you for taking the time to report this bug and helping to make Ubuntu better. It seems that your bug report is not filed about a specific source package though, rather it is just filed against Ubuntu in general. It is important that bug reports be filed about source packages so that people interested in the package can find the bugs about it. You can find some hints about determining what package your bug might be about at https://wiki.ubuntu.com/Bugs/FindRightPackage. You might also ask for help in the #ubuntu-bugs irc channel on Freenode.

To change the source package that this bug is filed about visit https://bugs.launchpad.net/ubuntu/+bug/1420544/+editstatus and add the package name in the text box next to the word Package.

[This is an automated message. I apologize if it reached you inappropriately; please just reply to this message indicating so.]

tags: added: patch
tags: added: bot-comment
Martin Pitt (pitti) wrote :

This is an unconditional rule which will just second-guess kernel defaults everywhere, and it will not go to upstream either. Wouldn't it make more sense to fix whichever kernel driver is responsible for this device (or possibly QEMU) to pick a scheduler which is appropriate for these virtual drives, instead of picking the wrong one and then requiring userspace to clean up behind it? This udev rule is ok for an SRU, but not for a long-term solution.

Also, the "*Google*" match is too weak. What if I attach an Android device in UMS mode? Such devices should certainly not deviate from the kernel defaults.

Can you please give some rationale why the scheduler needs to be changed? What is "GCE"?

Thanks!

Changed in ubuntu:
status: Confirmed → Incomplete
Ben Howard (darkmuggle-deactivatedaccount) wrote :

GCE is "Google Compute Engine" which is their cloud virtual machine offering.

The rationale for the change is that Ubuntu by default uses deadline for 14.04+. Google has requested that we switch over to noop for performance reasons.

Would pegging the rule like so work better?
SUBSYSTEM=="block", ACTION=="add|change", ENV{ID_VENDOR}=="*Google*", ENV{ID_PATH}=="*virtio-pci*", ATTR{queue/scheduler}="noop"

This rule should only match when the device both reports Google as its vendor and is attached over the VirtIO channel.
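
To see why the extra ID_PATH match narrows the rule, here is a rough Python sketch that imitates the glob comparison against device properties. It only approximates udev's matching: rule_matches, LOOSE_RULE and the USB device entry are hypothetical, the ACTION key is left out, and the GCE values are copied from the udevadm output for /dev/sda in this comment.

```python
from fnmatch import fnmatchcase

# Match keys of the tightened rule (ACTION omitted in this sketch).
TIGHT_RULE = {
    "SUBSYSTEM": "block",
    "ID_VENDOR": "*Google*",
    "ID_PATH": "*virtio-pci*",
}

# Assumed shape of a vendor-only rule, for comparison.
LOOSE_RULE = {
    "SUBSYSTEM": "block",
    "ID_VENDOR": "*Google*",
}

# Properties of the GCE persistent disk, as reported by udevadm.
GCE_DISK = {
    "SUBSYSTEM": "block",
    "ID_VENDOR": "Google",
    "ID_PATH": "pci-0000:00:03.0-virtio-pci-virtio0-scsi-0:0:1:0",
}

# Hypothetical USB mass-storage phone; these values are invented.
USB_PHONE = {
    "SUBSYSTEM": "block",
    "ID_VENDOR": "Google",
    "ID_PATH": "pci-0000:00:14.0-usb-0:2:1.0-scsi-0:0:0:0",
}


def rule_matches(props: dict, rule: dict) -> bool:
    """True if every rule pattern glob-matches the device property."""
    return all(
        fnmatchcase(props.get(key, ""), pattern)
        for key, pattern in rule.items()
    )


if __name__ == "__main__":
    for name, dev in (("GCE disk", GCE_DISK), ("USB phone", USB_PHONE)):
        print(name, rule_matches(dev, LOOSE_RULE), rule_matches(dev, TIGHT_RULE))
```

Under these assumptions the vendor-only rule would also catch the USB device, while the rule with the ID_PATH match applies to the virtio disk only.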

The device is presented as:
$ sudo udevadm info /dev/sda
P: /devices/pci0000:00/0000:00:03.0/virtio0/host0/target0:0:1/0:0:1:0/block/sda
N: sda
S: disk/by-id/google-persistent-disk-0
S: disk/by-id/scsi-0Google_PersistentDisk_persistent-disk-0
S: disk/by-path/pci-0000:00:03.0-virtio-pci-virtio0-scsi-0:0:1:0
E: DEVLINKS=/dev/disk/by-id/google-persistent-disk-0 /dev/disk/by-id/scsi-0Google_PersistentDisk_persistent-disk-0 /dev/disk/by-path/pci-0000:00:03.0-virtio-pci-virtio0-scsi-0:0:1:0
E: DEVNAME=/dev/sda
E: DEVPATH=/devices/pci0000:00/0000:00:03.0/virtio0/host0/target0:0:1/0:0:1:0/block/sda
E: DEVTYPE=disk
E: ID_BUS=scsi
E: ID_MODEL=PersistentDisk
E: ID_MODEL_ENC=PersistentDisk\x20\x20
E: ID_PART_TABLE_TYPE=dos
E: ID_PATH=pci-0000:00:03.0-virtio-pci-virtio0-scsi-0:0:1:0
E: ID_PATH_TAG=pci-0000_00_03_0-virtio-pci-virtio0-scsi-0_0_1_0
E: ID_REVISION=1
E: ID_SCSI=1
E: ID_SERIAL=0Google_PersistentDisk_persistent-disk-0
E: ID_SERIAL_SHORT=persistent-disk-0
E: ID_TYPE=disk
E: ID_VENDOR=Google
E: ID_VENDOR_ENC=Google\x20\x20
E: MAJOR=8
E: MINOR=0
E: SUBSYSTEM=block
E: USEC_INITIALIZED=916

$ sudo udevadm info /dev/sdb
P: /devices/pci0000:00/0000:00:05.0/virtio1/host1/target1:0:1/1:0:1:0/block/sdb
N: sdb
S: disk/by-id/google-local-ssd-1
S: disk/by-id/scsi-0Google_EphemeralDisk_local-ssd-1
S: disk/by-path/pci-0000:00:05.0-virtio-pci-virtio1-scsi-0:0:1:0
E: DEVLINKS=/dev/disk/by-id/google-local-ssd-1 /dev/disk/by-id/scsi-0Google_EphemeralDisk_local-ssd-1 /dev/disk/by-path/pci-0000:00:05.0-virtio-pci-virtio1-scsi-0:0:1:0
E: DEVNAME=/dev/sdb
E: DEVPATH=/devices/pci0000:00/0000:00:05.0/virtio1/host1/target1:0:1/1:0:1:0/block/sdb
E: DEVTYPE=disk
E: ID_BUS=scsi
E: ID_MODEL=EphemeralDisk
E: ID_MODEL_ENC=EphemeralDisk\x20\x20\x20
E: ID_PATH=pci-0000:00:05.0-virtio-pci-virtio1-scsi-0:0:1:0
E: ID_PATH_TAG=pci-0000_00_05_0-virtio-pci-virtio1-scsi-0_0_1_0
E: ID_REVISION=1
E: ID_SCSI=1
E: ID_SERIAL=0Google_EphemeralDisk_local-ssd-1
E: ID_SERIAL_SHORT=local-ssd-1
E: ID_TYPE=disk
E: ID_VENDOR=Google
E: ID_VENDOR_ENC=Google\x20\x20
E: MAJOR=8
E: MINOR=16
E: SUBSYSTEM=block
E: USEC_INITIALIZED=59956
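
The E: lines in the two listings above are the device properties that ENV{...} keys are compared against. As a small aid for inspecting them, the following Python sketch collects those lines into a dictionary. The function name parse_udevadm_info is made up for illustration, and the sample text is an abbreviated copy of the /dev/sda listing.

```python
def parse_udevadm_info(text: str) -> dict:
    """Collect the 'E: KEY=VALUE' lines of `udevadm info` output."""
    props = {}
    for line in text.splitlines():
        if line.startswith("E: "):
            # Split on the first '=' only; values may contain '='.
            key, _, value = line[3:].partition("=")
            props[key] = value
    return props


# Abbreviated copy of the /dev/sda listing above.
SAMPLE = """\
P: /devices/pci0000:00/0000:00:03.0/virtio0/host0/target0:0:1/0:0:1:0/block/sda
N: sda
E: DEVNAME=/dev/sda
E: ID_PATH=pci-0000:00:03.0-virtio-pci-virtio0-scsi-0:0:1:0
E: ID_VENDOR=Google
E: SUBSYSTEM=block
"""

if __name__ == "__main__":
    props = parse_udevadm_info(SAMPLE)
    print(props["ID_VENDOR"], props["ID_PATH"])
```

Both disks report ID_VENDOR=Google and an ID_PATH containing virtio-pci, which are the two properties the proposed rule keys on.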

Martin Pitt (pitti) wrote :

That's a bit better indeed. But I still wonder why this cannot live in cloud-init or another cloud-specific package. This is a rule that we basically have to keep forever, and it is never going to apply on desktops or touch (where it will just marginally slow things down). And I'm not in a position to determine when this rule needs to be updated either.

Even better would of course be to change the virtio kernel driver to use the noop scheduler by default, if that's indeed better than deadline; this might then also benefit other cloud environments where the reason is presumably the same?

Ben Howard (darkmuggle-deactivatedaccount) wrote :

+apw for comment on #5

Changing the virtio driver makes sense to me, however, I wonder if a wholesale change might be bad.

Martin Pitt (pitti) wrote :

The updated udev rule is of course fine for an SRU. Marked for trusty for now, please add more releases as desired. Keeping the linux task for vivid onwards to change the default in the virtio kernel driver itself (pending feedback from the kernel team).

affects: ubuntu → linux (Ubuntu)
Changed in systemd (Ubuntu):
status: New → Incomplete
Changed in systemd (Ubuntu Trusty):
status: New → Triaged
no longer affects: linux (Ubuntu Trusty)
Martin Pitt (pitti) wrote :

@Ben Howard: I'll upload that to trusty as soon as the current SRU gets verified and into trusty-updates.

Changed in systemd (Ubuntu Trusty):
assignee: nobody → Martin Pitt (pitti)
importance: Undecided → Medium
Martin Pitt (pitti) wrote :

I uploaded a trusty update for this to the SRU review queue. Please test the -proposed package once it is available to verify this. Thanks!

Changed in systemd (Ubuntu Trusty):
assignee: Martin Pitt (pitti) → nobody
status: Triaged → In Progress
Brian Murray (brian-murray) wrote : Missing SRU information

Thanks for uploading the fix for this bug report to -proposed. However, when reviewing the package in -proposed and the details of this bug report I noticed that the bug description is missing information required for the SRU process. You can find full details at http://wiki.ubuntu.com/StableReleaseUpdates#Procedure but essentially this bug is missing some of the following: a statement of impact, a test case and details regarding the regression potential. Thanks in advance!

tags: added: hw-specific
summary: - Ubuntu instances on GCE should use NOOP scheduler
+ [SRU] Ubuntu instances on GCE should use NOOP scheduler
description: updated
Ben Howard (darkmuggle-deactivatedaccount) wrote :

Sorry about missing the SRU info. Added.

Brian Murray (brian-murray) wrote : Please test proposed package

Hello Ben, or anyone else affected,

Accepted systemd into trusty-proposed. The package will build now and be available at https://launchpad.net/ubuntu/+source/systemd/204-5ubuntu20.11 in a few hours, and then in the -proposed repository.

Please help us by testing this new package. See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Your feedback will aid us in getting this update out to other Ubuntu users.

If this package fixes the bug for you, please add a comment to this bug, mentioning the version of the package you tested, and change the tag from verification-needed to verification-done. If it does not fix the bug for you, please add a comment stating that, and change the tag to verification-failed. In either case, details of your testing will help us make a better decision.

Further information regarding the verification process can be found at https://wiki.ubuntu.com/QATeam/PerformingSRUVerification . Thank you in advance!

Changed in systemd (Ubuntu Trusty):
status: In Progress → Fix Committed
tags: added: verification-needed
Ben Howard (darkmuggle-deactivatedaccount) wrote :

Confirmed the fix with the -proposed package. Marking as verification-done.

tags: added: verification-done
removed: verification-needed
Martin Pitt (pitti) wrote :

Updating status for vivid tentatively, until the kernel team can comment.

Changed in systemd (Ubuntu):
status: Incomplete → Won't Fix
Changed in linux (Ubuntu):
status: Incomplete → New
Brad Figg (brad-figg) wrote : Missing required logs.

This bug is missing log files that will aid in diagnosing the problem. From a terminal window please run:

apport-collect 1420544

and then change the status of the bug to 'Confirmed'.

If, due to the nature of the issue you have encountered, you are unable to run this command, please add a comment stating that fact and change the bug status to 'Confirmed'.

This change has been made by an automated script, maintained by the Ubuntu Kernel Team.

Changed in linux (Ubuntu):
status: New → Incomplete
Martin Pitt (pitti)
Changed in linux (Ubuntu):
status: Incomplete → Confirmed
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package systemd - 204-5ubuntu20.11

---------------
systemd (204-5ubuntu20.11) trusty; urgency=medium

  [ Ben Howard ]
  * Add debian/extra/rules/62-google-cloudimg.rules: Use "noop" scheduler for
    Google virtio drives. (LP: #1420544)

  [ Martin Pitt ]
  * Add upstream-ignore-mmcrpmb.patch: Fix /dev/disk/by-path/ symlink of mmc
    RPMB partitions and don't blkid them to avoid kernel buffer I/O errors and
    timeouts. (LP: #1333140)
 -- Martin Pitt <email address hidden> Wed, 18 Feb 2015 12:11:49 +0100

Changed in systemd (Ubuntu Trusty):
status: Fix Committed → Fix Released
Adam Conrad (adconrad) wrote : Update Released

The verification of the Stable Release Update for systemd has completed successfully and the package has now been released to -updates. Subsequently, the Ubuntu Stable Release Updates Team is being unsubscribed and will not receive messages about this bug report. In the event that you encounter a regression using the package from -updates please report a new bug using ubuntu-bug and tag the bug report regression-update so we can easily find any regressions.

Changed in linux (Ubuntu):
assignee: Ben Howard (utlemming) → nobody