Hung tasks on UEC cloud images with EBS volumes
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
linux (Ubuntu) | Confirmed | Undecided | Unassigned |
Bug Description
On the UEC images on Amazon, from time to time people see "task blocked for more than 120 seconds" messages. From previous experience with Amazon, EXT4 file systems are prone to these messages due to the way the virtual (/dev/xvd*) EBS disks are presented to the DomU and EXT4's delayed commits. EBS volumes are presented to DomUs as physical disks attached in Dom0; the actual disk is a network device. During periods of high I/O, flushing of dirty pages to the network-backed disk can take long enough to trigger the hung-task warnings.
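For reference, the warnings in question follow the standard kernel hung-task format. A minimal sketch of what to look for (the task name and PID below are illustrative, not taken from an affected host):

```shell
# Sample of the kernel's hung-task warning format; the task name/PID
# here ("flush-202:1", 374) are made-up examples.
sample='INFO: task flush-202:1:374 blocked for more than 120 seconds.'

# The same pattern used against a live system would be:
#   dmesg | grep 'blocked for more than 120 seconds'
echo "$sample" | grep -q 'blocked for more than 120 seconds' && echo "hung task seen"
```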
Generally, this affects m2.* and cc1.4xlarge instance types (the expensive premium instances). Adjusting the "vm.dirty_*" settings appears to mitigate the problem.
This bug has been filed to see about getting guidance for the community from the kernel team on tuning of vm.dirty* settings to prevent hung tasks.
vm.dirty_
vm.dirty_
vm.dirty_ratio = 20
vm.dirty_bytes = 0
vm.dirty_
vm.dirty_
vm.drop_caches = 0
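To see why the percentage-based defaults bite hardest on large-memory instances, here is a rough sketch of how they translate into bytes of dirty data. The memory size and the background ratio below are assumptions for illustration (roughly an m2.4xlarge), not measurements from the affected hosts:

```shell
# Sketch: convert the ratio-based dirty limits into megabytes for a
# large-memory instance. All inputs are assumed example values.
mem_kb=71680000          # ~68 GiB of RAM, roughly an m2.4xlarge (assumed)
dirty_ratio=20           # vm.dirty_ratio, per the listing above
background_ratio=10      # assumed vm.dirty_background_ratio

background_mb=$(( mem_kb * background_ratio / 100 / 1024 ))
dirty_limit_mb=$(( mem_kb * dirty_ratio / 100 / 1024 ))

echo "background writeback starts at ~${background_mb} MiB of dirty pages"
echo "writers are throttled at ~${dirty_limit_mb} MiB of dirty pages"
```

With numbers like these, several gigabytes of dirty pages can accumulate before writeback even starts, and flushing them over a network-backed EBS device can easily exceed the 120-second hung-task timeout.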
---
ProblemType: Bug
DistroRelease: Ubuntu 11.04
Package: linux-image-virtual 2.6.38.8.22
ProcVersionSign
Uname: Linux 2.6.38-8-virtual x86_64
AlsaDevices:
total 0
crw------- 1 root root 116, 1 2011-07-06 21:41 seq
crw------- 1 root root 116, 33 2011-07-06 21:41 timer
AplayDevices: Error: [Errno 2] No such file or directory
Architecture: amd64
ArecordDevices: Error: [Errno 2] No such file or directory
CurrentDmesg: [ 21.600011] eth0: no IPv6 routers present
Date: Mon Jul 11 16:01:31 2011
Ec2AMI: ami-6463980d
Ec2AMIManifest: (unknown)
Ec2Availability
Ec2InstanceType: t1.micro
Ec2Kernel: aki-427d952b
Ec2Ramdisk: unavailable
Lspci:
Lsusb: Error: command ['lsusb'] failed with exit code 1:
ProcEnviron:
LANG=en_US.UTF-8
SHELL=/bin/bash
ProcKernelCmdLine: root=LABEL=
ProcModules: acpiphp 24097 0 - Live 0x0000000000000000
SourcePackage: linux
UpgradeStatus: No upgrade log present (probably fresh install)
---
AlsaDevices:
total 0
crw------- 1 root root 116, 1 2011-07-25 20:32 seq
crw------- 1 root root 116, 33 2011-07-25 20:32 timer
AplayDevices: Error: [Errno 2] No such file or directory
Architecture: amd64
ArecordDevices: Error: [Errno 2] No such file or directory
CurrentDmesg:
DistroRelease: Ubuntu 11.04
Ec2AMI: ami-c55b9cac
Ec2AMIManifest: (unknown)
Ec2Availability
Ec2InstanceType: t1.micro
Ec2Kernel: aki-427d952b
Ec2Ramdisk: unavailable
Lspci:
Lsusb: Error: command ['lsusb'] failed with exit code 1:
Package: linux (not installed)
ProcEnviron:
LANG=en_US.UTF-8
SHELL=/bin/bash
ProcKernelCmdLine: root=LABEL=
ProcModules: acpiphp 24097 0 - Live 0x0000000000000000
ProcVersionSign
Tags: natty ec2-images
Uname: Linux 2.6.38-10-virtual x86_64
UpgradeStatus: No upgrade log present (probably fresh install)
UserGroups: adm admin audio cdrom dialout dip floppy plugdev video
Generating I/O faster than the storage can cope with will generally catch up with the system at some point. I am not sure whether I missed it or there is actually nothing there, but the question is: which values have actually been tested? And maybe there is no single value that handles both large-memory and small-memory systems well.
Otherwise, yes, it could be worth adjusting dirty_background_ratio downwards (to start background writeout sooner, though if the percentage is too small on small systems, writes potentially get more fragmented) and moving dirty_ratio up (to give a bigger window before processes start waiting on flushed I/O).
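A minimal sketch of the tuning direction described above: lower the background threshold, raise the foreground one. The specific values (5 and 40) are illustrative assumptions, not tested recommendations, and the file is staged locally so the sketch runs without root:

```shell
# Sketch of the suggested direction: earlier background writeout,
# bigger foreground window. Values 5/40 are assumed examples only.
conf=./sysctl-ebs.conf   # stage locally; install to /etc/sysctl.d/ as root

cat > "$conf" <<'EOF'
vm.dirty_background_ratio = 5
vm.dirty_ratio = 40
EOF

# To apply immediately (requires root):
#   sudo sysctl -p ./sysctl-ebs.conf
cat "$conf"
```

Whether these values help without side effects (e.g. more fragmented writes on small-memory instances, as noted above) is exactly what this bug asks the kernel team to weigh in on.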