Bug with IO scheduler in kernel

Bug #555067 reported by Christian Roessner
This bug affects 3 people
Affects: linux (Ubuntu) | Status: Expired | Importance: Undecided | Assigned to: Unassigned

Bug Description

I use LVM for KVM. When starting the guests, the kernel prints lots of error messages and the guests do not really start. Everything worked on Karmic; the problem was introduced with Lucid.

I noticed that if I boot the system with init=/bin/bash and run a vgscan followed by vgchange -ay, it takes extremely long to complete. Sometimes not all LVs appear as devices, even though lvs lists them, so I also have to run lvscan and lvchange on individual LVs.
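For reference, the manual activation sequence described above can be sketched as a shell function. It assumes the LVM2 tools and root privileges, so it is only defined here, not run; the LV name vg0/guest1 is a placeholder, not from the original report.

```shell
# Sketch of the manual LV activation workaround (assumes LVM2 tools + root).
# Wrapped in a function so nothing executes on definition; "vg0/guest1" is a
# placeholder LV name.
activate_lvs() {
    vgscan                   # rescan block devices for volume groups
    vgchange -ay             # activate all LVs in all VGs (extremely slow here)
    lvs                      # list LVs; some may still lack device nodes
    lvscan                   # rescan logical volumes
    lvchange -ay vg0/guest1  # activate a single remaining LV by name
}
```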

At the moment there is no way to start the KVM guests at all.

ProblemType: Bug
DistroRelease: Ubuntu 10.04
Package: linux-image (not installed)
Regression: Yes
Reproducible: Yes
ProcVersionSignature: Ubuntu 2.6.32-19.28-generic 2.6.32.10+drm33.1
Uname: Linux 2.6.32-19-generic x86_64
AlsaVersion: Advanced Linux Sound Architecture Driver Version 1.0.21.
AplayDevices: Error: [Errno 2] No such file or directory
Architecture: amd64
ArecordDevices: Error: [Errno 2] No such file or directory
AudioDevicesInUse: Error: command ['fuser', '-v', '/dev/snd/by-path', '/dev/snd/controlC0', '/dev/snd/hwC0D0', '/dev/snd/pcmC0D3p', '/dev/snd/timer'] failed with exit code 1:
CRDA: Error: [Errno 2] No such file or directory
Card0.Amixer.info: Error: [Errno 2] No such file or directory
Card0.Amixer.values: Error: [Errno 2] No such file or directory
Date: Sun Apr 4 12:04:23 2010
HibernationDevice: RESUME=UUID=1877fe70-89e5-4d1f-9d66-8efab73615ec
MachineType: MICRO-STAR INTERANTIONAL CO.,LTD MS-7368
ProcCmdLine: root=UUID=0d27271c-feaa-40d9-bbbd-baff4ca1d3cc ro vga=normal elevator=anticipatory
ProcEnviron:
 LANG=de_DE.UTF-8
 SHELL=/bin/bash
RelatedPackageVersions: linux-firmware 1.33
RfKill:

SourcePackage: linux
dmi.bios.date: 10/31/2007
dmi.bios.vendor: American Megatrends Inc.
dmi.bios.version: V1.5B2
dmi.board.asset.tag: To Be Filled By O.E.M.
dmi.board.name: MS-7368
dmi.board.vendor: MICRO-STAR INTERANTIONAL CO.,LTD
dmi.board.version: 1.0
dmi.chassis.asset.tag: To Be Filled By O.E.M.
dmi.chassis.type: 3
dmi.chassis.vendor: To Be Filled By O.E.M.
dmi.chassis.version: To Be Filled By O.E.M.
dmi.modalias: dmi:bvnAmericanMegatrendsInc.:bvrV1.5B2:bd10/31/2007:svnMICRO-STARINTERANTIONALCO.,LTD:pnMS-7368:pvr1.0:rvnMICRO-STARINTERANTIONALCO.,LTD:rnMS-7368:rvr1.0:cvnToBeFilledByO.E.M.:ct3:cvrToBeFilledByO.E.M.:
dmi.product.name: MS-7368
dmi.product.version: 1.0
dmi.sys.vendor: MICRO-STAR INTERANTIONAL CO.,LTD

Revision history for this message
Christian Roessner (christian-roessner-net) wrote :

I remembered one thing I changed some days ago: switching the default I/O scheduler from cfq to anticipatory. With the latter, it was impossible to resync the software RAID1 md3, as you can see in the dmesg logs. I changed it back to the default and waited for the RAID to be synced again, then started the KVM guests again, but I still get lots of kernel messages:
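For anyone reproducing this: the active scheduler for a disk is the bracketed entry in /sys/block/<dev>/queue/scheduler, and it can be switched per device at runtime. The device name sda and the scheduler list below are just examples for a 2.6.32-era kernel, not taken from this machine.

```shell
# Sample contents of /sys/block/sda/queue/scheduler on a 2.6.32-era kernel;
# on a live system, read the sysfs file itself instead of this sample string.
line="noop anticipatory deadline [cfq]"

# The bracketed entry is the scheduler currently in use; extract it.
current=$(printf '%s\n' "$line" | sed 's/.*\[\([^]]*\)\].*/\1/')
echo "active scheduler: $current"

# To switch a device back to cfq at runtime (as root):
#   echo cfq > /sys/block/sda/queue/scheduler
# To make it persistent, boot with elevator=cfq on the kernel command line
# instead of elevator=anticipatory.
```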

See:

[ 248.800024] Clocksource tsc unstable (delta = -270012333 ns)
[ 6720.520038] INFO: task flush-9:2:454 blocked for more than 120 seconds.
[ 6720.524331] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 6720.530099] flush-9:2 D 0000000000000000 0 454 2 0x00000000
[ 6720.530109] ffff8801938578d0 0000000000000046 0000000000015b80 0000000000015b80
[ 6720.530118] ffff880193859ab0 ffff880193857fd8 0000000000015b80 ffff8801938596f0
[ 6720.530126] 0000000000015b80 ffff880193857fd8 0000000000015b80 ffff880193859ab0
[ 6720.530134] Call Trace:
[ 6720.530151] [<ffffffff8116c730>] ? sync_buffer+0x0/0x50
[ 6720.530161] [<ffffffff8153e697>] io_schedule+0x47/0x70
[ 6720.530168] [<ffffffff8116c775>] sync_buffer+0x45/0x50
[ 6720.530175] [<ffffffff8153ed9a>] __wait_on_bit_lock+0x5a/0xc0
[ 6720.530182] [<ffffffff8116c730>] ? sync_buffer+0x0/0x50
[ 6720.530189] [<ffffffff8116cb20>] ? end_buffer_async_write+0x0/0x180
[ 6720.530196] [<ffffffff8153ee78>] out_of_line_wait_on_bit_lock+0x78/0x90
[ 6720.530205] [<ffffffff81085340>] ? wake_bit_function+0x0/0x40
[ 6720.530212] [<ffffffff8116c8f6>] __lock_buffer+0x36/0x40
[ 6720.530219] [<ffffffff8116d644>] __block_write_full_page+0x374/0x3a0
[ 6720.530227] [<ffffffff810f39e7>] ? unlock_page+0x27/0x30
[ 6720.530234] [<ffffffff8116cb20>] ? end_buffer_async_write+0x0/0x180
[ 6720.530241] [<ffffffff8116cb20>] ? end_buffer_async_write+0x0/0x180
[ 6720.530249] [<ffffffff8116dfd0>] block_write_full_page_endio+0xe0/0x120
[ 6720.530256] [<ffffffff8116cb20>] ? end_buffer_async_write+0x0/0x180
[ 6720.530263] [<ffffffff8116e025>] block_write_full_page+0x15/0x20
[ 6720.530271] [<ffffffff811b636d>] ext3_ordered_writepage+0x1dd/0x200
[ 6720.530279] [<ffffffff810fb907>] __writepage+0x17/0x40
[ 6720.530287] [<ffffffff810fcac7>] write_cache_pages+0x227/0x4d0
[ 6720.530294] [<ffffffff810fb8f0>] ? __writepage+0x0/0x40
[ 6720.530302] [<ffffffff810fcd94>] generic_writepages+0x24/0x30
[ 6720.530309] [<ffffffff810fcdd5>] do_writepages+0x35/0x40
[ 6720.530315] [<ffffffff81164b66>] writeback_single_inode+0xf6/0x3d0
[ 6720.530322] [<ffffffff811657d0>] writeback_inodes_wb+0x410/0x5e0
[ 6720.530328] [<ffffffff81165aaa>] wb_writeback+0x10a/0x1d0
[ 6720.530335] [<ffffffff81077895>] ? try_to_del_timer_sync+0x75/0xd0
[ 6720.530342] [<ffffffff8153eb7b>] ? schedule_timeout+0x19b/0x300
[ 6720.530348] [<ffffffff81165ddc>] wb_do_writeback+0x18c/0x1a0
[ 6720.530355] [<ffffffff81165e43>] bdi_writeback_task+0x53/0xe0
[ 6720.530363] [<ffffffff8110e726>] bdi_start_fn+0x86/0x100
[ 6720.530369] [<ffffffff8110e6a0>] ? bdi_start_fn+0x0/0x100
[ 6720.530375] [<ffffffff81084f86>] kthread+0x96/0xa0
[ 6720.530383] [<ffffffff810141ea>] child_rip+0xa/0x20
[ 6720.530389] [<ffffffff81084ef0>] ? kthread+0x0/0xa0
[ 6720.530395] [<fffffff...

summary: - Impossible to start KVM from LVM volumes
+ Bug with IO scheduler in kernel
Revision history for this message
Alvin (alvind) wrote :

I have this problem on Karmic. It happens from time to time, but not every day. Is there a way to reproduce this reliably?

Revision history for this message
Jeremy Foshee (jeremyfoshee) wrote :

Hi Christian,

If you could also please test the latest upstream kernel available that would be great. It will allow additional upstream developers to examine the issue. Refer to https://wiki.ubuntu.com/KernelMainlineBuilds . Once you've tested the upstream kernel, please remove the 'needs-upstream-testing' tag. This can be done by clicking on the yellow pencil icon next to the tag located at the bottom of the bug description and deleting the 'needs-upstream-testing' text. Please let us know your results.

Thanks in advance.

    [This is an automated message. Apologies if it has reached you inappropriately; please just reply to this message indicating so.]

tags: added: kj-triage
Changed in linux (Ubuntu):
status: New → Incomplete
Revision history for this message
Jeremy Foshee (jeremyfoshee) wrote :

This bug report was marked as Incomplete and has not had any updated comments for quite some time. As a result this bug is being closed. Please reopen if this is still an issue in the current Ubuntu release http://www.ubuntu.com/getubuntu/download . Also, please be sure to provide any requested information that may have been missing. To reopen the bug, click on the current status under the Status column and change the status back to "New". Thanks.

[This is an automated message. Apologies if it has reached you inappropriately; please just reply to this message indicating so.]

tags: added: kj-expired
Changed in linux (Ubuntu):
status: Incomplete → Expired
Revision history for this message
Christian Roessner (christian-roessner-net) wrote :

I installed the LTS with the server kernel and deactivated the mdadm --monitor feature, which runs every month, so the RAID is hopefully left unchecked and untouched. It is my production server, so I do not want to test a mainline kernel.

Maybe 5-6 KVM guests on a dual-core AMD is a little too much, at least for I/O, but that should not break the host kernel.
