Activity log for bug #335097

Date Who What changed Old value New value Message
2009-02-26 19:55:03 Etienne Goyer bug added bug
2009-02-26 19:55:03 Etienne Goyer bug added attachment 'dmesg-1.txt' (dmesg-1.txt)
2009-02-26 19:55:29 Etienne Goyer bug added attachment 'dmesg-2.txt' (dmesg-2.txt)
2009-02-26 19:57:06 Etienne Goyer bug added subscriber Stefan Bader
2009-02-26 19:58:04 Dustin Kirkland  linux: status New In Progress
2009-02-26 19:58:04 Dustin Kirkland  linux: assignee stefan-bader-canonical
2009-02-26 19:58:04 Dustin Kirkland  linux: importance Undecided High
2009-02-26 19:58:04 Dustin Kirkland  linux: statusexplanation
2009-02-26 20:11:53 Stefan Bader description When running a specific software validation suite in a KVM guest (both guest and host running hardy) for over 24 hours, the guest will eventually freeze and the host will have the following oops in dmesg:
[75243.174934] Unable to handle kernel paging request at 0000000000100100 RIP:
[75243.174947] [<ffffffff882cc545>] :kvm:kvm_mmu_slot_remove_write_access+0x55/0x70
[75243.174992] PGD 75072d067 PUD 76b738067 PMD 0
[75243.174997] Oops: 0000 [2] SMP
[75243.175001] CPU 4
[75243.175003] Modules linked in: tun bridge af_packet kqemu radeon drm rfcomm l2cap bluetooth kvm_intel kvm ppdev cpufreq_ondemand cpufreq_powersave cpufreq_conservative cpufreq_userspace cpufreq_stats freq_table sbs sbshc container video output dock battery iptable_filter ip_tables x_tables ac parport_pc lp parport ipv6 joydev serio_raw evdev psmouse pcspkr i2c_piix4 i2c_core button ext3 jbd mbcache sg sr_mod sd_mod cdrom ata_generic pata_acpi usbhid hid qla2xxx pata_serverworks scsi_transport_fc aacraid ehci_hcd libata scsi_tgt ohci_hcd tg3 scsi_mod usbcore thermal processor fan fbcon tileblit font bitblit softcursor fuse
[75243.175073] Pid: 7220, comm: kvm Tainted: G D 2.6.24-23-generic #1
[75243.175076] RIP: 0010:[<ffffffff882cc545>] [<ffffffff882cc545>] :kvm:kvm_mmu_slot_remove_write_access+0x55/0x70
[75243.175090] RSP: 0018:ffff81074a48be20 EFLAGS: 00010246
[75243.175092] RAX: 0000000000000000 RBX: ffff81072051c000 RCX: 00007fff48fe3ca0
[75243.175094] RDX: 0000000000100100 RSI: 0000000000000005 RDI: ffff81072051eaf0
[75243.175097] RBP: ffff81074a48be88 R08: 0000000000000000 R09: 0000000000100100
[75243.175099] R10: 0000000000000000 R11: 0000000000000001 R12: 0000000000000000
[75243.175101] R13: ffff81072051c020 R14: 000000004010ae42 R15: 0000000000000000
[75243.175104] FS: 00007f5240fd26e0(0000) GS:ffff81081e0d7300(0000) knlGS:0000000000000000
[75243.175107] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[75243.175109] CR2: 0000000000100100 CR3: 000000076b640000 CR4: 00000000000026e0
[75243.175112] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[75243.175115] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[75243.175118] Process kvm (pid: 7220, threadinfo ffff81074a48a000, task ffff81081c2087f0)
[75243.175119] Stack: ffffffff882c80f2 0000000000000000 ffff8107fa6177c8 000000011c2087f0
[75243.175126] ffff81072051c000 ffff81074a48be88 000000004010ae42 0000000000000008
[75243.175131] ffffffff882c5771 ffff8106eb3c6168 0000000000000000 0000000000000292
[75243.175135] Call Trace:
[75243.175148] [<ffffffff882c80f2>] :kvm:kvm_vm_ioctl_get_dirty_log+0x82/0xc0
[75243.175174] [<ffffffff882c5771>] :kvm:kvm_vm_ioctl+0xd1/0x200
[75243.175192] [<ffffffff80240920>] do_wait+0x4e0/0xcf0
[75243.175224] [<ffffffff802c0c0f>] do_ioctl+0x2f/0xa0
[75243.175235] [<ffffffff802c0ea0>] vfs_ioctl+0x220/0x2c0
[75243.175254] [<ffffffff802c0fd1>] sys_ioctl+0x91/0xb0
[75243.175274] [<ffffffff8020c39e>] system_call+0x7e/0x83
[75243.175307]
[75243.175308]
[75243.175309] Code: 49 8b 11 49 39 f9 0f 18 0a 75 b9 f3 c3 66 66 66 66 66 2e 0f
[75243.175322] RIP [<ffffffff882cc545>] :kvm:kvm_mmu_slot_remove_write_access+0x55/0x70
[75243.175334] RSP <ffff81074a48be20>
[75243.175336] CR2: 0000000000100100
[75243.175343] ---[ end trace 01e4e553c58023ce ]---
The guest freeze and host oops above are reproducible, with essentially the same trace in dmesg. Attached are two dmesg outputs from two different runs of the load test on the same machine.
SRU justification:
Impact: The function kvm_mmu_slot_remove_write_access() runs under slots_lock protection, but the list it walks can be modified by other code paths that hold only mmu_lock. This causes the host to oops and the guests to hang.
Fix: Patch backported from upstream to add mmu_lock protection around the list walk.
Testcase: Run the validation suite for an extended period of time (24 hours).
When running a specific software validation suite in a KVM guest (both guest and host running hardy) for over 24 hours, the guest will eventually freeze and the host will have the following oops in dmesg:
[75243.174934] Unable to handle kernel paging request at 0000000000100100 RIP:
[75243.174947] [<ffffffff882cc545>] :kvm:kvm_mmu_slot_remove_write_access+0x55/0x70
[75243.174992] PGD 75072d067 PUD 76b738067 PMD 0
[75243.174997] Oops: 0000 [2] SMP
[75243.175001] CPU 4
[75243.175003] Modules linked in: tun bridge af_packet kqemu radeon drm rfcomm l2cap bluetooth kvm_intel kvm ppdev cpufreq_ondemand cpufreq_powersave cpufreq_conservative cpufreq_userspace cpufreq_stats freq_table sbs sbshc container video output dock battery iptable_filter ip_tables x_tables ac parport_pc lp parport ipv6 joydev serio_raw evdev psmouse pcspkr i2c_piix4 i2c_core button ext3 jbd mbcache sg sr_mod sd_mod cdrom ata_generic pata_acpi usbhid hid qla2xxx pata_serverworks scsi_transport_fc aacraid ehci_hcd libata scsi_tgt ohci_hcd tg3 scsi_mod usbcore thermal processor fan fbcon tileblit font bitblit softcursor fuse
[75243.175073] Pid: 7220, comm: kvm Tainted: G D 2.6.24-23-generic #1
[75243.175076] RIP: 0010:[<ffffffff882cc545>] [<ffffffff882cc545>] :kvm:kvm_mmu_slot_remove_write_access+0x55/0x70
[75243.175090] RSP: 0018:ffff81074a48be20 EFLAGS: 00010246
[75243.175092] RAX: 0000000000000000 RBX: ffff81072051c000 RCX: 00007fff48fe3ca0
[75243.175094] RDX: 0000000000100100 RSI: 0000000000000005 RDI: ffff81072051eaf0
[75243.175097] RBP: ffff81074a48be88 R08: 0000000000000000 R09: 0000000000100100
[75243.175099] R10: 0000000000000000 R11: 0000000000000001 R12: 0000000000000000
[75243.175101] R13: ffff81072051c020 R14: 000000004010ae42 R15: 0000000000000000
[75243.175104] FS: 00007f5240fd26e0(0000) GS:ffff81081e0d7300(0000) knlGS:0000000000000000
[75243.175107] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[75243.175109] CR2: 0000000000100100 CR3: 000000076b640000 CR4: 00000000000026e0
[75243.175112] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[75243.175115] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[75243.175118] Process kvm (pid: 7220, threadinfo ffff81074a48a000, task ffff81081c2087f0)
[75243.175119] Stack: ffffffff882c80f2 0000000000000000 ffff8107fa6177c8 000000011c2087f0
[75243.175126] ffff81072051c000 ffff81074a48be88 000000004010ae42 0000000000000008
[75243.175131] ffffffff882c5771 ffff8106eb3c6168 0000000000000000 0000000000000292
[75243.175135] Call Trace:
[75243.175148] [<ffffffff882c80f2>] :kvm:kvm_vm_ioctl_get_dirty_log+0x82/0xc0
[75243.175174] [<ffffffff882c5771>] :kvm:kvm_vm_ioctl+0xd1/0x200
[75243.175192] [<ffffffff80240920>] do_wait+0x4e0/0xcf0
[75243.175224] [<ffffffff802c0c0f>] do_ioctl+0x2f/0xa0
[75243.175235] [<ffffffff802c0ea0>] vfs_ioctl+0x220/0x2c0
[75243.175254] [<ffffffff802c0fd1>] sys_ioctl+0x91/0xb0
[75243.175274] [<ffffffff8020c39e>] system_call+0x7e/0x83
[75243.175307]
[75243.175308]
[75243.175309] Code: 49 8b 11 49 39 f9 0f 18 0a 75 b9 f3 c3 66 66 66 66 66 2e 0f
[75243.175322] RIP [<ffffffff882cc545>] :kvm:kvm_mmu_slot_remove_write_access+0x55/0x70
[75243.175334] RSP <ffff81074a48be20>
[75243.175336] CR2: 0000000000100100
[75243.175343] ---[ end trace 01e4e553c58023ce ]---
The guest freeze and host oops above are reproducible, with essentially the same trace in dmesg. Attached are two dmesg outputs from two different runs of the load test on the same machine.
2009-02-27 08:48:16 Stefan Bader linux: status New Fix Committed
2009-02-27 08:48:16 Stefan Bader linux: assignee stefan-bader-canonical
2009-02-27 08:48:16 Stefan Bader linux: importance Undecided High
2009-02-27 08:48:16 Stefan Bader linux: statusexplanation Committed to Hardy as http://kernel.ubuntu.com/git?p=ubuntu/ubuntu-hardy.git;a=commitdiff;h=4caefc5a810ac9653222bdfe2e4b807505f4ea32
2009-02-27 11:01:24 Stefan Bader linux: status New Fix Committed
2009-02-27 11:01:24 Stefan Bader linux: assignee stefan-bader-canonical
2009-02-27 11:01:24 Stefan Bader linux: importance Undecided High
2009-02-27 11:01:24 Stefan Bader linux: statusexplanation Committed to Intrepid as http://kernel.ubuntu.com/git?p=ubuntu/ubuntu-intrepid.git;a=commitdiff;h=fec83329de1bf700ee3cb3b7b49ace1d98a951ed
2009-02-27 11:01:48 Stefan Bader linux: status In Progress Invalid
2009-02-27 11:01:48 Stefan Bader linux: statusexplanation Jaunty not affected.
2009-02-27 11:02:07 Stefan Bader bug added subscriber Ubuntu Stable Release Updates Team
2009-05-04 09:35:17 Launchpad Janitor linux (Ubuntu Hardy): status Fix Committed Fix Released
2009-05-04 09:35:17 Launchpad Janitor cve linked 2008-4307
2009-05-04 09:35:17 Launchpad Janitor cve linked 2008-6107
2009-05-04 09:35:17 Launchpad Janitor cve linked 2009-0028
2009-05-04 09:35:17 Launchpad Janitor cve linked 2009-0031
2009-05-04 09:35:17 Launchpad Janitor cve linked 2009-0065
2009-05-04 09:35:17 Launchpad Janitor cve linked 2009-0269
2009-05-04 09:35:17 Launchpad Janitor cve linked 2009-0322
2009-05-04 09:35:17 Launchpad Janitor cve linked 2009-0675
2009-05-04 09:35:17 Launchpad Janitor cve linked 2009-0676
2009-05-04 09:35:17 Launchpad Janitor cve linked 2009-0745
2009-05-04 09:35:17 Launchpad Janitor cve linked 2009-0746
2009-05-04 09:35:17 Launchpad Janitor cve linked 2009-0834
2009-05-04 09:35:17 Launchpad Janitor cve linked 2009-0835
2009-05-04 09:35:17 Launchpad Janitor cve linked 2009-0859
2009-05-04 09:35:17 Launchpad Janitor cve linked 2009-1046
2009-05-12 16:09:24 Launchpad Janitor linux (Ubuntu Intrepid): status Fix Committed Fix Released
2009-05-12 16:09:24 Launchpad Janitor cve linked 2009-0029
2009-05-12 16:09:24 Launchpad Janitor cve linked 2009-0605
2009-05-12 16:09:24 Launchpad Janitor cve linked 2009-0747
2009-05-12 16:09:24 Launchpad Janitor cve linked 2009-0748
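
Note on the SRU justification recorded in the description change of 2009-02-26 20:11:53: it attributes the oops to a list walk that held slots_lock while other code paths modified the same list under mmu_lock, and describes the fix as taking mmu_lock around the walk. The faulting address 0000000000100100 in the trace matches the value the kernel writes into the next pointer of an unlinked list entry (LIST_POISON1), which is consistent with that explanation. The sketch below is only a toy model of the locking rule, not the backported patch: the class and method names are hypothetical, and Python's threading.Lock stands in for the kernel spinlock.

```python
import threading


class ShadowPage:
    """Hypothetical stand-in for one entry on the list being walked."""

    def __init__(self, slot):
        self.slot = slot
        self.writable = True


class Mmu:
    """Toy model: a single list that walkers and modifiers both guard with mmu_lock."""

    def __init__(self):
        # Models the lock that the fix takes around the list walk.
        self.mmu_lock = threading.Lock()
        self.active_pages = []

    def add_page(self, page):
        with self.mmu_lock:
            self.active_pages.append(page)

    def zap_page(self, page):
        # A modifier path: unlinks an entry while holding mmu_lock.
        with self.mmu_lock:
            self.active_pages.remove(page)

    def slot_remove_write_access(self, slot):
        # The walker: because it holds mmu_lock for the whole traversal,
        # no entry can be unlinked underneath it by zap_page().
        with self.mmu_lock:
            for page in self.active_pages:
                if page.slot == slot:
                    page.writable = False
```

The point of the sketch is that the walker and every modifier agree on one lock for the list. In the situation the justification describes, the walker held a different lock than the modifiers, so nothing prevented an entry from being unlinked in the middle of the traversal.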