GPF in ksmd cmp_and_merge_page flow

Bug #906159 reported by Shyam
This bug affects 3 people
Affects: linux (Ubuntu)
Status: Incomplete
Importance: Medium
Assigned to: Unassigned
Milestone: none

Bug Description

We hit this GPF on the oneiric 3.0.0-14-server kernel. Before upgrading to oneiric we were running natty (2.6.38-8-server) and never hit it.

Dec 16 21:23:18 ccslave kernel: [178781.743302] general protection fault: 0000 [#1] SMP
Dec 16 21:23:18 ccslave kernel: [178781.744129] CPU 5
Dec 16 21:23:18 ccslave kernel: [178781.744442] Modules linked in: pci_stub ip6table_filter ip6_tables ebtable_nat ebtables ipt_MASQUERADE iptable_nat nf_nat nf_conntrack_ipv4 nf_defrag_ipv4 xt_state nf_conntrack ipt_REJECT xt_CHECKSUM iptable_mangle xt_tcpudp iptable_filter ip_tables x_tables kvm_intel kvm drbd lru_cache vesafb ib_iser rdma_cm ib_cm iw_cm ib_sa ib_mad ib_core nfsd ib_addr nfs scst_vdisk iscsi_tcp libiscsi_tcp iscsi_scst libiscsi scsi_transport_iscsi lockd scst fscache auth_rpcgss nfs_acl sunrpc libcrc32c dm_iostat bridge stp joydev usbhid psmouse hid serio_raw i7core_edac edac_core ghes dm_multipath acpi_power_meter hed dcdbas lp parport ses enclosure ixgbevf ixgbe dca megaraid_sas bnx2 mdio
Dec 16 21:23:18 ccslave kernel: [178781.754220]
Dec 16 21:23:18 ccslave kernel: [178781.754445] Pid: 93, comm: ksmd Not tainted 3.0.0-14-server #23-Ubuntu Dell Inc. PowerEdge R510/0DPRKF
Dec 16 21:23:18 ccslave kernel: [178781.755968] RIP: 0010:[<ffffffff8114e7a7>] [<ffffffff8114e7a7>] unstable_tree_search_insert+0x37/0x150
Dec 16 21:23:18 ccslave kernel: [178781.757386] RSP: 0018:ffff88061a857df0 EFLAGS: 00010246
Dec 16 21:23:18 ccslave kernel: [178781.758201] RAX: 000088057dfc50a8 RBX: ffffea0004205fe0 RCX: 0000000000000009
Dec 16 21:23:18 ccslave kernel: [178781.759344] RDX: ffff88061a857fd8 RSI: ffff8802ffac6000 RDI: ffffea000a7edb50
Dec 16 21:23:18 ccslave kernel: [178781.760432] RBP: ffff88061a857e30 R08: 00000000000000c2 R09: 00000000fffffffe
Dec 16 21:23:18 ccslave kernel: [178781.761323] R10: ffff88012dd24000 R11: 0000000000000001 R12: ffff880577acd080
Dec 16 21:23:18 ccslave kernel: [178781.762223] R13: ffffea000a7edb50 R14: ffff880315043ff8 R15: ffff880315043fc0
Dec 16 21:23:18 ccslave kernel: [178781.763228] FS: 0000000000000000(0000) GS:ffff88032fc40000(0000) knlGS:0000000000000000
Dec 16 21:23:18 ccslave kernel: [178781.764458] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
Dec 16 21:23:18 ccslave kernel: [178781.765305] CR2: 00007f1acad28000 CR3: 0000000001c03000 CR4: 00000000000026e0
Dec 16 21:23:18 ccslave kernel: [178781.766427] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
Dec 16 21:23:18 ccslave kernel: [178781.767534] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Dec 16 21:23:18 ccslave kernel: [178781.768659] Process ksmd (pid: 93, threadinfo ffff88061a856000, task ffff88061c729720)
Dec 16 21:23:18 ccslave kernel: [178781.769937] Stack:
Dec 16 21:23:18 ccslave kernel: [178781.770108] ffff880315043fe8 ffff88061a857e48 ffff88061a857e30 ffffea001359fb18
Dec 16 21:23:18 ccslave kernel: [178781.770545] ffffea0004205fe0 ffff880577acd080 0000000000000000 ffff88061c729720
Dec 16 21:23:18 ccslave kernel: [178781.771483] ffff88061a857e70 ffffffff8114f690 0000000000000001 0000000000000000
Dec 16 21:23:18 ccslave kernel: [178781.772667] Call Trace:
Dec 16 21:23:18 ccslave kernel: [178781.773046] [<ffffffff8114f690>] cmp_and_merge_page+0x160/0x260
Dec 16 21:23:18 ccslave kernel: [178781.773964] [<ffffffff8114f83f>] ksm_scan_thread+0xaf/0x2a0
Dec 16 21:23:18 ccslave kernel: [178781.774839] [<ffffffff81081660>] ? add_wait_queue+0x60/0x60
Dec 16 21:23:18 ccslave kernel: [178781.775698] [<ffffffff8114f790>] ? cmp_and_merge_page+0x260/0x260
Dec 16 21:23:18 ccslave kernel: [178781.776651] [<ffffffff81080bbc>] kthread+0x8c/0xa0
Dec 16 21:23:18 ccslave kernel: [178781.817247] [<ffffffff81609164>] kernel_thread_helper+0x4/0x10
Dec 16 21:23:18 ccslave kernel: [178781.817252] [<ffffffff81080b30>] ? flush_kthread_worker+0xa0/0xa0
Dec 16 21:23:18 ccslave kernel: [178781.817257] [<ffffffff81609160>] ? gs_change+0x13/0x13
Dec 16 21:23:18 ccslave kernel: [178781.817260] Code: 53 48 83 ec 18 66 66 66 66 90 49 c7 c6 40 98 ee 81 48 89 55 c8 49 89 fc 48 89 f3 31 d2 49 83 3e 00 74 71 e8 fc f5 4a 00 49 8b 06
Dec 16 21:23:18 ccslave kernel: [178781.817282] RIP [<ffffffff8114e7a7>] unstable_tree_search_insert+0x37/0x150
Dec 16 21:23:18 ccslave kernel: [178781.817287] RSP <ffff88061a857df0>
Dec 16 21:23:18 ccslave kernel: [178782.070994] ---[ end trace ce317bb6c729a57b ]---

ProblemType: Bug
DistroRelease: Ubuntu 11.10
Package: linux-image-3.0.0-14-server 3.0.0-14.23
ProcVersionSignature: Ubuntu 3.0.0-14.23-server 3.0.9
Uname: Linux 3.0.0-14-server x86_64
AlsaDevices:
 total 0
 crw-rw---- 1 root audio 116, 1 2011-12-19 06:30 seq
 crw-rw---- 1 root audio 116, 33 2011-12-19 06:30 timer
AplayDevices: Error: [Errno 2] No such file or directory
ApportVersion: 1.23-0ubuntu4
Architecture: amd64
ArecordDevices: Error: [Errno 2] No such file or directory
AudioDevicesInUse: Error: command ['fuser', '-v', '/dev/snd/seq', '/dev/snd/timer'] failed with exit code 1:
CRDA: Error: [Errno 2] No such file or directory
Date: Mon Dec 19 06:49:20 2011
HibernationDevice: RESUME=UUID=a2f4cf63-69ef-4c37-8fdf-6d6919274aae
InstallationMedia: Ubuntu-Server 11.04 "Natty Narwhal" - Release amd64 (20110426)
MachineType: Dell Inc. PowerEdge R510
PciMultimedia:

ProcEnviron:
 LANGUAGE=en_US:en
 PATH=(custom, no user)
 LANG=en_US.UTF-8
 SHELL=/bin/bash
ProcKernelCmdLine: BOOT_IMAGE=/boot/vmlinuz-3.0.0-14-server root=UUID=a343c1c4-8b09-4f66-9ba4-838f6b60e7e7 ro quiet intel_iommu=on ixgbe.max_vfs=10
RelatedPackageVersions:
 linux-restricted-modules-3.0.0-14-server N/A
 linux-backports-modules-3.0.0-14-server N/A
 linux-firmware 1.60
RfKill: Error: [Errno 2] No such file or directory
SourcePackage: linux
UpgradeStatus: Upgraded to oneiric on 2011-12-12 (6 days ago)
dmi.bios.date: 10/25/2010
dmi.bios.vendor: Dell Inc.
dmi.bios.version: 1.5.3
dmi.board.name: 0DPRKF
dmi.board.vendor: Dell Inc.
dmi.board.version: A03
dmi.chassis.type: 23
dmi.chassis.vendor: Dell Inc.
dmi.modalias: dmi:bvnDellInc.:bvr1.5.3:bd10/25/2010:svnDellInc.:pnPowerEdgeR510:pvr:rvnDellInc.:rn0DPRKF:rvrA03:cvnDellInc.:ct23:cvr:
dmi.product.name: PowerEdge R510
dmi.sys.vendor: Dell Inc.

Shyam (shyam-zadarastorage) wrote :
Brad Figg (brad-figg)
Changed in linux (Ubuntu):
status: New → Confirmed
Joseph Salisbury (jsalisbury) wrote :

As was requested in the other bugs you reported, would it be possible to test the mainline kernel for each bug and then post the test results?

Changed in linux (Ubuntu):
importance: Undecided → Medium
tags: added: needs-upstream-testing
Changed in linux (Ubuntu):
status: Confirmed → Incomplete
Shyam (shyam-zadarastorage) wrote :

Due to instability with oneiric we moved back to Natty. Marking this bug 'kernel-unable-to-test-upstream'.

Launchpad Janitor (janitor) wrote :

[Expired for linux (Ubuntu) because there has been no activity for 60 days.]

Changed in linux (Ubuntu):
status: Incomplete → Expired
Chris J Arges (arges)
Changed in linux (Ubuntu):
status: Expired → In Progress
assignee: nobody → Chris J Arges (christopherarges)
Chris J Arges (arges) wrote :

This seems to be related to the following upstream bugs:
https://bugzilla.kernel.org/show_bug.cgi?id=42703
https://bugzilla.kernel.org/show_bug.cgi?id=37732

The general trend seems to be the following:
- KVM is used
- ksmd is the current comm
- a GPF is triggered because the upper 16 bits of a register used as an address have been cleared
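
To illustrate the last point: on x86-64, dereferencing a non-canonical virtual address raises a general protection fault rather than a page fault. The sketch below checks whether a register value from an oops is canonical. It assumes 48-bit virtual addressing (4-level paging), and `is_canonical` is a hypothetical helper name, not a kernel or library function. RAX in the trace above (000088057dfc50a8) looks like a kernel pointer of the ffff88... form with its top 16 bits zeroed.

```python
def is_canonical(addr: int, va_bits: int = 48) -> bool:
    """Return True if addr is a canonical x86-64 virtual address.

    Assumption: va_bits-wide virtual addresses (48 for 4-level paging).
    An address is canonical when bits 63..(va_bits - 1) are all equal.
    """
    top = addr >> (va_bits - 1)                # bits 63..47 when va_bits is 48
    all_ones = (1 << (64 - va_bits + 1)) - 1   # 17 one-bits when va_bits is 48
    return top == 0 or top == all_ones


if __name__ == "__main__":
    rax = 0x000088057DFC50A8                   # RAX from the oops above
    restored = rax | (0xFFFF << 48)            # same value with the top 16 bits set
    for name, value in (("RAX", rax), ("RAX with top bits set", restored)):
        print(f"{name}: {value:#018x} canonical={is_canonical(value)}")
```

Running the same check over the other registers in a dump is a quick way to spot which one carries the damaged pointer.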

It would be extremely helpful to test with KSM disabled, to see if this is the root cause.
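
For anyone who wants to run that test, a minimal sketch follows. It assumes the standard KSM sysfs control file `/sys/kernel/mm/ksm/run` described in the kernel's KSM documentation (0 stops ksmd, 1 runs it, 2 stops it and unmerges all currently merged pages) and must be run as root. The helper names `set_ksm_run` and `get_ksm_run` are hypothetical. Note that other services on the host may turn KSM back on, so the value is worth re-checking after a reboot.

```python
from pathlib import Path

# Assumption: the standard KSM control file exposed by kernels built with CONFIG_KSM.
KSM_RUN = Path("/sys/kernel/mm/ksm/run")


def get_ksm_run(path: Path = KSM_RUN) -> int:
    """Read the current KSM run state (0, 1 or 2)."""
    return int(path.read_text().strip())


def set_ksm_run(value: int, path: Path = KSM_RUN) -> None:
    """Write a new KSM run state.

    0 = stop ksmd, 1 = run ksmd, 2 = stop ksmd and unmerge all merged pages.
    """
    if value not in (0, 1, 2):
        raise ValueError(f"invalid KSM run value: {value}")
    path.write_text(f"{value}\n")


if __name__ == "__main__":
    print("KSM run state before:", get_ksm_run())
    set_ksm_run(2)  # stop ksmd and unmerge, so no merged pages remain
    print("KSM run state after:", get_ksm_run())
```

Writing 2 rather than 0 is a deliberate choice here: it also breaks existing merges, so the stable and unstable trees are no longer in play while testing.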

Chris J Arges (arges) wrote :
penalvch (penalvch) wrote :

Shyam, this bug was reported a while ago and there hasn't been any activity in it recently. We were wondering if this is still an issue in Precise+? If so, could you please run the following command from a Terminal (Applications->Accessories->Terminal), as it will automatically gather and attach updated debug information to this report:

apport-collect -p linux <replace-with-bug-number>

tags: added: bios-outdated-1.12.0
Changed in linux (Ubuntu):
assignee: Chris J Arges (arges) → nobody
status: In Progress → Incomplete