Lots of bad page table traces when running checkbox memory test on Lucid VM on XenServer

Bug #837573 reported by Jeff Lane 
This bug affects 1 person
Affects         Status     Importance  Assigned to  Milestone
linux (Ubuntu)  Confirmed  Undecided   Unassigned

Bug Description

Installed Ubuntu 10.04.3 LTS 32-bit on a VM running on Citrix XenServer 6.0 RC1. When checkbox-certification runs the memory test, the kernel dumps several traces to syslog. These usually contain messages like "bad page table".

I get this every time I perform the following steps:

1. Install 32-bit 10.04.3 Server on a VM on XenServer.
2. Install checkbox, checkbox-compatibility and checkbox-certification.
3. Run checkbox-certification.
4. See traces in syslog while threaded_memtest is running.
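
The syslog check in the last step can be scripted. The following is only a minimal sketch, not part of checkbox: it assumes Python 3, takes the path to a syslog file (for example a copy of /var/log/syslog pulled from the DomU) as its argument, and uses the pattern "bad page" as a guess based on the wording above. The function name find_traces is hypothetical, and the exact kernel message text may differ.

```python
import re
import sys


def find_traces(lines, pattern=r"bad page"):
    """Return (line_number, text) pairs for lines matching pattern, ignoring case.

    The default pattern is an assumption taken from the bug description,
    not a verified kernel message string.
    """
    regex = re.compile(pattern, re.IGNORECASE)
    return [
        (number, line.rstrip("\n"))
        for number, line in enumerate(lines, start=1)
        if regex.search(line)
    ]


# Only scan a file when a path is given on the command line.
if __name__ == "__main__" and len(sys.argv) > 1:
    path = sys.argv[1]
    with open(path, errors="replace") as handle:
        hits = find_traces(handle)
    for number, text in hits:
        print(f"{number}: {text}")
    print(f"{len(hits)} matching line(s) in {path}")
```

Saved under a name of your choice, it would be run as `python3 <script> /var/log/syslog`. A count of zero on the 64-bit guest and a non-zero count on the 32-bit guest would match the behaviour described in this report.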

This does NOT occur when running 64-bit 10.04.3 Server on a VM on the same XenServer host. AFAICT, this only affects a 32-bit DomU on XenServer 6.0.

ubuntu@ubuntu:/usr/share/checkbox-certification/scripts$ file threaded_memtest
threaded_memtest: ELF 32-bit LSB executable, Intel 80386, version 1 (SYSV), dynamically linked (uses shared libs), for GNU/Linux 2.6.15, stripped

ProblemType: Bug
DistroRelease: Ubuntu 10.04
Package: linux-image-2.6.32-33-generic-pae 2.6.32-33.70
Regression: No
Reproducible: Yes
ProcVersionSignature: Ubuntu 2.6.32-33.70-generic-pae 2.6.32.41+drm33.18
Uname: Linux 2.6.32-33-generic-pae i686
AlsaDevices: Error: command ['ls', '-l', '/dev/snd/'] failed with exit code 2: ls: cannot access /dev/snd/: No such file or directory
AplayDevices: Error: [Errno 2] No such file or directory
Architecture: i386
ArecordDevices: Error: [Errno 2] No such file or directory
CurrentDmesg: [ 31.492109] eth0: no IPv6 routers present
Date: Tue Aug 30 13:04:51 2011
InstallationMedia: Ubuntu-Server 10.04.3 LTS "Lucid Lynx" - Release i386 (20110719.2)
Lsusb:
 Bus 001 Device 002: ID 0627:0001 Adomax Technology Co., Ltd
 Bus 001 Device 001: ID 1d6b:0001 Linux Foundation 1.1 root hub
MachineType: Xen HVM domU
PciMultimedia:

ProcCmdLine: BOOT_IMAGE=/boot/vmlinuz-2.6.32-33-generic-pae root=UUID=c93e4a06-8446-4926-8ddb-dc4056bb789b ro quiet
ProcEnviron:
 LANG=en_US.UTF-8
 SHELL=/bin/bash
SourcePackage: linux
dmi.bios.date: 08/11/2011
dmi.bios.vendor: Xen
dmi.bios.version: 4.1.1
dmi.chassis.type: 1
dmi.chassis.vendor: Xen
dmi.modalias: dmi:bvnXen:bvr4.1.1:bd08/11/2011:svnXen:pnHVMdomU:pvr4.1.1:cvnXen:ct1:cvr:
dmi.product.name: HVM domU
dmi.product.version: 4.1.1
dmi.sys.vendor: Xen

Jeff Lane  (bladernr) wrote :
Brad Figg (brad-figg)
Changed in linux (Ubuntu):
status: New → Confirmed
Jeff Lane  (bladernr) wrote :

Here's the syslog pulled from that DomU. I didn't see it attached to the original report, so I'm adding it now; the traces are in there.

Stefan Bader (smb) wrote :

Would you have access to the dom0 of the XenServer? If yes, it would help to have the output of "xm log" or "xm dmesg" (or "xl dmesg", if it's using that stack).
Also it might be worth trying to compare with an older xen hypervisor than 4.1.1. I am a bit busy right now, but I could try with 3.4.x.
Has that worked before (previous versions of the Lucid kernel and/or other XenServer versions)?
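
For anyone collecting the hypervisor log requested above, here is a rough sketch of picking whichever of the two toolstack commands is installed and saving its output. It assumes Python 3.7 or later is available where it runs, which may well not be the case on a XenServer dom0. The helper name pick_command and the order of preference (xl before xm) are assumptions for illustration only.

```python
import shutil
import subprocess
import sys

# Candidate commands named in the comment above; the order is an assumption.
CANDIDATES = (["xl", "dmesg"], ["xm", "dmesg"])


def pick_command(candidates=CANDIDATES, which=shutil.which):
    """Return the first candidate whose executable is found on PATH, or None."""
    for command in candidates:
        if which(command[0]) is not None:
            return command
    return None


# Only run when an output file name is given on the command line.
if __name__ == "__main__" and len(sys.argv) > 1:
    command = pick_command()
    if command is None:
        sys.exit("neither xl nor xm found on PATH")
    result = subprocess.run(command, capture_output=True, text=True)
    with open(sys.argv[1], "w") as out:
        out.write(result.stdout)
    print(f"wrote output of {' '.join(command)} to {sys.argv[1]}")
```

The resulting file can then be attached to the bug. Running the command by hand and redirecting its output works just as well.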

Jeff Lane  (bladernr) wrote :

Marking this private, as it should have been initially since this is filed against a pre-release version of XenServer. No harm as Citrix wants people certifying against this now, but just to err on the side of caution, I'm hiding it.

Stefan, I do have access to Dom0. It's actually sitting on a 1U server next to my left leg at the moment :-) I do NOT have access to any older hypervisors though. This is using XenServer, Citrix's stand-alone Xen stack, not the free Xen stack that sits on top of $LINUX_OS.

Interestingly, I tried recreating this manually by running the threaded_memtest binary from checkbox-certification 0.10 alone. I could not recreate it this way. The only way I could recreate it was to run checkbox-certification normally, as a startup service, which is how we run it on servers for certification purposes.

That being said, here's the output of xl dmesg.

Stefan Bader (smb) wrote :

Jeff, first, sorry for not getting back sooner. This was completely pushed aside by other things, so I have not spent any more time on it since then. I have put it back on my to-do list and hopefully I will get to it. Meanwhile, I cannot say whether this is really related, but it is another one of those strange issues: in bug #854050, upstream found an issue going back to 2.6.31 where heavy use and removed MMU flushes can lead to a crash. That is not the symptom here, but I wonder whether there may be multiple manifestations.

Stefan Bader (smb) wrote :

After a quick look at the dom0 log, I do not see anything that could be related to the domU problems.

Stefan Bader (smb) wrote :

Could someone post a pointer on how to actually get checkbox-certification (or send me a PM)? Thanks.

This report contains Public information  
Everyone can see this information.
