memtest86+ test #7 false positives (random number sequence error)

Reported by Nizamov Shawkat on 2012-10-25
This bug affects 69 people
Affects                    Status         Importance   Assigned to     Milestone
Memtest86+                 Unknown        Unknown
Release Notes for Ubuntu                  Undecided    Unassigned
memtest86+ (Ubuntu)                       Medium       Chris J Arges
  Quantal                                 Medium       Chris J Arges
  Raring                                  Medium       Chris J Arges
memtest86+ (openSUSE)      Fix Released   Medium

Bug Description

While checking the memory on a newly bought notebook, I found a bug in memtest86+ version 4.20 in Ubuntu 12.10; in Ubuntu 12.04 it is fine. The bug has also been reported at least in Fedora and openSUSE. It is presumed to be caused by gcc 4.7.

It is easily reproducible: select test #7 in memtest, or just wait until it is reached. Starting from 129 MB it will report a lot of errors. I have checked it on three different systems; two of them I use on a daily basis and would have noticed if the RAM were really bad.

https://bugzilla.redhat.com/show_bug.cgi?id=805813

http://lists.opensuse.org/opensuse-bugs/2012-09/msg04386.html

--

SRU Justification

[Impact]
Users of memtest86+ will get false positives of memory failures. This will cause users to suspect hardware and require unnecessary testing/headaches.

[Test Case]
Boot Ubuntu Quantal or Raring. At the GRUB menu, select memtest86+. Wait until the 7th test. It will fail at the 7th test.

[Regression Potential]
The fix is just adding another register to clobber in the inline assembly routine. Because this affects newer GCC versions, older releases aren't affected. Thus, if there are compiler changes we should re-test.

I was building a new test system described in my other bug over at
https://bugzilla.novell.com/show_bug.cgi?id=773565

smolts is:
http://www.smolts.org/client/show/pub_73920daf-1fba-42ec-be09-adb113ce053d

and I first thought I had serious hardware trouble as the memtest test number 7 resulted in endless errors. Changing ram modules, using only single modules or completely different ones for example different in speed and latencies always resulted in the same test number 7 failing seriously

I was using the memtest+ 4.20 from the 12.2/rc1/x86 image written to a usb key / usb stick and booted from that stick on that uefi system from above.

Eventually I used my older 12.1/x86 dvd iso booting via dvd rom drive and dvd rom, also having memtest+ version 4.20 and this now runs flawlessly on this same uefi machine, same configuration, only booting the memtest entry from the initial boot menu at pre-installation.

This machine has uefi, but maybe there is a problem with booting the later generation iso image from usb keys / usb sticks in general at suse or maybe this is slightly a different memtest+ 4.20 and not really identical on 12.1 and 12.2 iso images?

I am not really sure what to make of this behavior and where to pinpoint the source for this bug.

maybe related bugs:
http://bugzilla.novell.com/show_bug.cgi?id=773565
http://bugzilla.novell.com/show_bug.cgi?id=753574

maybe there is even some relation to:
http://bugzilla.novell.com/show_bug.cgi?id=771552

Thanks and regards.

Reproducible: Always

I ran into this yesterday after experiencing some system errors with no explanation. This system is an Intel motherboard with 8GB RAM (DDR3, 4 x 2GB modules).

I rotated the modules, removed 4 GB, changed slots for the pairs, etc. It always reports an error at 0x08100000, even with a single 2GB module installed. This always occurs at the start of test 7.
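For anyone comparing notes: the failing address above and the "129 MB" mentioned elsewhere in this report are the same location. A quick check in Python (plain arithmetic, nothing memtest-specific; the variable names are just for illustration):

```python
# Convert the reported failing address into mebibytes.
MIB = 1024 * 1024          # bytes per MiB
addr = 0x08100000          # address quoted in the comment above

whole_mib, remainder = divmod(addr, MIB)
print(whole_mib, remainder)
```

The address falls exactly on the 129 MiB boundary, which fits every report here seeing the errors start at the same place regardless of which modules are installed.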

Since this memory has worked for four years and never had errors, I think it's a test 7 problem. Also, both 12.2 RC2 and 12.1 boot and work.

openSUSE 12.2(3.4.6-1.1-desktop x86_64)|KDE 4.8.4
"release 2"|Intel core2duo 2.5 GHz,|8GB
DDR3|GeForce 8400GS(NVIDIA-Linux-x86_64-295.71)

Oops, I just noticed that the original report is about the thumb drive version. Mine is doing it from the ISO written to DVD, using the memtest option at boot.

Btw, an update: around the time I originally reported this bug, I found a pre-release version of memtest86+ on the original developers' forum site (there was a beta forum there). I think it was beta 1 of 5.0.0 or so.

I put that onto a usb stick and booted it and that ran flawlessly on this system.

This version 5.0.0 has different tests though, and more extensive hardware support and detection. It didn't report errors on this same hardware in any of its tests.

I was scared that my system was defective so I used that memtest+ 5 to verify this rather new hardware as I had thought it was a compatibility problem with 4.20, although it was odd.

It's still weird that the newer 4.20 from 12.2 reports problems but the older 4.20 from 12.1 doesn't have trouble.

http://forum.memtest.org/
redirects to
http://forum.canardpc.com/forums/73-Memtest86-Official-forum
then there is thread
http://forum.canardpc.com/threads/68001-Memtest86-5.00-Beta-available-%21-Need-betatesters-%21

It has 64-bit support, multi-core CPU tests and so on.
greetings.

Thanks for the update. I will try the 12.1 disc version and also download the newer version. I have a thumb drive I'll try to install it on. I have never created a bootable thumb drive, but will learn.

Thanks.

With 176760 byte 4.20-7.1.2 this also happens on Kairo's Sandy Bridge Intel, and a K10 @2813MHz loaded via Grub Legacy from HD. The Sandy Bridge works fine with the memtest.org 4.20. The K10 works fine on the 180856 byte 2011/05/13 4.20 which I suppose came from 11.4.

I've the same problem with 12.2/x86 final iso image from usb key.

(In reply to comment #6)
> I've the same problem with 12.2/x86 final iso image from usb key.

me too:-(

I just wanted to test a PC with memtest using the official dual-sided 12.2 DVD.
both 32bit and 64bit throw zillions of errors for test #7 (random number sequence) starting at 129MB (PC has 2*2GB DIMMs).

using the memtest on the official 12.1 DVD works fine -- no errors at all, so the hw is ok and the 12.2 media is broken. pity:-((

qemu-kvm -m 512 -kernel memtest-12.2-DVD
nicely shows the problem. Anyone who still sees a dependency on the source medium type (USB vs. DVD) please speak up now or remain silent.

There were _no_ source code changes AFAIK between 12.1 and 12.2 so we're most likely facing a compiler bug here.

Maybe I can be more precise before handing this over to the gcc maintainers.

Created an attachment (id=508246)
source file test.c from memtest86+-4.20

discriminating file.
compile with
gcc -S -Wall -march=i486 -m32 -Os -fomit-frame-pointer -fno-builtin -ffreestanding test.c

Created an attachment (id=508248)
working assembler file

Built with
gcc (SUSE Linux) 4.6.2
from 12.1

Created an attachment (id=508250)
broken assembler file

Built with
gcc (SUSE Linux) 4.7.1 20120723 [gcc-4_7-branch revision 189773]
on Factory/head

Injecting the broken / working compiler output into the other build environment fixes or breaks it, respectively.

<sys/io.h> and <inttypes.h> differ slightly; the test.i files are otherwise
identical.

Created an attachment (id=508256)
reverse patch, from working to broken

This patch, applied reverse, fixes a broken build.
Auto label numbers raised by 9900, to avoid clashes.
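A comparison like the one behind this patch can be reproduced by diffing the two `gcc -S` outputs. Here is a rough sketch in Python using the standard difflib module. The two-line listings are made-up stand-ins for the real attachments: the instruction operands are hypothetical, and only the ebp vs. ecx difference is taken from this report (see the register-allocation note further down).

```python
import difflib

# Hypothetical excerpts; the real attachments are full assembler listings.
working = ["movl 20(%esp), %ebp", "call rand"]   # gcc 4.6 style (assumed)
broken = ["movl 20(%esp), %ecx", "call rand"]    # gcc 4.7 style (assumed)


def asm_diff(old, new):
    """Return a unified diff between two assembler listings."""
    return list(difflib.unified_diff(
        old, new, fromfile="working.s", tofile="broken.s", lineterm=""))


patch = asm_diff(working, broken)
print("\n".join(patch))
```

On real listings, auto-generated label numbers will also differ between builds and need to be normalized first, which is why the attached patch renumbers them.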

The bug is triggered by a different register allocation.
gcc-4.6 uses ebp for the volatile ulong *start (remember, -fomit-frame-pointer), where gcc-4.7 prefers ecx.

ECX is, AFAIK, "caller-save" under the ABI calling convention, and the inline asm calls rand(), which it does not declare. rand() is free to clobber ecx.
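To illustrate the mechanism, here is a toy model in Python. It is not compiler output and not memtest86+ code: registers are dictionary entries, `rand_stub` is a hypothetical stand-in for rand(), and the "allocator" simply takes the first preferred register that the asm block does not list as clobbered. The assumption that eax and edx were already in the clobber list is mine, made only for the sake of the example.

```python
# Toy model only: a "call" overwrites every caller-saved register.
CALLER_SAVED = ("eax", "ecx", "edx")   # i386 System V scratch registers
ALL_REGS = ("eax", "ebx", "ecx", "edx", "esi", "edi", "ebp")
START = 0x08100000                     # value the compiler keeps live across the asm


def rand_stub(regs):
    """Stand-in for rand(): trashes scratch registers, result in eax."""
    for name in CALLER_SAVED:
        regs[name] = 0xDEADBEEF
    regs["eax"] = 0x12345678


def run_asm_block(preference, declared_clobbers):
    """Keep START in the first preferred register not declared clobbered,
    run the asm body (which calls rand), and report whether START survived."""
    reg = next(r for r in preference if r not in declared_clobbers)
    regs = dict.fromkeys(ALL_REGS, 0)
    regs[reg] = START
    rand_stub(regs)
    return reg, regs[reg] == START


# gcc-4.6-like preference: ebp is callee-saved, so the missing clobber is harmless.
old_gcc = run_asm_block(["ebp", "ecx"], {"eax", "edx"})
# gcc-4.7-like preference: ecx is chosen and then overwritten by the call.
new_gcc = run_asm_block(["ecx", "ebp"], {"eax", "edx"})
# Same preference, but with ecx declared as clobbered.
fixed = run_asm_block(["ecx", "ebp"], {"eax", "ecx", "edx"})
print(old_gcc, new_gcc, fixed)
```

In this model the value only survives when it sits in a register the call does not touch, either by luck (the older allocation) or because the clobber list forces the choice.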

Created an attachment (id=508365)
Declare that the asm snippet clobbers ecx.

Suggested fix.

Applied.

This is an autogenerated message for OBS integration:
This bug (773569) was mentioned in
https://build.opensuse.org/request/show/137248 Factory / memtest86+

I hope steps are being made to contribute this fix back to the upstream. ;-)

That said, thanks for figuring this out!

Will there be an update for 12.2? I understand that the DVD won't change, but users also have it installed as a bootloader menu entry.

I really hope you make an update and make a new ISO!
I bought a new RAM and a new board because of this damn bug.

in 12.3/x64/milestone0 (iso image) the memtest+ still immediately fails when directly selecting test #7

has this fix not made it into 12.3/milestone0 yet?
do we need a new bug for 12.3?
thanks.

*** Bug 784206 has been marked as a duplicate of this bug. ***

It's unlikely that fixed ISOs for 12.2 will be published, so the next-best solution is an entry in the release notes. I just requested this - see bug 784757.

This bug is also mentioned on the "most annoying bugs" page in the wiki, but I doubt too many people read it...

Launchpad Janitor (janitor) wrote :

Status changed to 'Confirmed' because the bug affects multiple users.

Changed in memtest86+ (Ubuntu):
status: New → Confirmed

Just a quick note: the Fedora bug report also contains a patch

https://bugzilla.redhat.com/show_bug.cgi?id=805813

ethana2 (ethana2) wrote :

I have been modifying my RAM voltages, multipliers, and timings for the last three hours trying to pass memtest 4.20 on my fresh 64-bit 12.10 install machine. I was about to start tweaking the FSB. *headdesk*

Vincent (vincent-) on 2012-10-31
tags: added: memtest86+
tags: added: number random sequence
removed: memtest86+
tags: added: random-number-sequence
removed: number random sequence
summary: - memtest86 test #7 fails (random pattern error)
+ memtest86 test #7 fails (random number sequence error)

This bug also affects me. Tested on two X1 carbons, i5-3427U w/ 8gb and an i5-3317U w/ 4gb. Both fail at the 42~43% mark with a stream of errors.

Booting from xubuntu 12.10 amd64 usb (which has memtest 4.20) it fails.

I also tried from an xubuntu 12.04 amd64 usb (which also has memtest 4.20) and it passes.

Wolf (w-vollprecht) on 2012-11-04
Changed in memtest86+ (Ubuntu):
status: Confirmed → In Progress
status: In Progress → Confirmed
Dan Cundiff (pmotch) wrote :

Ran into this bug today as well. I was testing some memory on an older PC. Here's my specs FWIW:
* EVGA 132-CK-NF78-A1 LGA 775 NVIDIA nForce 780i SLI ATX Intel Motherboard
* Crucial Ballistix 2GB (2 x 1GB) 240-Pin DDR2 SDRAM DDR2 800 (PC2 6400) Dual Channel Kit Desktop Memory Model BL2KIT12864AA804
* Intel Q6600 processor

I tried many different combinations and encountered the error each time:
* All four sticks
* 2 sticks
* 1 stick
* Different timings and voltages
* Different clocking
* Different SLI profiles

I even went and bought new ram, different brand (CORSAIR XMS2 2GB (2 x 1GB) 240-Pin DDR2 SDRAM DDR2 800 (PC2 6400) Dual Channel Kit Desktop Memory Model TWIN2X2048-6400), that was approved compatible with my motherboard. Same issue. Then I found this bug report.

I tried memtest+ from an older 12.04 DVD I had around and there wasn't an error. Shook my fist in the air very rapidly.

tags: added: memtest86
ofb (cottlestonpie) wrote :

Me too. Xubuntu LiveCD 12.10. Whereas the memtest86 4.20 on Hiren's (DOS) works fine. Got me on Asus boards P4R800-VM, P4S533-X, and P4P800SE.

(Yeah, I was kinda deep before I figured it out. I was running two different test disks at once, so took a while to realize it wasn't the sockets or the RAM I was rotating through.)

This really needs to be on the 12.10 Release Notes by now. We've got a Red Hat report of it as early as March.

So, who updates the Release Notes? Does information get passed up to them from here, or is there a way I should send them the heads-up directly?

Thanks.

Changed in memtest86+ (openSUSE):
importance: Unknown → Medium
status: Unknown → Fix Released
tags: added: quantal
summary: - memtest86 test #7 fails (random number sequence error)
+ memtest86 test #7 false positives (random number sequence error)
Chris J Arges (arges) on 2012-12-04
Changed in memtest86+ (Ubuntu):
assignee: nobody → Chris J Arges (christopherarges)
importance: Undecided → Medium
Chris J Arges (arges) on 2012-12-04
Changed in memtest86+ (Ubuntu Quantal):
assignee: nobody → Chris J Arges (christopherarges)
importance: Undecided → Medium
status: New → In Progress
Changed in memtest86+ (Ubuntu Raring):
status: Confirmed → In Progress

Linked a branch with a fix that works for me in raring.

There seem to be a couple of approaches for fixing this. The SUSE bug has the small change to clobber the additional register, and that seems to work, while the RH bug uses the C code in test.c instead of the optimized assembler to generate the proper binaries. I took the SUSE approach for fixing this bug.

I noticed that there are uncommitted changes in the bzr development branch. So adding quilt didn't work so well.
To build I had to do an additional dpkg-source --commit to add these changes to a patch before building.
What's the proper way to do this so the package is in proper shape for patching?

You can test the raring version here:
http://people.canonical.com/~arges/lp1071209/

Chris J Arges (arges) wrote :

Also, I couldn't find a similar bug in Debian. So this is something that needs to be tested.

Chris J Arges (arges) wrote :

Fixed the bzr branch to just change the source directly.

Marc Deslauriers (mdeslaur) wrote :

Looks good, I've uploaded it to raring. Thanks!

Changed in memtest86+ (Ubuntu Raring):
status: In Progress → Fix Committed
Chris J Arges (arges) wrote :

Branch for quantal linked.

Chris J Arges (arges) on 2012-12-05
description: updated
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package memtest86+ - 4.20-1.1ubuntu3

---------------
memtest86+ (4.20-1.1ubuntu3) raring; urgency=low

  * Fix test#7 false positives due to compiler issues. (LP: #1071209)
    Original patch author Torsten Duwe <email address hidden>.
 -- Chris J Arges <email address hidden> Tue, 04 Dec 2012 16:23:35 -0600

Changed in memtest86+ (Ubuntu Raring):
status: Fix Committed → Fix Released
Sebastien Bacher (seb128) wrote :

sponsored the quantal version, thanks!

Lynara Le'dominae (lynarasys) wrote :

This bug also affects me. Bought two new sticks of RAM, and was suggested I test the processor to see if it was the issue... Glad I found this out before I went any further.

Gateway ID49CU
Ubuntu 12.10

summary: - memtest86 test #7 false positives (random number sequence error)
+ memtest86+ test #7 false positives (random number sequence error)
Lars Ola Liavåg (l-liavag) wrote :

I got this bug too. Tried to figure out problems with an old computer gone unstable and found memtest86+ claiming that all three RAM sticks were faulty. Then tried some good (I thought) sticks from another computer with the same result. Then tried all sticks from both computers in the second computer. Then went to my problem-free main system and found that all of my brand new memory in that computer was also supposed to be bad.

Having all the RAM memory in the house in every computer I own all of a sudden gone rotten on test #7 didn't seem right at all, and I've been searching for days before I found this thread. Slipped in a 12.04 and ran memtest86+ from there and voila: no problems anymore - at least not with the sticks I know to be all right.

Now, I can go back to the actual work of fixing the computer that actually has problems. But I hope this bug is fixed by the release of 13.04 because to me, this has been a real time-waster.

Hello Nizamov, or anyone else affected,

Accepted memtest86+ into quantal-proposed. The package will build now and be available at http://launchpad.net/ubuntu/+source/memtest86+/4.20-1.1ubuntu2.1 in a few hours, and then in the -proposed repository.

Please help us by testing this new package. See https://wiki.ubuntu.com/Testing/EnableProposed for documentation how to enable and use -proposed. Your feedback will aid us getting this update out to other Ubuntu users.

If this package fixes the bug for you, please add a comment to this bug, mentioning the version of the package you tested, and change the tag from verification-needed to verification-done. If it does not fix the bug for you, please add a comment stating that, and change the tag to verification-failed. In either case, details of your testing will help us make a better decision.

Further information regarding the verification process can be found at https://wiki.ubuntu.com/QATeam/PerformingSRUVerification . Thank you in advance!

Changed in memtest86+ (Ubuntu Quantal):
status: In Progress → Fix Committed
tags: added: verification-needed

Hello,

I have tried the package from quantal-proposed on the same notebook and PC that suffered from this bug. Now they both pass memtest without problems. Hope it helps,

Chris J Arges (arges) wrote :

@nizamov-shawkat

That does help thanks!

tags: added: verification-done
removed: verification-needed
Itaru Kitayama (itaru) wrote :

@Brian

After installing the updated memtest86+ onto my Quantal box, I no longer see those false errors in test #7.
(Went through 11 or more times)

The verification of this Stable Release Update has completed successfully and the package has now been released to -updates. Subsequently, the Ubuntu Stable Release Updates Team is being unsubscribed and will not receive messages about this bug report. In the event that you encounter a regression using the package from -updates please report a new bug using ubuntu-bug and tag the bug report regression-update so we can easily find any regressions.

Launchpad Janitor (janitor) wrote :

This bug was fixed in the package memtest86+ - 4.20-1.1ubuntu2.1

---------------
memtest86+ (4.20-1.1ubuntu2.1) quantal-proposed; urgency=low

  * Fix test#7 false positives due to compiler issues. (LP: #1071209)
    Original patch author Torsten Duwe <email address hidden>.
 -- Chris J Arges <email address hidden> Wed, 05 Dec 2012 12:31:52 -0600

Changed in memtest86+ (Ubuntu Quantal):
status: Fix Committed → Fix Released

Just confirming this seems fixed for the 12.3 DVDs.

I just checked this on the 12.3 build 0347 DVD. (The latest milestone as of Jan 19, 2013).

I had no issues. The same laptop with a 12.2 DVD shows the memtest86+ failure.

Changed in ubuntu-release-notes:
status: New → Incomplete

*** Bug 803806 has been marked as a duplicate of this bug. ***

solik (jankkhvej) wrote :

Downloaded the Ubuntu 12.10 i386 ISO, converted it to an OS X dmg and "burned" it to a USB stick.
Booted from this USB stick.
Began the memory test.
In every DIMM combination and on various chipsets, test #7 fails around 129M.
The chipsets were Intel P45 and Intel Q35; the memory was DDR2 400 and DDR2 333, running at 800 and 666 MHz.

Downloaded memtest86+ 4.20 from website – no fail.

Basically, the latest stable Ubuntu ISO is being distributed with a buggy memory test utility.

abssorb (abssorb) wrote :

Users are still exposed to this bug.

That the bug is fixed in the repo isn't going to help people as memtest86 is run from a liveCD or Live USB.

The buggy version is still in the download for 12.10, even though it's fixed in the repo. It's also in the 12.10 secure remix.
Confirmed: I downloaded 12.10 64 bit today. Memtest86+ is v4.20. The Test #7 fails with good memory- false positive.

Recommend the downloads should also contain the fix and perhaps amend fix status to be more helpful?

People are going to get a fresh download, create a bootable CD or LiveUSB directly from the download without involving the repo - and get the false positives.

I have a problem with a machine and I've spent many hours running tests - I'd read about this bug, saw the "fix released" status and initially eliminated it as a problem in my investigations. I'm sure others will do this also and assume fix released means it's in the latest downloads.

Even if you use the principle of RTFM, a false status against a false positive will be a headache to work around.

The impact for me: I have a new machine and I made the supplier send me 4x 8GB RAM replacements which weren't actually needed, plus 26 hours of testing time wasted. I would spare others that pain if I could.

With my MB / RAM config I'm getting the false positive with every ram stick in any slot, so if I can help with further testing I'd be happy to.

thedanyes (thedanyes) wrote :

I also encountered this bug while testing my PC with a burned 12.10 x64 disc. It was an older copy so I can't say I know whether this still affects ISOs downloaded today, but it's a pretty nasty bug. Probably wasted a good 3 hours of troubleshooting time for me, over a period of a week.

My hardware configuration: AMD A4-3800 on A75 chipset and tried with various DDR3 modules (Kingston, G. Skill, etc). Fails at 129MB or so on test 7, as previously reported.

dnel (dave.nelson) wrote :

I'm also seeing this bug off a burned copy of the Ubuntu 12.10 i386 version. Fails on test #7 at 129MB on all 3 modules in any of the motherboard's slots.

Pete Graner (pgraner) on 2013-04-22
Changed in ubuntu-release-notes:
status: Incomplete → Invalid