Comment 12 for bug 1019863

fpgahardwareengineer (mypersonalmailbox1) wrote :

Hi Christopher,

I don't have access to this mainboard for 2 more weeks due to travel, but before I departed, I tested the board with a different set of PC133 SDRAM DIMMs.
Interestingly, it did not cause a freeze with these DIMM modules.
I put my computer into standby about 10 times, and the computer came out of standby 10 times in a row.
I ran Firefox in the background to put more "stress" on the system.
Based on my judgement, it seems like with the previous DIMM modules, the DRAM is getting corrupted after coming out of ACPI S3 State.
It seems like the slow refresh mode is not working properly on at least one of the previous DIMM modules.
By the way, the previous DIMM modules do pass Memtest86+ without any errors, but I have learned from troubleshooting many PC related problems that some bad DIMM modules corrupt memory cells only when they enter slow refresh mode.
The corruption manifests itself when the computer wakes up, after the contents of the memory have already been corrupted.
The previous DIMM configuration was something like this:

- PNY PC133 256MB module (DS)
- generic PC133 128MB module (SS, Micron Technology DRAM chip)
- generic PC133 128MB module (SS)

A "generic" DIMM module means it is not obvious from the DIMM who manufactured it.
The configuration that didn't cause any issues was this:

- Toshiba PC133 256MB module (DS, Toshiba DRAM chip)
- Toshiba PC133 256MB module (DS, Toshiba DRAM chip)

SS means "Single Sided."
DS means "Double Sided."
I do have more PC133 SDRAM DIMMs so I will test the mainboard with more DIMMs to see if this bug is specific to one bad module.
I will update the above list when I get back.
At this point, it is increasingly likely that a bad DIMM is causing this mysterious bug.
If this is the case, I am very sorry for wasting the resources of the Linux developers, but unfortunately, it is often difficult to catch hardware failures that involve slow refresh related memory cell corruption.
Memtest86+ does not have the means to catch this kind of problem, and I saw this type of bug with a DDR SDRAM DIMM a few months ago.
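
A minimal sketch of how this kind of failure could be checked from user space, assuming Python 3 is available on the machine: fill a buffer with a reproducible pattern, suspend and resume the computer by hand, then compare the buffer against the same pattern. The function names, the buffer size, and the seed are all hypothetical choices for illustration. The buffer covers only part of RAM and may not land on the faulty module, so a clean result proves nothing.

```python
import random


def make_pattern(size, seed):
    """Return `size` reproducible pseudo-random bytes for the given seed."""
    rng = random.Random(seed)
    return rng.getrandbits(size * 8).to_bytes(size, "little")


def find_mismatches(buf, seed, limit=16):
    """Return up to `limit` offsets where `buf` differs from the pattern."""
    expected = make_pattern(len(buf), seed)
    if bytes(buf) == expected:
        return []
    bad = [i for i, (a, b) in enumerate(zip(buf, expected)) if a != b]
    return bad[:limit]


def wait_for_user():
    """Default wait step: the user suspends and resumes the machine manually."""
    input("Suspend the machine now, resume it, then press Enter... ")


def run_check(size_mb=64, seed=1019863, wait=wait_for_user):
    """Fill a buffer, wait across a suspend/resume cycle, then verify it."""
    buf = bytearray(make_pattern(size_mb * 1024 * 1024, seed))
    wait()
    return find_mismatches(buf, seed)
```

Calling `run_check()` and getting back an empty list means the buffer survived the suspend cycle. A non-empty list gives the first few corrupted offsets within the buffer, not physical addresses.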

Regards,

fpgahardwareengineer