[2.3, HWTv2] Memtester test is not robust
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
MAAS | Fix Released | Critical | Andres Rodriguez |
Bug Description
On machines with 32 GB of RAM, I noticed that memtester always fails when running hardware testing. The errors I saw showed that MAAS timed out after about 11 minutes, and the event log showed that the machines had missed the past 5 heartbeats.
I then ran the tests manually on the same machine and came across a couple of interesting things:
### This is the test that MAAS runs:
root@node04:
memtester version 4.3.0 (64-bit)
Copyright (C) 2001-2012 Charles Cazabon.
Licensed under the GNU General Public License version 2 (only).
pagesize is 4096
pagesizemask is 0xfffffffffffff000
want 31289MB (32809152512 bytes)
got 31284MB (32803975168 bytes), trying mlock ...Killed
### I changed this to use MemAvailable instead, and the machine locked up.
root@node04:
memtester version 4.3.0 (64-bit)
Copyright (C) 2001-2012 Charles Cazabon.
Licensed under the GNU General Public License version 2 (only).
pagesize is 4096
pagesizemask is 0xfffffffffffff000
want 31081MB (32591736832 bytes)
got 31081MB (32591736832 bytes), trying mlock ...locked.
Loop 1/1:
Stuck Address : setting 0
Related branches
- Lee Trager (community): Approve
- Diff: 39 lines (+13/-6), 2 files modified
  - src/maasserver/static/partials/cards/storage.html (+1/-2)
  - src/metadataserver/builtin_scripts/memtester.sh (+12/-4)
Changed in maas: | |
milestone: | none → 2.3.0beta3 |
importance: | Undecided → Critical |
Changed in maas: | |
assignee: | nobody → Andres Rodriguez (andreserl) |
Changed in maas: | |
status: | Confirmed → In Progress |
Changed in maas: | |
status: | In Progress → Fix Committed |
Changed in maas: | |
status: | Fix Committed → Fix Released |
$(awk '/MemFree/ { print ($2 - 32768) "K"}' gave me: 31787072K
I re-ran the test manually with: sudo -n memtester 29000000K 1
This time it didn't lock up.
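The runs above suggest that asking memtester for nearly all free memory leaves too little headroom, while a noticeably smaller request completes. Below is a minimal sketch of one way to compute a more conservative size. It is not the change that landed in memtester.sh. The function name `memtester_size`, the percentage-based reserve, and the 10% default are assumptions made for illustration only.

```shell
#!/bin/sh
# Hypothetical helper (not part of MAAS): print a size argument for
# memtester, in kB with a "K" suffix, leaving a percentage of memory
# unused as headroom.
#   $1: path to a meminfo-style file (default: /proc/meminfo)
#   $2: percentage of memory to leave untested (default: 10, an assumption)
memtester_size() {
    meminfo="${1:-/proc/meminfo}"
    reserve_pct="${2:-10}"
    awk -v pct="$reserve_pct" '
        /^MemAvailable:/ { avail = $2 }
        /^MemFree:/      { free = $2 }
        END {
            # Prefer MemAvailable; fall back to MemFree if it is absent.
            base = (avail > 0) ? avail : free
            printf "%dK\n", base * (100 - pct) / 100
        }' "$meminfo"
}
```

It could then be used in place of a hard-coded size, for example `sudo -n memtester "$(memtester_size)" 1`. Whether a 10% reserve is enough on a given machine would need to be checked by running the test there.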