[2.3, HWTv2] Memtester test is not robust

Bug #1722848 reported by Andres Rodriguez
This bug affects 1 person
Affects: MAAS
Status: Fix Released
Importance: Critical
Assigned to: Andres Rodriguez

Bug Description

On machines with 32 GB of RAM, I noticed that memtester always fails when running hardware testing. The errors I saw showed that MAAS would time out at about 11 minutes, and the event log showed that the machines had missed the past 5 heartbeats.

I then ran the tests manually on the same machine and came across a couple of interesting things:

### This is the test that MAAS runs:

root@node04:/home/ubuntu# memtester $(awk '/MemFree/ { print ($2 - 32768) "K"}' /proc/meminfo) 1
memtester version 4.3.0 (64-bit)
Copyright (C) 2001-2012 Charles Cazabon.
Licensed under the GNU General Public License version 2 (only).

pagesize is 4096
pagesizemask is 0xfffffffffffff000
want 31289MB (32809152512 bytes)
got 31284MB (32803975168 bytes), trying mlock ...Killed
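
For reference, the size argument in the command above is the MemFree value from /proc/meminfo (reported in kB) minus a fixed 32768 kB (32 MiB) reserve. Below is a minimal sketch of the same arithmetic as a shell function. The function name memtester_size and its parameters are hypothetical and not part of MAAS; the meminfo path is a parameter only so the calculation can be checked against a sample file.

```shell
#!/bin/sh
# Hypothetical helper (not part of MAAS): print a memtester size argument
# computed as FIELD from meminfo minus a fixed reserve, both in kB.
# usage: memtester_size FIELD RESERVE_KB [MEMINFO_PATH]
memtester_size() {
    field=$1
    reserve_kb=$2
    meminfo=${3:-/proc/meminfo}
    # meminfo lines look like "MemFree:   32072956 kB"; $2 is the value in kB.
    awk -v f="$field:" -v r="$reserve_kb" \
        '$1 == f { print ($2 - r) "K" }' "$meminfo"
}

# Equivalent to the command above (left commented out in this sketch):
# memtester "$(memtester_size MemFree 32768)" 1
```

Working backwards from the output above, "want 31289MB (32809152512 bytes)" is 32040188K, which corresponds to a MemFree of 32072956 kB at the time the command ran.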

### I changed this to use MemAvailable instead, and the machine locked up.

root@node04:/home/ubuntu# memtester $(awk '/MemAvailable/ { print ($2 - 32768) "K"}' /proc/meminfo) 1
memtester version 4.3.0 (64-bit)
Copyright (C) 2001-2012 Charles Cazabon.
Licensed under the GNU General Public License version 2 (only).

pagesize is 4096
pagesizemask is 0xfffffffffffff000
want 31081MB (32591736832 bytes)
got 31081MB (32591736832 bytes), trying mlock ...locked.
Loop 1/1:
  Stuck Address : setting 0

Changed in maas:
milestone: none → 2.3.0beta3
importance: Undecided → Critical
Andres Rodriguez (andreserl) wrote:

$(awk '/MemFree/ { print ($2 - 32768) "K"}' /proc/meminfo) gave me: 31787072K

I re-ran the test manually with: sudo -n memtester 29000000K 1

This time it didn't lock up.
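
To put the two sizes side by side, here is a small worked calculation of how much extra headroom the manual run left compared with the computed value. It uses only the two numbers quoted in this comment; the variable names are illustrative, and the result does not establish the minimum reserve that is actually needed.

```shell
#!/bin/sh
# Sizes in kB, taken from the comment above.
computed_kb=31787072   # MemFree - 32768, as printed by the awk expression
manual_kb=29000000     # size passed by hand that did not lock up

# Extra memory left untested by the manual run, in kB and whole MiB.
headroom_kb=$((computed_kb - manual_kb))
headroom_mib=$((headroom_kb / 1024))
echo "extra headroom: ${headroom_kb} kB (about ${headroom_mib} MiB)"
```

In other words, the run that completed left roughly 2.7 GiB more memory untouched than the 32 MiB reserve alone would have.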

Changed in maas:
status: New → Confirmed
Changed in maas:
assignee: nobody → Andres Rodriguez (andreserl)
Changed in maas:
status: Confirmed → In Progress
Changed in maas:
status: In Progress → Fix Committed
Changed in maas:
status: Fix Committed → Fix Released