Wrong RAM size shown for server

Bug #1777131 reported by Alexander Litvinov
This bug affects 3 people
Affects       | Status       | Importance | Assigned to     | Milestone
MAAS          | Invalid      | Undecided  | Unassigned      |
lshw (Ubuntu) | Fix Released | High       | Unassigned      |
Xenial        | Fix Released | Medium     | Eric Desrochers |

Bug Description

[IMPACT]
Systems with DIMMs larger than 32GB don't report the correct size when using 'lshw'.

This gives wrong information to hardware inventory systems that collect data from 'lshw' and to hardware certification vendor testing, and it results in the wrong RAM amount being shown in the MAAS UI.

[TEST CASE]

* Use a system running Xenial with DIMMs larger than 32GB
 ** sudo lshw -C memory

Output:
...
*-bank:0
          description: <REDACTED>
          product: <REDACTED>
          vendor: <REDACTED>
          physical id: <REDACTED>
          serial: <REDACTED>
          slot: <REDACTED>
==> size: 31GiB
          ...
...

In this case, size should be 64GB, just like 'dmidecode' shows
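
For context, a hedged sketch of how a 64GB module can end up displayed as 31GiB. It is based on the SMBIOS specification's Memory Device (Type 17) layout, not on the lshw source: the 16-bit Size field cannot express 32GB or more, so larger modules store 0x7FFF there and carry the real value in the 32-bit Extended Size field. The function and parameter names below are illustrative only.

```python
# Sketch of SMBIOS Type 17 (Memory Device) size decoding.
# Assumption: field semantics follow the SMBIOS spec (2.7 and later);
# all names here are hypothetical and not taken from lshw.

SIZE_UNKNOWN = 0xFFFF      # size is unknown
SIZE_IN_EXTENDED = 0x7FFF  # real size lives in the Extended Size field


def dimm_size_mib(size, extended_size=None):
    """Return the module size in MiB, or None if empty or unknown."""
    if size == 0 or size == SIZE_UNKNOWN:
        return None
    if size == SIZE_IN_EXTENDED and extended_size is not None:
        # Bits 30:0 of Extended Size hold the size in MiB; bit 31 is reserved.
        return extended_size & 0x7FFFFFFF
    if size & 0x8000:
        # Bit 15 set: the low 15 bits are expressed in KiB.
        return (size & 0x7FFF) // 1024
    return size


def whole_gib(mib):
    """Truncate MiB to whole GiB."""
    return mib // 1024


without_extended = dimm_size_mib(0x7FFF)          # Extended Size ignored
with_extended = dimm_size_mib(0x7FFF, 64 * 1024)  # Extended Size honoured
print(whole_gib(without_extended), whole_gib(with_extended))  # 31 64
```

A decoder that ignores Extended Size reads the 0x7FFF marker as 32767 MiB, which truncates to the 31GiB seen in the output above.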

[POTENTIAL REGRESSION]

Very low. The fix only touches DIMM size handling, to support DIMMs larger than 32GB.

A test package has been provided to an impacted user. Here is what has been brought to my attention:

"
That seems to be working. Dumped a copy of the lshw output to two files, "before" and "after".

$ diff before after
38c38
< size: 31GiB
---
> size: 64GiB

and so on down the line for each memory chip.
"

[OTHER INFORMATION]

Upstream bug:
https://ezix.org/project/ticket/662

Upstream commit:
https://ezix.org/src/pkg/lshw/commit/640615983fbf976e66931164a9ae1bd64da9668b

I'm working on a backport fix for Xenial.

# git describe --contains 6406159
B.02.17~26

# rmadison
=> lshw | 02.17-1.1ubuntu3.5 | xenial-updates
   lshw | 02.18-0.1ubuntu6 | bionic
   lshw | 02.18-0.1ubuntu6.18.04.1 | bionic-updates
   lshw | 02.18-0.1ubuntu7 | cosmic
   lshw | 02.18-0.1ubuntu7 | disco
   lshw | 02.18.85-0.1ubuntu1 | eoan

[ORIGINAL DESCRIPTION]

Currently MAAS relies on DMI for the info about RAM size.
DMI does not always seem to be correct, which results in the wrong RAM amount being shown in the MAAS UI.

In my case :

......
handle: DMI:0017
            - lshw:description:
              DIMM Synchronous 2666 MHz (0.4 ns)
            - lshw:product:
              M386A8K40BM2-CTD
            - lshw:vendor:
              Samsung
            - lshw:physid:
              0
            - lshw:serial:
              375610DE
            - lshw:slot:
              P1-DIMMA1
            - lshw:size:
              units: bytes
              34358689792
......

full machine yaml : https://pastebin.canonical.com/p/TqpvzXj2sx/

However product M386A8K40BM2-CTD is actually 64GB:
https://memory.net/product/m386a8k40bm2-ctd-samsung-1x-64gb-ddr4-2666-lrdimm-pc4-21300v-l-quad-rank-x4-module/
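
The byte count in the lshw excerpt above deserves a second look. As a quick arithmetic check (a sketch; only the 34358689792 figure comes from this report), it is exactly 32767 MiB, i.e. 0x7FFF, the largest value that fits in 15 bits. That suggests a saturated size field rather than a real module capacity:

```python
MIB = 1024 ** 2
GIB = 1024 ** 3

reported_bytes = 34358689792  # lshw:size from the excerpt above

mib, remainder = divmod(reported_bytes, MIB)
print(mib, remainder, hex(mib))   # 32767 0 0x7fff
print(reported_bytes // GIB)      # 31 (whole GiB, truncated)
print(64 * GIB - reported_bytes)  # shortfall versus a 64GiB module, in bytes
```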

I have 12 of those, and on boot it shows me the correct amount 12 * 64GB:

ubuntu:~$ dmesg | grep Memory
[ 0.000000] Memory: 791161372K/803909324K available (8541K kernel code, 1313K rwdata, 4000K rodata, 1512K init, 1316K bss, 12747952K reserved, 0K cma-reserved)

ubuntu:~$ free -m
              total        used        free      shared  buff/cache   available
Mem:         772658        3828      768156          18         672      766751
Swap:          8191           0        8191
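
As a rough cross-check of the figures above (a sketch; the helper name is made up for illustration), the kernel's total from dmesg can be divided across the 12 slots and compared with candidate per-DIMM sizes. The kernel total sits slightly below the installed amount, so the comparison picks the nearest candidate instead of expecting an exact match.

```python
KIB_PER_GIB = 1024 ** 2


def nearest_dimm_size_gib(total_kib, dimm_count, candidates_gib):
    """Pick the candidate per-DIMM size closest to total / dimm_count."""
    per_dimm_gib = total_kib / dimm_count / KIB_PER_GIB
    return min(candidates_gib, key=lambda c: abs(c - per_dimm_gib))


dmesg_total_kib = 803909324  # second figure on the dmesg "Memory:" line
dimm_count = 12              # number of modules stated in this report

print(nearest_dimm_size_gib(dmesg_total_kib, dimm_count, [16, 32, 64, 128]))  # 64
```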

----------------------------------------------------------------

/var/log/maas : https://private-fileshare.canonical.com/~dima/varlogmaas-15062018.tar

ubuntu$ dpkg -l '*maas*'|cat
Desired=Unknown/Install/Remove/Purge/Hold
| Status=Not/Inst/Conf-files/Unpacked/halF-conf/Half-inst/trig-aWait/Trig-pend
|/ Err?=(none)/Reinst-required (Status,Err: uppercase=bad)
||/ Name Version Architecture Description
+++-===============================-====================================-============-=============================================
un maas <none> <none> (no description available)
ii maas-cli 2.3.3-6498-ge4db91d-0ubuntu1~16.04.1 all MAAS client and command-line interface
un maas-cluster-controller <none> <none> (no description available)
ii maas-common 2.3.3-6498-ge4db91d-0ubuntu1~16.04.1 all MAAS server common files
ii maas-dhcp 2.3.3-6498-ge4db91d-0ubuntu1~16.04.1 all MAAS DHCP server
ii maas-dns 2.3.3-6498-ge4db91d-0ubuntu1~16.04.1 all MAAS DNS server
ii maas-proxy 2.3.3-6498-ge4db91d-0ubuntu1~16.04.1 all MAAS Caching Proxy
ii maas-rack-controller 2.3.3-6498-ge4db91d-0ubuntu1~16.04.1 all Rack Controller for MAAS
ii maas-region-api 2.3.3-6498-ge4db91d-0ubuntu1~16.04.1 all Region controller API service for MAAS
ii maas-region-controller 2.3.3-6498-ge4db91d-0ubuntu1~16.04.1 all Region Controller for MAAS
un maas-region-controller-min <none> <none> (no description available)
un python-django-maas <none> <none> (no description available)
un python-maas-client <none> <none> (no description available)
un python-maas-provisioningserver <none> <none> (no description available)
ii python3-django-maas 2.3.3-6498-ge4db91d-0ubuntu1~16.04.1 all MAAS server Django web framework (Python 3)
ii python3-maas-client 2.3.3-6498-ge4db91d-0ubuntu1~16.04.1 all MAAS python API client (Python 3)
ii python3-maas-provisioningserver 2.3.3-6498-ge4db91d-0ubuntu1~16.04.1 all MAAS server provisioning libraries (Python 3)

Andres Rodriguez (andreserl) wrote :

Hi Alexander,

To confirm, what you are saying is that DMI shows the incorrect RAM size, and hence, since MAAS uses it to detect the RAM, MAAS shows the incorrect value?

Changed in maas:
status: New → Incomplete
description: updated
Alexander Litvinov (alitvinov) wrote :

Andres,

I think the bug is that MAAS shows wrong amount of RAM in the UI.
I'm seeing this onsite and our customer is not happy about this fact.

If it relies on DMI, then, perhaps there is some other way to detect that right?

tags: added: cpe-onsite
Andres Rodriguez (andreserl) wrote :

Alexander,

From the looks of the lshw output, it reports 12 DIMMs of ~34GB each. This means that the RAM size is being reported incorrectly by lshw and MAAS simply reflects that information. As such, I'll mark this invalid for MAAS and open a task for lshw.

That said, I noticed you are using 2.3.3, and I'm guessing you are using Xenial for commissioning. Could you please switch to Bionic for commissioning and see if the issue persists? Bionic has a newer version of lshw, which may have already fixed this issue.

Changed in maas:
status: Incomplete → Invalid
Launchpad Janitor (janitor) wrote :

Status changed to 'Confirmed' because the bug affects multiple users.

Changed in lshw (Ubuntu):
status: New → Confirmed
Alexander Litvinov (alitvinov) wrote :

Commissioning with Bionic indeed works fine, gets RAM correctly:
https://pastebin.canonical.com/p/G7B7zVMkFF/

Ante Karamatić (ivoks)
Changed in lshw (Ubuntu):
importance: Undecided → High
Eric Desrochers (slashd) wrote :

Upstream bug : https://ezix.org/project/ticket/662
Upstream commit : https://ezix.org/src/pkg/lshw/commit/640615983fbf976e66931164a9ae1bd64da9668b

I'm working on a backport fix for Xenial.

# git describe --contains 6406159
B.02.17~26

# rmadison
=> lshw | 02.17-1.1ubuntu3.5 | xenial-updates
    lshw | 02.18-0.1ubuntu6 | bionic
    lshw | 02.18-0.1ubuntu6.18.04.1 | bionic-updates
    lshw | 02.18-0.1ubuntu7 | cosmic
    lshw | 02.18-0.1ubuntu7 | disco
    lshw | 02.18.85-0.1ubuntu1 | eoan

Changed in lshw (Ubuntu):
status: Confirmed → Fix Released
Changed in lshw (Ubuntu Xenial):
assignee: nobody → Eric Desrochers (slashd)
importance: Undecided → Medium
status: New → In Progress
tags: added: sts
Eric Desrochers (slashd)
description: updated
Eric Desrochers (slashd) wrote :

I uploaded the package in the Xenial upload queue for SRU verification team approval. Once approved, the package will start building in xenial-proposed, entering the testing phase of the SRU.

Eric

Łukasz Zemczak (sil2100) wrote : Please test proposed package

Hello Alexander, or anyone else affected,

Accepted lshw into xenial-proposed. The package will build now and be available at https://launchpad.net/ubuntu/+source/lshw/02.17-1.1ubuntu3.6 in a few hours, and then in the -proposed repository.

Please help us by testing this new package. See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Your feedback will aid us getting this update out to other Ubuntu users.

If this package fixes the bug for you, please add a comment to this bug, mentioning the version of the package you tested and change the tag from verification-needed-xenial to verification-done-xenial. If it does not fix the bug for you, please add a comment stating that, and change the tag to verification-failed-xenial. In either case, without details of your testing we will not be able to proceed.

Further information regarding the verification process can be found at https://wiki.ubuntu.com/QATeam/PerformingSRUVerification . Thank you in advance for helping!

N.B. The updated package will be released to -updates after the bug(s) fixed by this package have been verified and the package has been in -proposed for a minimum of 7 days.

Changed in lshw (Ubuntu Xenial):
status: In Progress → Fix Committed
tags: added: verification-needed verification-needed-xenial
Eric Desrochers (slashd) wrote :

[VERIFICATION XENIAL]

I can't reproduce the problem myself as I don't have such hardware at my disposal, but the following has been brought to my attention by an impacted user:

This was tested against the lshw found in xenial-proposed.

It is now displaying the right size.

..........
*-bank:0
description: <REDACTED>
product: <REDACTED>
vendor: <REDACTED>
physical id: 0
serial: <REDACTED>
slot: <REDACTED>
=> size: 64GiB
width: 72 bits
clock: 2400MHz (0.4ns)
..........

tags: added: verification-done-xenial
removed: verification-needed-xenial
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package lshw - 02.17-1.1ubuntu3.6

---------------
lshw (02.17-1.1ubuntu3.6) xenial; urgency=medium

  * d/p/lshw-reports-wrong-memory-size.patch:
    - Implement DMI memory reporting when DIMM
      size is >32GiB (LP: #1777131)

 -- Eric Desrochers <email address hidden> Mon, 29 Apr 2019 20:00:59 +0000

Changed in lshw (Ubuntu Xenial):
status: Fix Committed → Fix Released
Łukasz Zemczak (sil2100) wrote : Update Released

The verification of the Stable Release Update for lshw has completed successfully and the package has now been released to -updates. Subsequently, the Ubuntu Stable Release Updates Team is being unsubscribed and will not receive messages about this bug report. In the event that you encounter a regression using the package from -updates please report a new bug using ubuntu-bug and tag the bug report regression-update so we can easily find any regressions.
