Add support for 16g huge pages on Ubuntu 16.04.2 PowerNV

Bug #1706247 reported by bugproxy on 2017-07-25
Affects                            Importance  Assigned to
The Ubuntu-power-systems project   Medium      Canonical Kernel Team
linux (Ubuntu)                     Medium      Joseph Salisbury
Zesty                              Medium      Joseph Salisbury
Artful                             Medium      Joseph Salisbury
Bionic                             Medium      Joseph Salisbury

Bug Description

16G Huge Pages are not supported on PowerNV (bare metal) installations. Ubuntu 16.04.2 still allows 16G huge pages to be turned on.

Contact Information = Mike Hollinger (<email address hidden>)

---uname output---
Linux aprilmin4 4.4.0-83-generic #106-Ubuntu SMP Mon Jun 26 17:53:54 UTC 2017 ppc64le ppc64le ppc64le GNU/Linux

Machine Type = 8335-GTB

---Debugger---
A debugger is not configured

---Steps to Reproduce---
1. Add the following to the kernel boot args (either by editing the petitboot boot option manually, or by updating /boot/grub/grub.cfg):

default_hugepagesz=16G hugepagesz=16G hugepages=4

2. Boot Linux
3. Observe that the kernel believes 16G huge pages are available:
ubuntu@aprilmin7:~$ cat /proc/meminfo | grep -i huge
AnonHugePages: 475136 kB
HugePages_Total: 0
HugePages_Free: 0
HugePages_Rsvd: 0
HugePages_Surp: 0
Hugepagesize: 16777216 kB
ubuntu@aprilmin7:~$ ls /sys/devices/system/node/node0/hugepages
hugepages-1024kB hugepages-16384kB hugepages-16777216kB
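
The mismatch in the output above (a 16G default huge page size alongside an empty pool) can be checked mechanically. The following is a minimal Python sketch and not part of the original report: the function names `parse_meminfo_huge` and `sixteen_g_pool_is_empty` are hypothetical, and the sketch assumes the `/proc/meminfo` field layout shown in the transcript.

```python
# Sample taken from the /proc/meminfo output quoted in this report.
SAMPLE = """\
AnonHugePages: 475136 kB
HugePages_Total: 0
HugePages_Free: 0
HugePages_Rsvd: 0
HugePages_Surp: 0
Hugepagesize: 16777216 kB
"""


def parse_meminfo_huge(text):
    """Return the huge-page related fields of a meminfo dump as integers."""
    fields = {}
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        if not sep or "huge" not in key.lower():
            continue  # not a "Name: value" line, or not huge-page related
        parts = rest.split()
        if parts and parts[0].isdigit():
            fields[key.strip()] = int(parts[0])
    return fields


def sixteen_g_pool_is_empty(fields):
    """True when the default huge page size is 16G but no pages were allocated."""
    sixteen_g_kb = 16 * 1024 * 1024  # 16 GiB expressed in kB (16777216)
    return (
        fields.get("Hugepagesize") == sixteen_g_kb
        and fields.get("HugePages_Total", 0) == 0
    )
```

On a live system, the text passed to `parse_meminfo_huge` would be the contents of `/proc/meminfo`; with the sample above, the check reports the symptom described in this bug.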

Stack trace output:
 no

Oops output:
 no

System Dump Info:
  The system is not configured to capture a system dump.

*Additional Instructions for Mike Hollinger (<email address hidden>):
-Post a private note with access information to the machine that the bug is occurring on.
-Attach sysctl -a output to the bug.

This may be recreated on any bare metal POWER8 server with Ubuntu 16.04.2; I haven't checked other versions of Ubuntu.

bugproxy (bugproxy) on 2017-07-25
tags: added: architecture-ppc64le bugnameltc-156931 severity-medium targetmilestone-inin---
Changed in ubuntu:
assignee: nobody → Ubuntu on IBM Power Systems Bug Triage (ubuntu-power-triage)
affects: ubuntu → kernel-package (Ubuntu)
Changed in ubuntu-power-systems:
importance: Undecided → Medium
assignee: nobody → Canonical Kernel Team (canonical-kernel-team)
affects: kernel-package (Ubuntu) → linux (Ubuntu)
Changed in linux (Ubuntu):
status: New → Triaged
importance: Undecided → Medium
tags: added: kernel-da-key
Changed in ubuntu-power-systems:
status: New → Triaged
Manoj Iyer (manjo) on 2017-07-31
summary: - Possible to turn on 16g huge pages on Ubuntu 16.04.2 PowerNV
+ It should not be possible to turn on 16g huge pages on Ubuntu 16.04.2
+ PowerNV
tags: added: triage-g

------- Comment From <email address hidden> 2017-08-02 00:43 EDT-------
Canonical, any update here?

bugproxy (bugproxy) wrote :

------- Comment From <email address hidden> 2017-08-04 02:45 EDT-------
Canonical, can you fix this?

16G Huge Pages are not supported on PowerNV (bare metal) installations. Ubuntu 16.04.2 still allows 16G huge pages to be turned on.

tags: added: kernel-key
removed: kernel-da-key
tags: added: kernel-da-key
removed: kernel-key

Do you happen to know if there is a boot or sysctl option that can be used to disable 16G Huge Pages?

------- Comment From <email address hidden> 2017-10-09 01:29 EDT-------
Aneesh, (from Canonical) Do you happen to know if there is a boot or sysctl option that can be used to disable 16G Huge Pages?

bugproxy (bugproxy) wrote :

------- Comment From <email address hidden> 2017-10-26 01:24 EDT-------
We do support 16GB huge pages from a hardware point of view. What was missing was the ability to allocate such a large contiguous memory region in the kernel. The pseries platform solved this with help from the hypervisor.
PAPR introduced a mechanism for the hypervisor to mark 16G contiguous regions as reserved and pass the details via the device tree. On PowerNV this is not available, and hence we are not able to make use of 16G huge pages.

I have added the ability to allocate 16G hugepages on powernv platform. The relevant fixes can be found at

commit 4ae279c2c96ab38a78b954d218790a8f6db714e5
Author: Aneesh Kumar K.V <email address hidden>
Date: Fri Jul 28 10:31:27 2017 +0530

powerpc/mm/hugetlb: Allow runtime allocation of 16G.

You may want that and other dependent kernel patches.

-aneesh

I built X, Z and A test kernels with commit 4ae279c2c96ab38a78b954d218790a8f6db714e5. The test kernels can be downloaded from:

http://kernel.ubuntu.com/~jsalisbury/lp1706247/

Can you test these kernels and see if they resolve this bug?

Changed in linux (Ubuntu):
status: Triaged → In Progress
assignee: Ubuntu on IBM Power Systems Bug Triage (ubuntu-power-triage) → Joseph Salisbury (jsalisbury)
Changed in linux (Ubuntu Xenial):
status: New → In Progress
Changed in linux (Ubuntu Zesty):
status: New → In Progress
Changed in linux (Ubuntu Artful):
status: New → In Progress
Changed in linux (Ubuntu Xenial):
importance: Undecided → Medium
Changed in linux (Ubuntu Zesty):
importance: Undecided → Medium
Changed in linux (Ubuntu Artful):
importance: Undecided → Medium
Changed in linux (Ubuntu Xenial):
assignee: nobody → Joseph Salisbury (jsalisbury)
Changed in linux (Ubuntu Zesty):
assignee: nobody → Joseph Salisbury (jsalisbury)
Changed in linux (Ubuntu Artful):
assignee: nobody → Joseph Salisbury (jsalisbury)

------- Comment From <email address hidden> 2017-10-26 23:16 EDT-------
Hi,

There are other dependent patches needed. The full series that we need to support 16G on powernv includes:

commit 4ae279c2c96ab38a78b954d218790a8f6db714e5
Author: Aneesh Kumar K.V <email address hidden>
Date: Fri Jul 28 10:31:27 2017 +0530

powerpc/mm/hugetlb: Allow runtime allocation of 16G.
commit 79cc38ded1e1ac86e69c90f604efadd50b0b3762
Author: Aneesh Kumar K.V <email address hidden>
Date: Fri Jul 28 10:31:26 2017 +0530

powerpc/mm/hugetlb: Add support for reserving gigantic huge pages via kernel command line
commit e24a1307ba1f99fc62a0bd61d5e87fcfb6d5503d
Author: Aneesh Kumar K.V <email address hidden>
Date: Fri Jul 28 10:31:25 2017 +0530

mm/hugetlb: Allow arch to override and call the weak function
commit 40692eb5eea209c2dd55857f44b4e1d7206e91d6
Author: Aneesh Kumar K.V <email address hidden>
Date: Thu Jul 6 15:39:20 2017 -0700

powerpc/mm/hugetlb: add support for 1G huge pages
commit e1073d1e7920946ac4776a619cc40668b9e1401b
Author: Aneesh Kumar K.V <email address hidden>
Date: Thu Jul 6 15:39:17 2017 -0700

mm/hugetlb: clean up ARCH_HAS_GIGANTIC_PAGE

Changed in ubuntu-power-systems:
status: Triaged → In Progress

I built an Artful (17.10) test kernel with the following three commits:

4ae279c2 powerpc/mm/hugetlb: Allow runtime allocation of 16G.
79cc38de powerpc/mm/hugetlb: Add support for reserving gigantic huge pages via kernel command line
e24a1307 mm/hugetlb: Allow arch to override and call the weak function

Commits e1073d1e7 and 40692eb5 were not needed since they are already in Artful/4.13.
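
The selection described here (the full series minus whatever the target kernel already carries) can be sketched in a few lines of Python. This is an illustration only: `commits_to_backport` is a hypothetical helper, and the sketch assumes the series was listed newest first, which the author dates in the thread suggest, so the result is reversed into oldest-first order for applying.

```python
# Full series as listed in the thread (assumed newest first).
SERIES_NEWEST_FIRST = [
    "4ae279c2c96ab38a78b954d218790a8f6db714e5",
    "79cc38ded1e1ac86e69c90f604efadd50b0b3762",
    "e24a1307ba1f99fc62a0bd61d5e87fcfb6d5503d",
    "40692eb5eea209c2dd55857f44b4e1d7206e91d6",
    "e1073d1e7920946ac4776a619cc40668b9e1401b",
]

# Commits reported in the thread as already present in Artful/4.13.
ALREADY_IN_ARTFUL = {
    "40692eb5eea209c2dd55857f44b4e1d7206e91d6",
    "e1073d1e7920946ac4776a619cc40668b9e1401b",
}


def commits_to_backport(series_newest_first, already_present):
    """Return the missing commits in oldest-first (apply) order."""
    return [
        commit
        for commit in reversed(series_newest_first)
        if commit not in already_present
    ]


ARTFUL_BACKPORTS = commits_to_backport(SERIES_NEWEST_FIRST, ALREADY_IN_ARTFUL)
```

For Artful this leaves the three commits named above (e24a1307, 79cc38de, 4ae279c2), in that order.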

The test kernel can be downloaded from:

http://kernel.ubuntu.com/~jsalisbury/lp1706247/artful

Can you test this kernel and see if it resolves this bug?

I will work on building Zesty and Xenial test kernels next. They require those two extra commits and need some backporting.

Changed in linux (Ubuntu):
status: In Progress → Incomplete
Changed in linux (Ubuntu Xenial):
status: In Progress → Incomplete
Changed in linux (Ubuntu Zesty):
status: In Progress → Incomplete
Changed in linux (Ubuntu Artful):
status: In Progress → Incomplete
Changed in ubuntu-power-systems:
status: In Progress → Incomplete

------- Comment From <email address hidden> 2017-12-14 03:49 EDT-------
Hi ,

I verified this on the kernel available at http://kernel.ubuntu.com/~jsalisbury/lp1706247/artful/ and was able to allocate 16GB huge pages.

ubuntu@ltc-garri3:~$ cat /proc/cmdline
root=UUID=9db8977d-1eec-4245-b1d9-b4f2aa615a36 ro default_hugepagesz=16G hugepagesz=16G hugepages=4 quiet splash
ubuntu@ltc-garri3:~$ cat /proc/meminfo | grep -i huge
AnonHugePages: 0 kB
ShmemHugePages: 0 kB
HugePages_Total: 4
HugePages_Free: 4
HugePages_Rsvd: 0
HugePages_Surp: 0
Hugepagesize: 16777216 kB
ubuntu@ltc-garri3:~$ uname -a
Linux ltc-garri3 4.13.0-16-generic #19~lp1706247 SMP Wed Nov 1 23:21:53 UTC 2017 ppc64le ppc64le ppc64le GNU/Linux

ubuntu@ltc-garri3:~$ ls /sys/devices/system/node/node0/hugepages/
hugepages-16384kB hugepages-16777216kB
ubuntu@ltc-garri3:~$

Regards
Praveen
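
The verification above compares three things by eye: the `hugepages=` count on the kernel command line, `HugePages_Total` in `/proc/meminfo`, and the size directories under sysfs. A minimal Python sketch of the same comparison follows. It is not from the thread; the helper names are hypothetical, and it deliberately reads only the first `hugepages=` argument, which is a simplification for command lines that reserve several page sizes.

```python
import re


def requested_hugepages(cmdline):
    """Return the first hugepages=N value on a kernel command line, or None."""
    m = re.search(r"(?:^|\s)hugepages=(\d+)(?=\s|$)", cmdline)
    return int(m.group(1)) if m else None


def pool_matches_request(cmdline, meminfo_text):
    """True when HugePages_Total equals the count requested at boot."""
    wanted = requested_hugepages(cmdline)
    m = re.search(r"^HugePages_Total:\s+(\d+)", meminfo_text, re.MULTILINE)
    return wanted is not None and m is not None and int(m.group(1)) == wanted


def sysfs_dir_to_size(name):
    """Convert a sysfs name such as 'hugepages-16777216kB' to '16G'."""
    m = re.fullmatch(r"hugepages-(\d+)kB", name)
    if m is None:
        raise ValueError(f"unexpected directory name: {name!r}")
    kb = int(m.group(1))
    for unit, factor in (("G", 1024 * 1024), ("M", 1024)):
        if kb % factor == 0:
            return f"{kb // factor}{unit}"
    return f"{kb}K"


# Values copied from the transcript in this comment.
CMDLINE = (
    "root=UUID=9db8977d-1eec-4245-b1d9-b4f2aa615a36 ro "
    "default_hugepagesz=16G hugepagesz=16G hugepages=4 quiet splash"
)
MEMINFO = "HugePages_Total: 4\nHugePages_Free: 4\nHugepagesize: 16777216 kB\n"
```

With the values from this comment, the requested count (4) matches the allocated pool, and the two sysfs directories correspond to 16M and 16G page sizes.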

Changed in linux (Ubuntu Artful):
status: Incomplete → In Progress
Changed in linux (Ubuntu Zesty):
status: Incomplete → In Progress
Changed in linux (Ubuntu Xenial):
status: Incomplete → In Progress
Changed in linux (Ubuntu):
status: Incomplete → In Progress
summary: - It should not be possible to turn on 16g huge pages on Ubuntu 16.04.2
- PowerNV
+ Add support for 16g huge pages on Ubuntu 16.04.2 PowerNV
Joseph Salisbury (jsalisbury) wrote :

SRU request submitted for Artful:
https://lists.ubuntu.com/archives/kernel-team/2017-December/088932.html

The Xenial and Zesty SRU request will also be sent shortly.

Joseph Salisbury (jsalisbury) wrote :

I built a Zesty test kernel with the following six patches:

a525108cf1cc14651602d678da38fa627a76a724
e1073d1e7920946ac4776a619cc40668b9e1401b
40692eb5eea209c2dd55857f44b4e1d7206e91d6
e24a1307ba1f99fc62a0bd61d5e87fcfb6d5503d
79cc38ded1e1ac86e69c90f604efadd50b0b3762
4ae279c2c96ab38a78b954d218790a8f6db714e5

The test kernel can be downloaded from:

http://kernel.ubuntu.com/~jsalisbury/lp1706247/zesty

Can you test this kernel and see if it resolves this bug?

bugproxy (bugproxy) on 2018-01-12
tags: added: targetmilestone-inin1704
removed: targetmilestone-inin---
bugproxy (bugproxy) wrote :

------- Comment From <email address hidden> 2018-01-12 06:04 EDT-------
(In reply to comment #29)
> I built a Zesty test kernel with the following six patches:
>
> a525108cf1cc14651602d678da38fa627a76a724
> e1073d1e7920946ac4776a619cc40668b9e1401b
> 40692eb5eea209c2dd55857f44b4e1d7206e91d6
> e24a1307ba1f99fc62a0bd61d5e87fcfb6d5503d
> 79cc38ded1e1ac86e69c90f604efadd50b0b3762
> 4ae279c2c96ab38a78b954d218790a8f6db714e5
>
> The test kernel can be downloaded from:
>
> http://kernel.ubuntu.com/~jsalisbury/lp1706247/zesty
>
> Can you test this kernel and see if it resolves this bug?

https://bugzilla.linux.ibm.com/show_bug.cgi?id=156931#c25

Praveen K. Pandey 2017-12-14 02:49:52 CST

Hi ,

I verified this on the kernel available at http://kernel.ubuntu.com/~jsalisbury/lp1706247/artful/ and was able to allocate 16GB huge pages.

Stefan Bader (smb) on 2018-01-23
Changed in linux (Ubuntu Zesty):
status: In Progress → Won't Fix
no longer affects: linux (Ubuntu Xenial)
Changed in linux (Ubuntu Artful):
status: In Progress → Fix Committed
Manoj Iyer (manjo) on 2018-02-05
Changed in ubuntu-power-systems:
status: Incomplete → In Progress
Changed in linux (Ubuntu Bionic):
status: In Progress → Fix Released
Manoj Iyer (manjo) on 2018-03-05
Changed in ubuntu-power-systems:
status: In Progress → Fix Committed
Stefan Bader (smb) wrote :

This bug is awaiting verification that the kernel in -proposed solves the problem. Please test the kernel and update this bug with the results. If the problem is solved, change the tag 'verification-needed-artful' to 'verification-done-artful'. If the problem still exists, change the tag 'verification-needed-artful' to 'verification-failed-artful'.

If verification is not done by 5 working days from today, this fix will be dropped from the source code, and this bug will be closed.

See https://wiki.ubuntu.com/Testing/EnableProposed for documentation how to enable and use -proposed. Thank you!

tags: added: verification-needed-artful
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package linux - 4.13.0-38.43

---------------
linux (4.13.0-38.43) artful; urgency=medium

  * linux: 4.13.0-38.43 -proposed tracker (LP: #1755762)

  * Servers going OOM after updating kernel from 4.10 to 4.13 (LP: #1748408)
    - i40e: Fix memory leak related filter programming status
    - i40e: Add programming descriptors to cleaned_count

  * [SRU] Lenovo E41 Mic mute hotkey is not responding (LP: #1753347)
    - platform/x86: ideapad-laptop: Increase timeout to wait for EC answer

  * fails to dump with latest kpti fixes (LP: #1750021)
    - kdump: write correct address of mem_section into vmcoreinfo

  * headset mic can't be detected on two Dell machines (LP: #1748807)
    - ALSA: hda/realtek - Support headset mode for ALC215/ALC285/ALC289
    - ALSA: hda - Fix headset mic detection problem for two Dell machines
    - ALSA: hda - Fix a wrong FIXUP for alc289 on Dell machines

  * CIFS SMB2/SMB3 does not work for domain based DFS (LP: #1747572)
    - CIFS: make IPC a regular tcon
    - CIFS: use tcon_ipc instead of use_ipc parameter of SMB2_ioctl
    - CIFS: dump IPC tcon in debug proc file

  * i2c-thunderx: erroneous error message "unhandled state: 0" (LP: #1754076)
    - i2c: octeon: Prevent error message on bus error

  * hisi_sas: Add disk LED support (LP: #1752695)
    - scsi: hisi_sas: directly attached disk LED feature for v2 hw

  * EDAC, sb_edac: Backport 1 patch to Ubuntu 17.10 (Fix missing DIMM sysfs
    entries with KNL SNC2/SNC4 mode) (LP: #1743856)
    - EDAC, sb_edac: Fix missing DIMM sysfs entries with KNL SNC2/SNC4 mode

  * [regression] Colour banding and artefacts appear system-wide on an Asus
    Zenbook UX303LA with Intel HD 4400 graphics (LP: #1749420)
    - drm/edid: Add 6 bpc quirk for CPT panel in Asus UX303LA

  * DVB Card with SAA7146 chipset not working (LP: #1742316)
    - vmalloc: fix __GFP_HIGHMEM usage for vmalloc_32 on 32b systems

  * [Asus UX360UA] battery status in unity-panel is not changing when battery is
    being charged (LP: #1661876) // AC adapter status not detected on Asus
    ZenBook UX410UAK (LP: #1745032)
    - ACPI / battery: Add quirk for Asus UX360UA and UX410UAK

  * ASUS UX305LA - Battery state not detected correctly (LP: #1482390)
    - ACPI / battery: Add quirk for Asus GL502VSK and UX305LA

  * support thunderx2 vendor pmu events (LP: #1747523)
    - perf pmu: Extract function to get JSON alias map
    - perf pmu: Pass pmu as a parameter to get_cpuid_str()
    - perf tools arm64: Add support for get_cpuid_str function.
    - perf pmu: Add helper function is_pmu_core to detect PMU CORE devices
    - perf vendor events arm64: Add ThunderX2 implementation defined pmu core
      events
    - perf pmu: Add check for valid cpuid in perf_pmu__find_map()

  * lpfc.ko module doesn't work (LP: #1746970)
    - scsi: lpfc: Fix loop mode target discovery

  * Ubuntu 17.10 crashes on vmalloc.c (LP: #1739498)
    - powerpc/mm/book3s64: Make KERN_IO_START a variable
    - powerpc/mm/slb: Move comment next to the code it's referring to
    - powerpc/mm/hash64: Make vmalloc 56T on hash

  * ethtool -p fails to light NIC LED on HiSilicon D05 systems (LP: #1748567)
    - net...

Changed in linux (Ubuntu Artful):
status: Fix Committed → Fix Released
Changed in ubuntu-power-systems:
status: Fix Committed → Fix Released