CONFIG_NR_CPUS=256 is too low

Bug #1579205 reported by Tim Wickberg
18
This bug affects 3 people
Affects Status Importance Assigned to Milestone
linux (Ubuntu)
Fix Released
Medium
Tim Gardner
Xenial
Fix Released
Undecided
Unassigned
Yakkety
Fix Released
Medium
Tim Gardner

Bug Description

Intel Knights Landing systems boot up to 272 processor cores. The stock Ubuntu Linux kernel images leave processor cores above 256 disabled:

[ 0.000000] ACPI: NR_CPUS/possible_cpus limit of 256 reached. Processor 256/0xeb ignored.
[ 0.000000] ACPI: NR_CPUS/possible_cpus limit of 256 reached. Processor 257/0xef ignored.
[ 0.000000] ACPI: NR_CPUS/possible_cpus limit of 256 reached. Processor 258/0xf3 ignored.
[ 0.000000] ACPI: NR_CPUS/possible_cpus limit of 256 reached. Processor 259/0xf7 ignored.
[ 0.000000] ACPI: NR_CPUS/possible_cpus limit of 256 reached. Processor 260/0xfb ignored.
[ 0.000000] ACPI: NR_CPUS/possible_cpus limit of 256 reached. Processor 261/0xff ignored.
[ 0.000000] ACPI: NR_CPUS/possible_cpus limit of 256 reached. Processor 262/0x103 ignored.
[ 0.000000] ACPI: NR_CPUS/possible_cpus limit of 256 reached. Processor 263/0x107 ignored.
[ 0.000000] ACPI: NR_CPUS/possible_cpus limit of 256 reached. Processor 264/0x10b ignored.
[ 0.000000] ACPI: NR_CPUS/possible_cpus limit of 256 reached. Processor 265/0x10f ignored.
[ 0.000000] ACPI: NR_CPUS/possible_cpus limit of 256 reached. Processor 266/0x113 ignored.
[ 0.000000] ACPI: NR_CPUS/possible_cpus limit of 256 reached. Processor 267/0x117 ignored.
[ 0.000000] ACPI: NR_CPUS/possible_cpus limit of 256 reached. Processor 268/0x11b ignored.
[ 0.000000] ACPI: NR_CPUS/possible_cpus limit of 256 reached. Processor 269/0x11f ignored.
[ 0.000000] ACPI: NR_CPUS/possible_cpus limit of 256 reached. Processor 270/0x123 ignored.
[ 0.000000] ACPI: NR_CPUS/possible_cpus limit of 256 reached. Processor 271/0x127 ignored.

I'd request an increase in CONFIG_NR_CPUS to 512. This would match Debian's default setting. For comparison, RHEL/CentOS ship with 4096.

Brad Figg (brad-figg)
affects: linux-meta (Ubuntu) → linux (Ubuntu)
Revision history for this message
Launchpad Janitor (janitor) wrote :

Status changed to 'Confirmed' because the bug affects multiple users.

Changed in linux (Ubuntu):
status: New → Confirmed
Changed in linux (Ubuntu):
importance: Undecided → Medium
tags: added: kernel-da-key
Revision history for this message
Tim Gardner (timg-tpi) wrote :

The kernel I'm working on in unstable (git://git.launchpad.net/~ubuntu-kernel/ubuntu/+source/linux/+git/unstable) will eventually become the Yakkety kernel. I've set NR_CPUS to 8K which is the default if you go above 512. I've already had a request to support a platform with over 1K cores. I'll leave it there unless we find through testing that it causes extraordinary RAM consumption or has some other deleterious side effect.

Changed in linux (Ubuntu Yakkety):
assignee: nobody → Tim Gardner (timg-tpi)
Revision history for this message
Danny Auble (da) wrote :

Thanks Tim!

Tim Gardner (timg-tpi)
Changed in linux (Ubuntu Yakkety):
status: Confirmed → Fix Committed
Luis Henriques (henrix)
Changed in linux (Ubuntu Xenial):
status: New → Fix Committed
Revision history for this message
Luis Henriques (henrix) wrote :

This bug is awaiting verification that the kernel in -proposed solves the problem. Please test the kernel and update this bug with the results. If the problem is solved, change the tag 'verification-needed-xenial' to 'verification-done-xenial'. If the problem still exists, change the tag 'verification-needed-xenial' to 'verification-failed-xenial'.

If verification is not done by 5 working days from today, this fix will be dropped from the source code, and this bug will be closed.

See https://wiki.ubuntu.com/Testing/EnableProposed for documentation how to enable and use -proposed. Thank you!

tags: added: verification-needed-xenial
Revision history for this message
Tim Gardner (timg-tpi) wrote :

grep NR_CPU config-4.4.0-57-generic
CONFIG_NR_CPUS=512

tags: added: verification-done-xenial
removed: verification-needed-xenial
Revision history for this message
Launchpad Janitor (janitor) wrote :
Download full text (17.0 KiB)

This bug was fixed in the package linux - 4.4.0-57.78

---------------
linux (4.4.0-57.78) xenial; urgency=low

  * Release Tracking Bug
    - LP: #1648867

  * Miscellaneous Ubuntu changes
    - SAUCE: Do not build the xr-usb-serial driver for s390

linux (4.4.0-56.77) xenial; urgency=low

  * Release Tracking Bug
    - LP: #1648867

  * Release Tracking Bug
    - LP: #1648579

  * CONFIG_NR_CPUS=256 is too low (LP: #1579205)
    - [Config] Increase the NR_CPUS to 512 for amd64 to support systems with a
      large number of cores.

  * NVMe drives in Amazon AWS instance fail to initialize (LP: #1648449)
    - SAUCE: (no-up) NVMe: only setup MSIX once

linux (4.4.0-55.76) xenial; urgency=low

  [ Luis Henriques ]

  * Release Tracking Bug
    - LP: #1648503

  * NVMe driver accidentally reverted to use GSI instead of MSIX (LP: #1647887)
    - (fix) NVMe: restore code to always use MSI/MSI-x interrupts

linux (4.4.0-54.75) xenial; urgency=low

  [ Luis Henriques ]

  * Release Tracking Bug
    - LP: #1648017

  * Update hio driver to 2.1.0.28 (LP: #1646643)
    - SAUCE: hio: update to Huawei ES3000_V2 (2.1.0.28)

  * linux: Enable live patching for all supported architectures (LP: #1633577)
    - [Config] CONFIG_LIVEPATCH=y for s390x

  * Botched backport breaks level triggered EOIs in QEMU guests with --machine
    kernel_irqchip=split (LP: #1644394)
    - kvm/irqchip: kvm_arch_irq_routing_update renaming split

  * Xenial update to v4.4.35 stable release (LP: #1645453)
    - x86/cpu/AMD: Fix cpu_llc_id for AMD Fam17h systems
    - KVM: x86: fix missed SRCU usage in kvm_lapic_set_vapic_addr
    - KVM: Disable irq while unregistering user notifier
    - fuse: fix fuse_write_end() if zero bytes were copied
    - mfd: intel-lpss: Do not put device in reset state on suspend
    - can: bcm: fix warning in bcm_connect/proc_register
    - i2c: mux: fix up dependencies
    - kbuild: add -fno-PIE
    - scripts/has-stack-protector: add -fno-PIE
    - x86/kexec: add -fno-PIE
    - kbuild: Steal gcc's pie from the very beginning
    - ext4: sanity check the block and cluster size at mount time
    - crypto: caam - do not register AES-XTS mode on LP units
    - drm/amdgpu: Attach exclusive fence to prime exported bo's. (v5)
    - clk: mmp: pxa910: fix return value check in pxa910_clk_init()
    - clk: mmp: pxa168: fix return value check in pxa168_clk_init()
    - clk: mmp: mmp2: fix return value check in mmp2_clk_init()
    - rtc: omap: Fix selecting external osc
    - iwlwifi: pcie: fix SPLC structure parsing
    - mfd: core: Fix device reference leak in mfd_clone_cell
    - uwb: fix device reference leaks
    - PM / sleep: fix device reference leak in test_suspend
    - PM / sleep: don't suspend parent when async child suspend_{noirq, late}
      fails
    - IB/mlx4: Check gid_index return value
    - IB/mlx4: Fix create CQ error flow
    - IB/mlx5: Use cache line size to select CQE stride
    - IB/mlx5: Fix fatal error dispatching
    - IB/core: Avoid unsigned int overflow in sg_alloc_table
    - IB/uverbs: Fix leak of XRC target QPs
    - IB/cm: Mark stale CM id's whenever the mad agent was unregistered
    - netfilter: nft_dynset: fix element timeou...

Changed in linux (Ubuntu Xenial):
status: Fix Committed → Fix Released
status: Fix Committed → Fix Released
Revision history for this message
Luis Henriques (henrix) wrote :

This bug is awaiting verification that the kernel in -proposed solves the problem. Please test the kernel and update this bug with the results. If the problem is solved, change the tag 'verification-needed-yakkety' to 'verification-done-yakkety'. If the problem still exists, change the tag 'verification-needed-yakkety' to 'verification-failed-yakkety'.

If verification is not done by 5 working days from today, this fix will be dropped from the source code, and this bug will be closed.

See https://wiki.ubuntu.com/Testing/EnableProposed for documentation how to enable and use -proposed. Thank you!

tags: added: verification-needed-yakkety
Tim Gardner (timg-tpi)
tags: added: verification-done-yakkety
removed: verification-needed-yakkety
Revision history for this message
Launchpad Janitor (janitor) wrote :
Download full text (3.2 KiB)

This bug was fixed in the package linux - 4.8.0-34.36

---------------
linux (4.8.0-34.36) yakkety; urgency=low

  [ Luis Henriques ]

  * Release Tracking Bug
    - LP: #1651800

  * Miscellaneous Ubuntu changes
    - SAUCE: Do not build the xr-usb-serial driver for s390

linux (4.8.0-33.35) yakkety; urgency=low

  [ Thadeu Lima de Souza Cascardo ]

  * Release Tracking Bug
    - LP: #1651721

  [ Luis Henriques ]

  * crypto : tolerate new crypto hardware for z Systems (LP: #1644557)
    - s390/zcrypt: Introduce CEX6 toleration

  * Several new Asus laptops are missing touchpad support (LP: #1650895)
    - HID: asus: Add i2c touchpad support

  * Acer, Inc ID 5986:055a is useless after 14.04.2 installed. (LP: #1433906)
    - uvcvideo: uvc_scan_fallback() for webcams with broken chain

  * cdc_ether fills kernel log (LP: #1626371)
    - cdc_ether: Fix handling connection notification

  * Kernel Fixes to get TCMU File Backed Optical to work (LP: #1646204)
    - SAUCE: target/user: Fix use-after-free of tcmu_cmds if they are expired

  * CVE-2016-9756
    - KVM: x86: drop error recovery in em_jmp_far and em_ret_far

  * On boot excessive number of kworker threads are running (LP: #1649905)
    - slub: move synchronize_sched out of slab_mutex on shrink

  * Ethernet not work after upgrade from kernel 3.19 to 4.4 [10ec:8168]
    (LP: #1648279)
    - ACPI / blacklist: Make Dell Latitude 3350 ethernet work

  * Ubuntu 16.10 netboot install fails with "Oops: Exception in kernel mode,
    sig: 5 [#1] " (lpfc) (LP: #1648873)
    - scsi: lpfc: fix oops/BUG in lpfc_sli_ringtxcmpl_put()

  * CVE-2016-9793
    - net: avoid signed overflows for SO_{SND|RCV}BUFFORCE

  * [Hyper-V] Kernel panic not functional on 32bit Ubuntu 14.10, 15.04, and
    15.10 (LP: #1400319)
    - Drivers: hv: avoid vfree() on crash

  * d-i is missing usb support for platforms that use the xhci-platform driver
    (LP: #1625222)
    - d-i initrd needs additional usb modules to support the merlin platform

  * overlayfs no longer supports nested overlayfs mounts, but there is a fix
    upstream (LP: #1647007)
    - ovl: fix d_real() for stacked fs

  * Yakkety: arm64: CONFIG_ARM64_ERRATUM_845719 isn't enabled (LP: #1647793)
    - [Config] CONFIG_ARM64_ERRATUM_845719=y

  * Ubuntu16.10 - EEH on BELL3 adapter fails to recover (serial/tty)
    (LP: #1646857)
    - serial: 8250_pci: Detach low-level driver during PCI error recovery

  * Driver for Exar USB UART (LP: #1645591)
    - SAUCE: xr-usb-serial: Driver for Exar USB serial ports
    - SAUCE: xr-usb-serial: interface for switching modes
    - SAUCE: cdc-acm: Exclude Exar USB serial ports

  * [Bug] (Purley) x86/hpet: Reduce HPET counter read contention (LP: #1645928)
    - x86/hpet: Reduce HPET counter read contention

  * Need Alps upstream their new touchpad driver (LP: #1571530)
    - Input: ALPS - add touchstick support for SS5 hardware
    - Input: ALPS - handle 0-pressure 1F events
    - Input: ALPS - allow touchsticks to report pressure
    - Input: ALPS - set DualPoint flag for 74 03 28 devices

  * CONFIG_NR_CPUS=256 is too low (LP: #1579205)
    - [Config] Increase the NR_CPUS to 512 for amd64 to support systems with a...

Read more...

Changed in linux (Ubuntu Yakkety):
status: Fix Committed → Fix Released
Revision history for this message
Adam Conrad (adconrad) wrote : Update Released

The verification of the Stable Release Update for linux-aws has completed successfully and the package has now been released to -updates. Subsequently, the Ubuntu Stable Release Updates Team is being unsubscribed and will not receive messages about this bug report. In the event that you encounter a regression using the package from -updates please report a new bug using ubuntu-bug and tag the bug report regression-update so we can easily find any regressions.

Po-Hsu Lin (cypressyew)
Changed in linux (Ubuntu):
status: Fix Committed → Fix Released
To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Duplicates of this bug

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.