acpi_pad consumes 100% of resources

Bug #1639299 reported by Newton Liu
This bug affects 4 people
Affects         Status      Importance  Assigned to  Milestone
Nvidia          New         Undecided   Unassigned
linux (Ubuntu)  Incomplete  Undecided   Unassigned

Bug Description

The acpi_pad kernel threads (acpi_pad/0 through acpi_pad/31 in the top output below) each take up about 100% CPU and slow the system to a crawl. Running 'rmmod acpi_pad' removes the offending module and restores system responsiveness.

  PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
20765 root 20 0 0 0 0 R 100.0 0.0 5:07.99 xhpl
20879 root -2 0 0 0 0 R 100.0 0.0 7:12.40 acpi_pad/5
20887 root -2 0 0 0 0 R 100.0 0.0 6:57.72 acpi_pad/13
20891 root -2 0 0 0 0 R 100.0 0.0 7:05.74 acpi_pad/17
20874 root -2 0 0 0 0 R 100.0 0.0 7:15.16 acpi_pad/0
20875 root -2 0 0 0 0 R 100.0 0.0 7:14.76 acpi_pad/1
20876 root -2 0 0 0 0 R 100.0 0.0 7:13.54 acpi_pad/2
20877 root -2 0 0 0 0 R 100.0 0.0 7:13.54 acpi_pad/3
20880 root -2 0 0 0 0 R 100.0 0.0 7:11.44 acpi_pad/6
20881 root -2 0 0 0 0 R 100.0 0.0 7:11.17 acpi_pad/7
20882 root -2 0 0 0 0 R 100.0 0.0 7:05.42 acpi_pad/8
20883 root -2 0 0 0 0 R 100.0 0.0 7:10.80 acpi_pad/9
20884 root -2 0 0 0 0 R 100.0 0.0 7:09.50 acpi_pad/10
20885 root -2 0 0 0 0 R 100.0 0.0 7:09.66 acpi_pad/11
20888 root -2 0 0 0 0 R 100.0 0.0 7:07.30 acpi_pad/14
20889 root -2 0 0 0 0 R 100.0 0.0 7:07.37 acpi_pad/15
20890 root -2 0 0 0 0 R 100.0 0.0 7:05.50 acpi_pad/16
20892 root -2 0 0 0 0 R 100.0 0.0 7:04.40 acpi_pad/18
20893 root -2 0 0 0 0 R 100.0 0.0 7:04.21 acpi_pad/19
20894 root -2 0 0 0 0 R 100.0 0.0 7:03.70 acpi_pad/20
20895 root -2 0 0 0 0 R 100.0 0.0 7:03.63 acpi_pad/21
20896 root -2 0 0 0 0 R 100.0 0.0 7:01.61 acpi_pad/22
20897 root -2 0 0 0 0 R 100.0 0.0 7:01.66 acpi_pad/23
20898 root -2 0 0 0 0 R 100.0 0.0 7:00.80 acpi_pad/24
20899 root -2 0 0 0 0 R 100.0 0.0 7:00.81 acpi_pad/25
20901 root -2 0 0 0 0 R 100.0 0.0 6:58.79 acpi_pad/26
20902 root -2 0 0 0 0 R 100.0 0.0 6:58.96 acpi_pad/27
20903 root -2 0 0 0 0 R 100.0 0.0 6:57.82 acpi_pad/28
20904 root -2 0 0 0 0 R 100.0 0.0 6:57.83 acpi_pad/29
20906 root -2 0 0 0 0 R 100.0 0.0 6:55.54 acpi_pad/31
20886 root -2 0 0 0 0 R 99.7 0.0 7:08.80 acpi_pad/12
20878 root -2 0 0 0 0 R 98.4 0.0 7:12.20 acpi_pad/4
20905 root -2 0 0 0 0 R 98.4 0.0 6:55.85 acpi_pad/30
 3049 newrelic 20 0 245800 8388 4724 S 22.3 0.0 0:14.74 nrsysmond
22126 root 20 0 19592 3876 2392 R 6.0 0.0 0:00.99 top
 1441 root 39 19 0 0 0 S 3.4 0.0 3:05.47 kipmi0
20720 root 20 0 870276 13080 6208 S 1.6 0.0 0:01.50 collectd
    8 root 20 0 0 0 0 S 0.9 0.0 0:03.19 rcu_sched
   13 root rt 0 0 0 0 S 0.3 0.0 0:00.03 watchdog/0
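
A minimal sketch of how a listing like the one above could be checked automatically, assuming the column layout shown in the header line (PID first, %CPU ninth, COMMAND twelfth). The function name busy_acpi_pad_threads and the 90% threshold are illustrative assumptions, not part of the original report.

```python
def busy_acpi_pad_threads(top_output, min_cpu=90.0):
    """Return (pid, cpu_percent, command) for acpi_pad/N rows at or above min_cpu.

    Assumes the default top column order:
    PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
    """
    hits = []
    for line in top_output.splitlines():
        fields = line.split()
        # Skip the header line, blank lines, and anything that is not a process row.
        if len(fields) < 12 or not fields[0].isdigit():
            continue
        command = fields[11]
        if not command.startswith("acpi_pad/"):
            continue
        cpu = float(fields[8])
        if cpu >= min_cpu:
            hits.append((int(fields[0]), cpu, command))
    return hits


# A few rows copied from the report, used as sample input.
sample = """\
  PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
20765 root 20 0 0 0 0 R 100.0 0.0 5:07.99 xhpl
20879 root -2 0 0 0 0 R 100.0 0.0 7:12.40 acpi_pad/5
20878 root -2 0 0 0 0 R 98.4 0.0 7:12.20 acpi_pad/4
 3049 newrelic 20 0 245800 8388 4724 S 22.3 0.0 0:14.74 nrsysmond
"""

for pid, cpu, command in busy_acpi_pad_threads(sample):
    print(pid, cpu, command)
```

Fed with the text of a batch-mode top run, a non-empty result would suggest the condition described in this report is present.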

This has been seen on the 4.2 and 4.4 kernels. I believe the LINPACK test suite was running in all cases this was seen. However, it occurs pretty infrequently, and I don't know how to reliably recreate the issue. It has only been seen on the DGX-1 Server, not on the DGX Station. I'm not sure if any other systems have seen it.

Another data point which may or may not be relevant is that C-states and P-states are enabled.

We can work around this issue by blacklisting the acpi_pad module, or by passing acpi_pad.disable=1 on the kernel command line. What are the implications of disabling acpi_pad?

Googling "acpi_pad uses up all the resource" returns many hits, all of which suggest simply disabling it.

Tags: cscc
Ivan Hu (ivan.hu) wrote :

The Processor Aggregator Device (PAD) was introduced in ACPI 4.0 as a mechanism for the platform to ask the OS to force processors into a (power-saving) idle state, in order to reduce the platform's power consumption.

The ACPI spec describes it as follows:
The following section describes the definition and operation of the optional Processor Aggregator device. The Processor Aggregator Device provides a control point that enables the platform to perform specific processor configuration and control that applies to all processors in the platform. The Plug and Play ID of the Processor Aggregator Device is ACPI000C.

The acpi_pad driver does the job of forcing processors into idle when the platform requests it via a notification and the ACPI _PUR method.
With the acpi_pad driver disabled, the OS will not put processors into idle when the platform requests it, so the power reduction from those requests is lost.
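
To see whether a given machine exposes a Processor Aggregator Device at all, one option is to look for the ACPI000C ID among the ACPI devices in sysfs. The sketch below assumes ACPI devices are listed under /sys/bus/acpi/devices with names of the form ID:instance; the function name find_pad_devices is hypothetical.

```python
import os


def find_pad_devices(acpi_devices_dir="/sys/bus/acpi/devices"):
    """Return sysfs entries whose name starts with the PAD ID ACPI000C.

    The default directory is an assumption about the sysfs layout; an
    unreadable or missing directory yields an empty list.
    """
    try:
        names = os.listdir(acpi_devices_dir)
    except OSError:
        return []
    return sorted(name for name in names if name.startswith("ACPI000C"))


print(find_pad_devices())
```

An empty list would suggest the platform firmware does not declare the device, in which case acpi_pad would have nothing to bind to.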

Newton Liu (newtonl)
information type: Proprietary → Public
Brad Figg (brad-figg) wrote : Missing required logs.

This bug is missing log files that will aid in diagnosing the problem. From a terminal window please run:

apport-collect 1639299

and then change the status of the bug to 'Confirmed'.

If, due to the nature of the issue you have encountered, you are unable to run this command, please add a comment stating that fact and change the bug status to 'Confirmed'.

This change has been made by an automated script, maintained by the Ubuntu Kernel Team.

Changed in linux (Ubuntu):
status: New → Incomplete
Newton Liu (newtonl)
information type: Public → Private
information type: Private → Private Security
information type: Private Security → Public
JCarsique (julien-carsique) wrote :

Cannot run apport-collect, but confirmed on Ubuntu 16.04.6 LTS with kernel 4.4.0-104-generic.
Issued "rmmod acpi_pad" to get the system back, then blacklisted it in /etc/modprobe.d/blacklist.conf.
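
A rough way to confirm that the workaround above took effect is to check that acpi_pad no longer appears in /proc/modules and that a blacklist line is present in the modprobe configuration. The sketch below works on file contents passed in as text; the function names and sample strings are illustrative assumptions. Note that a blacklist entry only stops automatic loading; it does not unload a module that is already loaded.

```python
def _norm(name):
    # modprobe treats '-' and '_' in module names as interchangeable.
    return name.replace("-", "_")


def module_loaded(proc_modules_text, name="acpi_pad"):
    """True if `name` is the first field of a line in /proc/modules text."""
    for line in proc_modules_text.splitlines():
        fields = line.split()
        if fields and fields[0] == _norm(name):
            return True
    return False


def module_blacklisted(conf_text, name="acpi_pad"):
    """True if the modprobe.d text has an uncommented `blacklist <name>` line."""
    for line in conf_text.splitlines():
        # Simplification: drop everything after '#', which covers comment lines.
        fields = line.split("#", 1)[0].split()
        if len(fields) >= 2 and fields[0] == "blacklist" and _norm(fields[1]) == _norm(name):
            return True
    return False


# Sample inputs standing in for /proc/modules and a modprobe.d file.
proc_modules_sample = "ipmi_si 57344 0 - Live 0x0000000000000000\n"
conf_sample = "# workaround for acpi_pad spinning\nblacklist acpi_pad\n"

print("loaded:", module_loaded(proc_modules_sample))
print("blacklisted:", module_blacklisted(conf_sample))
```

On a real system the two inputs would come from reading /proc/modules and the file edited above, /etc/modprobe.d/blacklist.conf.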

Brad Figg (brad-figg)
tags: added: cscc