CPU hard lockup when turning CPU back online on Bionic P9
Affects | Status | Importance | Assigned to | Milestone | |
---|---|---|---|---|---|
The Ubuntu-power-systems project | Fix Released | High | bugproxy | | |
Bug Description
Found on another Boston Power9 box "dradis".
Steps to reproduce:
1. Check online CPUs
$ cat /sys/devices/
0-159
2. Take one CPU offline via CPU hotplug:
$ echo 0 | sudo tee /sys/devices/
0
3. Check dmesg; you should see:
[ 410.890106] IRQ 174: no longer affine to CPU159
4. Put that CPU back online and check dmesg again:
$ echo 1 | sudo tee /sys/devices/
The system reports a CPU hard lockup:
[ 410.890106] IRQ 174: no longer affine to CPU159
[ 421.168052] Watchdog CPU:128 Hard LOCKUP
[ 421.168054] Modules linked in: joydev input_leds mac_hid idt_89hpesx ipmi_powernv opal_prd ipmi_devintf ibmpowernv ofpart at24 cmdlinepart uio_pdrv_genirq uio powernv_flash mtd ipmi_msghandler vmx_crypto sch_fq_codel ib_iser rdma_cm iw_cm ib_cm ib_core iscsi_tcp libiscsi_tcp libiscsi scsi_transport_
[ 421.168108] CPU: 128 PID: 778 Comm: watchdog/128 Not tainted 4.15.0-48-generic #51-Ubuntu
[ 421.168109] NIP: c000000000d082e8 LR: c00000000016c3b0 CTR: c000000000ac5d80
[ 421.168111] REGS: c00000003f9ffd80 TRAP: 0900 Not tainted (4.15.0-48-generic)
[ 421.168112] MSR: 9000000000009033 <SF,HV,
[ 421.168118] CFAR: c00000000016c3ac SOFTE: 0
[ 421.168142] NIP [c000000000d082e8] _raw_spin_
[ 421.168146] LR [c00000000016c3b0] update_
[ 421.168147] Call Trace:
[ 421.168150] [c000200e55743af0] [c00000000171dd78] __per_cpu_
[ 421.168154] [c000200e55743b20] [c00000000016c2b0] update_
[ 421.168156] [c000200e55743bb0] [c00000000016c7bc] dequeue_
[ 421.168159] [c000200e55743bf0] [c00000000014e9b0] deactivate_
[ 421.168161] [c000200e55743c70] [c000000000d0187c] __schedule+
[ 421.168164] [c000200e55743d40] [c000000000d01ff0] schedule+0x40/0xc0
[ 421.168167] [c000200e55743d60] [c000000000144bd4] smpboot_
[ 421.168169] [c000200e55743dc0] [c00000000013e7e8] kthread+0x1a8/0x1b0
[ 421.168172] [c000200e55743e30] [c00000000000b658] ret_from_
[ 421.168173] Instruction dump:
[ 421.168175] 7c0803a6 4bffff98 3c4c009e 38423140 7c0802a6 60000000 fbe1fff8 f821ffd1
[ 421.168179] 7c7f1b78 39400000 994d028d 814d0008 <7d201829> 2c090000 40c20010 7d40192d
But the CPU is actually back online:
$ cat /sys/devices/
0-159
$ cat /sys/devices/
1
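To check a captured log for the symptom described above, a small script can scan dmesg output for the watchdog's hard-lockup line. This is a minimal sketch and not part of the original report: the function name `find_hard_lockups` is hypothetical, and the pattern assumes the message format shown in the trace above (`Watchdog CPU:<n> Hard LOCKUP`) as printed by this 4.15 kernel; other kernel versions may word the message differently.

```python
import re

# Matches kernel log lines such as:
#   [  421.168052] Watchdog CPU:128 Hard LOCKUP
# Assumption: the message format is the one seen in this report's trace.
LOCKUP_RE = re.compile(r"\[\s*(\d+\.\d+)\]\s+Watchdog CPU:(\d+) Hard LOCKUP")


def find_hard_lockups(dmesg_text):
    """Return a list of (timestamp_seconds, cpu) for each hard-lockup line."""
    return [(float(ts), int(cpu)) for ts, cpu in LOCKUP_RE.findall(dmesg_text)]


if __name__ == "__main__":
    import sys

    # Read the kernel log from standard input and report each match.
    for ts, cpu in find_hard_lockups(sys.stdin.read()):
        print(f"hard lockup reported by watchdog on CPU {cpu} at {ts:.6f}s")
```

One possible use is to pipe the kernel log into it after step 4, for example `dmesg | python3 find_lockups.py` (the script name is hypothetical). An empty result after the CPU comes back online would suggest the lockup did not occur on that run.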
ProblemType: Bug
DistroRelease: Ubuntu 18.04
Package: linux-image-
ProcVersionSign
Uname: Linux 4.15.0-48-generic ppc64le
AlsaDevices:
total 0
crw-rw---- 1 root audio 116, 1 May 2 08:06 seq
crw-rw---- 1 root audio 116, 33 May 2 08:06 timer
AplayDevices: Error: [Errno 2] No such file or directory: 'aplay': 'aplay'
ApportVersion: 2.20.9-0ubuntu7.6
Architecture: ppc64el
ArecordDevices: Error: [Errno 2] No such file or directory: 'arecord': 'arecord'
AudioDevicesInUse: Error: command ['fuser', '-v', '/dev/snd/seq', '/dev/snd/timer'] failed with exit code 1:
Date: Thu May 2 08:16:36 2019
IwConfig: Error: [Errno 2] No such file or directory: 'iwconfig': 'iwconfig'
PciMultimedia:
ProcFB: 0 astdrmfb
ProcKernelCmdLine: root=UUID=
ProcLoadAvg: 0.02 0.42 0.37 1/1332 5365
ProcLocks:
1: FLOCK ADVISORY WRITE 4460 00:17:336 0 EOF
2: POSIX ADVISORY WRITE 3994 00:17:604 0 EOF
3: FLOCK ADVISORY WRITE 3913 00:17:586 0 EOF
4: POSIX ADVISORY WRITE 3984 00:17:620 0 EOF
5: POSIX ADVISORY WRITE 1816 00:17:356 0 EOF
ProcSwaps:
Filename Type Size Used Priority
/swap.img file 8388544 0 -2
ProcVersion: Linux version 4.15.0-48-generic (buildd@
RelatedPackageV
linux-
linux-
linux-firmware 1.173.5
RfKill: Error: [Errno 2] No such file or directory: 'rfkill': 'rfkill'
SourcePackage: linux
UpgradeStatus: No upgrade log present (probably fresh install)
VarLogDump_list: total 0
cpu_cores: Number of cores present = 40
cpu_coreson: Number of cores online = 40
cpu_dscr: DSCR is 16
cpu_freq:
min: 2.862 GHz (cpu 159)
max: 2.862 GHz (cpu 81)
avg: 2.862 GHz
cpu_runmode:
Could not retrieve current diagnostics mode,
No kernel interface to firmware
cpu_smt: SMT=4
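The values above are consistent with the online range seen in step 1: 40 cores online at SMT=4 give 160 logical CPUs, which matches `0-159`. The sketch below checks that arithmetic. It is illustrative only: `parse_cpu_list` is a hypothetical helper, and it assumes the usual kernel CPU-list format of comma-separated numbers and inclusive ranges.

```python
def parse_cpu_list(text):
    """Expand a CPU list string such as "0-159" or "0-3,8,10-11" into a list of ints."""
    cpus = []
    for part in text.strip().split(","):
        if not part:
            continue  # tolerate empty input or stray commas
        if "-" in part:
            lo, hi = part.split("-")
            cpus.extend(range(int(lo), int(hi) + 1))  # ranges are inclusive
        else:
            cpus.append(int(part))
    return cpus


cores_online = 40  # from "Number of cores online = 40"
smt = 4            # from "SMT=4"

online = parse_cpu_list("0-159")  # the online list shown in step 1
assert len(online) == cores_online * smt  # 40 * 4 = 160 logical CPUs
```

The same helper could be used to compare the online list before and after a hotplug cycle, for instance to confirm that exactly one CPU disappears after step 2 and reappears after step 4.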
Changed in ubuntu-power-systems:
importance: | Undecided → High
assignee: | nobody → bugproxy (bugproxy)
tags: | added: architecture-ppc64le bugnameltc-177391 severity-high targetmilestone-inin---
no longer affects: | linux (Ubuntu)
This change was made by a bot.