Comment 0 for bug 1867155

Po-Hsu Lin (cypressyew) wrote : P8 node modoc will reboot when running the sru_misc

Tested with 5 attempts; 4 of them hung around the following test in the net sub-category:
 # selftests: net: reuseport_bpf_cpu

First attempt:
23:21:32 DEBUG| [stdout] ok 2 selftests: net: reuseport_bpf_cpu
23:21:32 DEBUG| [stdout] # selftests: net: reuseport_bpf_numa
23:21:32 DEBUG| [stdout] # ---- IPv4 UDP ----
(hang here)

Second attempt:
10:17:35 DEBUG| [stdout] ok 1 selftests: net: reuseport_bpf
10:17:35 DEBUG| [stdout] # selftests: net: reuseport_bpf_cpu
10:17:35 DEBUG| [stdout] # ---- IPv4 UDP ----
10:17:35 DEBUG| [stdout] # send cpu 0, receive socket 0
(lines skipped)
10:17:35 DEBUG| [stdout] # send cpu 159, receive socket 159
10:17:35 DEBUG| [stdout] # ---- IPv6 TCP ----
(hang here)

Third attempt failed because of test timeout:
12:46:16 DEBUG| [stdout] # [FAIL]
12:46:16 DEBUG| [stdout] # --------------------
12:46:16 DEBUG| [stdout] # running psock_tpacket test
12:46:16 DEBUG| [stdout] # --------------------
13:14:13 INFO | Timer expired (1800 sec.), nuking pid 161853

Fourth attempt:
07:41:51 DEBUG| [stdout] # selftests: net: reuseport_bpf_cpu
07:41:51 DEBUG| [stdout] # ---- IPv4 UDP ----
07:41:51 DEBUG| [stdout] # send cpu 0, receive socket 0
(lines skipped)
07:41:51 DEBUG| [stdout] # send cpu 159, receive socket 159
07:41:51 DEBUG| [stdout] # ---- IPv6 UDP ----
07:41:51 DEBUG| [stdout] # send cpu 0, receive socket 0
07:41:51 DEBUG| [stdout] # send cpu 1, receive socket 1
(lines skipped)
07:41:51 DEBUG| [stdout] # send cpu 157, receive socket 157
07:41:51 DEBUG| [stdout] # send cpu 159, receive socket 159
07:41:51 DEBUG| [stdout] # ---- IPv4 TCP ----
(test hang here)

Fifth attempt:
04:29:17 DEBUG| [stdout] ok 1 selftests: net: reuseport_bpf
04:29:17 DEBUG| [stdout] # selftests: net: reuseport_bpf_cpu
04:29:17 DEBUG| [stdout] # ---- IPv4 UDP ----
04:29:17 DEBUG| [stdout] # send cpu 0, receive socket 0
(lines skipped)
04:29:17 DEBUG| [stdout] # send cpu 159, receive socket 159
04:29:17 DEBUG| [stdout] # ---- IPv6 UDP ----
04:29:17 DEBUG| [stdout] # send cpu 0, receive socket 0
(lines skipped)
04:29:17 DEBUG| [stdout] # send cpu 159, receive socket 159
04:29:17 DEBUG| [stdout] # ---- IPv4 TCP ----
04:29:17 DEBUG| [stdout] # send cpu 0, receive socket 0
(lines skipped)
04:29:17 DEBUG| [stdout] # send cpu 15, receive socket 15
(test hang here)
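
To compare where each attempt stopped, the log excerpts above can be reduced to "last test started, last section header, last line printed". The following is only a rough sketch: it assumes the autotest line format shown in the excerpts (`HH:MM:SS LEVEL| [stdout] ...`), and the function name `last_progress` is made up for illustration.

```python
import re

# Assumed line format, taken from the excerpts in this report:
#   10:17:35 DEBUG| [stdout] # selftests: net: reuseport_bpf_cpu
LINE_RE = re.compile(r"^\d{2}:\d{2}:\d{2} \w+\s*\| \[stdout\] (?P<body>.*)$")
TEST_RE = re.compile(r"^# selftests: (?P<suite>\S+): (?P<name>\S+)$")
SECTION_RE = re.compile(r"^# -+ (?P<section>.+?) -+$")


def last_progress(log_text):
    """Return (test, section, last_line) seen before the log stops."""
    test = section = last = None
    for raw in log_text.splitlines():
        m = LINE_RE.match(raw.strip())
        if not m:
            continue  # not a [stdout] line, e.g. the "Timer expired" message
        body = m.group("body")
        last = body
        t = TEST_RE.match(body)
        if t:
            # A new selftest started; forget the previous test's section.
            test = t.group("name")
            section = None
            continue
        s = SECTION_RE.match(body)
        if s:
            section = s.group("section")
    return test, section, last


if __name__ == "__main__":
    # Lines copied from the second attempt in this report.
    sample = """\
10:17:35 DEBUG| [stdout] ok 1 selftests: net: reuseport_bpf
10:17:35 DEBUG| [stdout] # selftests: net: reuseport_bpf_cpu
10:17:35 DEBUG| [stdout] # ---- IPv4 UDP ----
10:17:35 DEBUG| [stdout] # send cpu 0, receive socket 0
10:17:35 DEBUG| [stdout] # send cpu 159, receive socket 159
10:17:35 DEBUG| [stdout] # ---- IPv6 TCP ----
"""
    print(last_progress(sample))
```

Run against the five excerpts, this would make it easier to see that the stopping point moves between sections (and in the first attempt, to reuseport_bpf_numa) rather than landing on one fixed line.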

I tried watching dmesg while this happens, but nothing is logged there; the system just reboots on its own, silently.

Maybe we need to use IPMI to see if there is anything on the console.
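
As a starting point for that, below is a small sketch that builds an `ipmitool` Serial-over-LAN command line. Everything specific here is an assumption: the BMC host name and user are placeholders, the helper name `sol_capture_command` is invented, and it presumes the BMC has SOL enabled over the `lanplus` interface. The `-E` option tells ipmitool to read the password from the `IPMI_PASSWORD` environment variable.

```python
import shlex


def sol_capture_command(bmc_host, user):
    """Build an ipmitool argv that attaches to the Serial-over-LAN console.

    The password is not passed on the command line; -E makes ipmitool read
    it from the IPMI_PASSWORD environment variable.
    """
    return [
        "ipmitool", "-I", "lanplus",
        "-H", bmc_host,
        "-U", user,
        "-E",
        "sol", "activate",
    ]


if __name__ == "__main__":
    # Placeholder host and user; substitute the real BMC details.
    print(shlex.join(sol_capture_command("modoc-bmc.example", "admin")))
```

Running the printed command in a terminal with logging enabled (for example under `script`) before starting the test should keep whatever the console prints just before the reboot.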

ProblemType: Bug
DistroRelease: Ubuntu 19.10
Package: linux-image-5.3.0-42-generic 5.3.0-42.34
ProcVersionSignature: Ubuntu 5.3.0-42.34-generic 5.3.18
Uname: Linux 5.3.0-42-generic ppc64le
AlsaDevices:
 total 0
 crw-rw---- 1 root audio 116, 1 Mar 12 04:33 seq
 crw-rw---- 1 root audio 116, 33 Mar 12 04:33 timer
AplayDevices: Error: [Errno 2] No such file or directory: 'aplay': 'aplay'
ApportVersion: 2.20.11-0ubuntu8.5
Architecture: ppc64el
ArecordDevices: Error: [Errno 2] No such file or directory: 'arecord': 'arecord'
AudioDevicesInUse: Error: command ['fuser', '-v', '/dev/snd/seq', '/dev/snd/timer'] failed with exit code 1:
Date: Thu Mar 12 09:42:24 2020
IwConfig: Error: [Errno 2] No such file or directory: 'iwconfig': 'iwconfig'
Lsusb:
 Bus 004 Device 001: ID 1d6b:0003 Linux Foundation 3.0 root hub
 Bus 003 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub
 Bus 002 Device 001: ID 1d6b:0003 Linux Foundation 3.0 root hub
 Bus 001 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub
PciMultimedia:

ProcEnviron:
 TERM=xterm-256color
 PATH=(custom, no user)
 LANG=C.UTF-8
 SHELL=/bin/bash
ProcFB:

ProcKernelCmdLine: root=UUID=b2a867ce-7813-4785-8861-4e7de2ac39b4 ro console=hvc0
ProcLoadAvg: 0.07 0.02 0.00 1/1461 86637
ProcLocks:
 1: POSIX ADVISORY WRITE 3799 00:18:841 0 EOF
 2: POSIX ADVISORY WRITE 3526 00:18:743 0 EOF
 3: FLOCK ADVISORY WRITE 3720 00:18:837 0 EOF
ProcSwaps:
 Filename Type Size Used Priority
 /swap.img file 8388544 0 -2
ProcVersion: Linux version 5.3.0-42-generic (buildd@bos02-ppc64el-006) (gcc version 9.2.1 20191008 (Ubuntu 9.2.1-9ubuntu2)) #34-Ubuntu SMP Fri Feb 28 05:49:17 UTC 2020
RelatedPackageVersions:
 linux-restricted-modules-5.3.0-42-generic N/A
 linux-backports-modules-5.3.0-42-generic N/A
 linux-firmware 1.183.4
RfKill: Error: [Errno 2] No such file or directory: 'rfkill': 'rfkill'
SourcePackage: linux
UpgradeStatus: No upgrade log present (probably fresh install)
VarLogDump_list: total 0
cpu_cores: Number of cores present = 20
cpu_coreson: Number of cores online = 20
cpu_dscr: DSCR is 0
cpu_freq:
 min: 3.694 GHz (cpu 159)
 max: 3.695 GHz (cpu 1)
 avg: 3.694 GHz
cpu_runmode:
 Could not retrieve current diagnostics mode,
 No kernel interface to firmware
cpu_smt: SMT=8