cpuhotplug03 in cpuhotplug from ubuntu_ltp failed on some testing nodes

Bug #1836167 reported by Po-Hsu Lin
This bug affects 3 people
Affects              Status        Importance  Assigned to          Milestone
ubuntu-kernel-tests  Fix Released  Medium      Krzysztof Kozlowski
linux (Ubuntu)       Invalid       Undecided   Unassigned

Bug Description

<<<test_output>>>
incrementing stop
Name: cpuhotplug03
Date: Thu Jul 11 08:31:48 UTC 2019
Desc: Do tasks get scheduled to a newly on-lined CPU?

CPU is 1
sh: echo: I/O error
cpuhotplug03 1 TBROK: CPU1 cannot be offlined
USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
root 4642 0.0 0.0 2020 488 pts/0 R 08:31 0:00 /bin/sh /opt/ltp/testcases/bin/cpuhotplug_do_spin_loop
root 4643 0.0 0.0 2020 480 pts/0 R 08:31 0:00 /bin/sh /opt/ltp/testcases/bin/cpuhotplug_do_spin_loop
root 4644 0.0 0.0 2020 468 pts/0 R 08:31 0:00 /bin/sh /opt/ltp/testcases/bin/cpuhotplug_do_spin_loop
root 4645 0.0 0.0 2020 488 pts/0 R 08:31 0:00 /bin/sh /opt/ltp/testcases/bin/cpuhotplug_do_spin_loop
root 4646 0.0 0.0 2020 472 pts/0 R 08:31 0:00 /bin/sh /opt/ltp/testcases/bin/cpuhotplug_do_spin_loop
root 4647 0.0 0.0 2020 480 pts/0 R 08:31 0:00 /bin/sh /opt/ltp/testcases/bin/cpuhotplug_do_spin_loop
root 4648 0.0 0.0 2020 456 pts/0 R 08:31 0:00 /bin/sh /opt/ltp/testcases/bin/cpuhotplug_do_spin_loop
root 4649 0.0 0.0 2020 508 pts/0 R 08:31 0:00 /bin/sh /opt/ltp/testcases/bin/cpuhotplug_do_spin_loop
root 4650 0.0 0.0 2020 488 pts/0 R 08:31 0:00 /bin/sh /opt/ltp/testcases/bin/cpuhotplug_do_spin_loop
root 4651 0.0 0.0 2020 472 pts/0 R 08:31 0:00 /bin/sh /opt/ltp/testcases/bin/cpuhotplug_do_spin_loop
root 4652 0.0 0.0 2020 484 pts/0 R 08:31 0:00 /bin/sh /opt/ltp/testcases/bin/cpuhotplug_do_spin_loop
root 4653 0.0 0.0 2020 496 pts/0 R 08:31 0:00 /bin/sh /opt/ltp/testcases/bin/cpuhotplug_do_spin_loop
root 4654 0.0 0.0 2020 464 pts/0 R 08:31 0:00 /bin/sh /opt/ltp/testcases/bin/cpuhotplug_do_spin_loop
root 4655 0.0 0.0 2020 492 pts/0 R 08:31 0:00 /bin/sh /opt/ltp/testcases/bin/cpuhotplug_do_spin_loop
root 4656 0.0 0.0 2020 448 pts/0 R 08:31 0:00 /bin/sh /opt/ltp/testcases/bin/cpuhotplug_do_spin_loop
root 4657 0.0 0.0 2020 472 pts/0 R 08:31 0:00 /bin/sh /opt/ltp/testcases/bin/cpuhotplug_do_spin_loop
root 4661 0.0 0.0 7540 648 pts/0 S 08:31 0:00 grep cpuhotplug_do_spin_loop
cpuhotplug03 1 TINFO: Onlining CPU 1
  7 /bin/sh /opt/ltp/testcases/bin/cpuhotplug_do_spin_loop
  1 /bin/sh /opt/ltp/testcases/bin/cpuhotplug_do_spin_loop
  4 /bin/sh /opt/ltp/testcases/bin/cpuhotplug_do_spin_loop
  0 /bin/sh /opt/ltp/testcases/bin/cpuhotplug_do_spin_loop
  3 /bin/sh /opt/ltp/testcases/bin/cpuhotplug_do_spin_loop
  6 /bin/sh /opt/ltp/testcases/bin/cpuhotplug_do_spin_loop
  1 /bin/sh /opt/ltp/testcases/bin/cpuhotplug_do_spin_loop
  5 /bin/sh /opt/ltp/testcases/bin/cpuhotplug_do_spin_loop
  3 /bin/sh /opt/ltp/testcases/bin/cpuhotplug_do_spin_loop
  7 /bin/sh /opt/ltp/testcases/bin/cpuhotplug_do_spin_loop
  6 /bin/sh /opt/ltp/testcases/bin/cpuhotplug_do_spin_loop
  2 /bin/sh /opt/ltp/testcases/bin/cpuhotplug_do_spin_loop
  4 /bin/sh /opt/ltp/testcases/bin/cpuhotplug_do_spin_loop
  2 /bin/sh /opt/ltp/testcases/bin/cpuhotplug_do_spin_loop
  0 /bin/sh /opt/ltp/testcases/bin/cpuhotplug_do_spin_loop
  5 /bin/sh /opt/ltp/testcases/bin/cpuhotplug_do_spin_loop
cpuhotplug03 1 TPASS: 2 cpuhotplug_do_spin_loop processes found on CPU1
<<<execution_status>>>
initiation_status="ok"
duration=1 termination_type=exited termination_id=2 corefile=no
cutime=1060 cstime=9
<<<test_end>>>

Test passed on ThunderX ARM64, probably a test case issue.
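
The final check in the output above can be approximated by counting how many listing lines start with the CPU under test. This is only a sketch: it assumes the leading number on each `cpuhotplug_do_spin_loop` line is the CPU the process last ran on, and the helper name `count_on_cpu` is hypothetical, not part of LTP.

```shell
# Hypothetical helper: read "<cpu> <command>" lines on stdin and print how
# many of them have <cpu> equal to the CPU number given as $1.
count_on_cpu() {
    cpu="$1"
    # "n + 0" forces a numeric 0 when no line matched.
    awk -v cpu="$cpu" '$1 == cpu { n++ } END { print n + 0 }'
}
```

Fed the 16 listing lines from the log, `count_on_cpu 1` would report the two processes that the TPASS line mentions.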

Steps to run this test:
  git clone --depth=1 https://github.com/linux-test-project/ltp.git
  cd ltp; make autotools; ./configure; make; sudo make install
  echo "cpuhotplug03 cpuhotplug03.sh -c 1 -l 1" > /tmp/jobs
  sudo /opt/ltp/runltp -f /tmp/jobs
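
The offline step that produces the TBROK can also be tried by hand, without the LTP harness. A minimal sketch, assuming the standard sysfs hotplug control file `/sys/devices/system/cpu/cpuN/online`; the function name `try_offline_cpu` and the `SYSFS_CPU_ROOT` override are hypothetical and exist only so the sketch can be pointed at a scratch directory.

```shell
# Hypothetical helper: offline CPU $1 through sysfs, then bring it back online.
# Run as root on a real system. SYSFS_CPU_ROOT is an illustrative override,
# not a kernel or LTP setting.
try_offline_cpu() {
    cpu="$1"
    root="${SYSFS_CPU_ROOT:-/sys/devices/system/cpu}"
    ctl="$root/cpu$cpu/online"

    # Without a writable control file the CPU cannot be hotplugged from here.
    if [ ! -w "$ctl" ]; then
        echo "cpu$cpu: no writable online file"
        return 2
    fi

    # The write itself fails (for example with an I/O error) when the
    # platform refuses to offline the CPU.
    if ! echo 0 > "$ctl" 2>/dev/null; then
        echo "cpu$cpu: cannot be offlined"
        return 1
    fi

    # Restore the CPU; note this leaves it online even if it started offline.
    echo 1 > "$ctl"
    echo "cpu$cpu: offline/online succeeded"
}
```

Calling `try_offline_cpu 1` as root on an affected node should take the "cannot be offlined" branch if the failure is the same one the test reports.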

ProblemType: Bug
DistroRelease: Ubuntu 18.10
Package: linux-image-4.18.0-25-generic 4.18.0-25.26
ProcVersionSignature: Ubuntu 4.18.0-25.26-generic 4.18.20
Uname: Linux 4.18.0-25-generic aarch64
AlsaDevices:
 total 0
 crw-rw---- 1 root audio 116, 1 Jul 11 06:57 seq
 crw-rw---- 1 root audio 116, 33 Jul 11 06:57 timer
AplayDevices: Error: [Errno 2] No such file or directory: 'aplay': 'aplay'
ApportVersion: 2.20.10-0ubuntu13.4
Architecture: arm64
ArecordDevices: Error: [Errno 2] No such file or directory: 'arecord': 'arecord'
AudioDevicesInUse: Error: command ['fuser', '-v', '/dev/snd/seq', '/dev/snd/timer'] failed with exit code 1:
Date: Thu Jul 11 08:22:51 2019
IwConfig: Error: [Errno 2] No such file or directory: 'iwconfig': 'iwconfig'
Lsusb: Error: command ['lsusb'] failed with exit code 1:
PciMultimedia:

ProcFB:

ProcKernelCmdLine: console=ttyS0,9600n8r ro
RelatedPackageVersions:
 linux-restricted-modules-4.18.0-25-generic N/A
 linux-backports-modules-4.18.0-25-generic N/A
 linux-firmware 1.175.6
RfKill: Error: [Errno 2] No such file or directory: 'rfkill': 'rfkill'
SourcePackage: linux
UpgradeStatus: No upgrade log present (probably fresh install)

Po-Hsu Lin (cypressyew) wrote :
tags: added: sru-20190701 ubuntu-ltp
Ubuntu Kernel Bot (ubuntu-kernel-bot) wrote : Status changed to Confirmed

This change was made by a bot.

Changed in linux (Ubuntu):
status: New → Confirmed
Changed in linux (Ubuntu Cosmic):
status: New → Confirmed
Po-Hsu Lin (cypressyew) wrote : Re: cpuhotplug03 in cpuhotplug from ubuntu_ltp failed on C Moonshot ARM64

Note that on ThunderX ARM64 sometimes it will fail with:
    cpuhotplug03 1 TFAIL: No cpuhotplug_do_spin_loop processes found on CPU1

Po-Hsu Lin (cypressyew) wrote :

Same failure:
    TFAIL: No cpuhotplug_do_spin_loop processes found on CPU1

Found on amd64 node naumann with Disco kernel.

Po-Hsu Lin (cypressyew) wrote :

This issue can also be seen on the D-AWS bare-metal instance c5.metal.

tags: added: sru-20191111
no longer affects: linux (Ubuntu Cosmic)
summary: - cpuhotplug03 in cpuhotplug from ubuntu_ltp failed on C Moonshot ARM64
+ cpuhotplug03 in cpuhotplug from ubuntu_ltp failed on some testing nodes
tags: added: amd64
Po-Hsu Lin (cypressyew) wrote :

Found on E-AWS c5.metal (5.3.0-1009.10)

tags: added: 1202
tags: added: sru-20191202
removed: 1202
Po-Hsu Lin (cypressyew) wrote :

Found on Eoan (5.3.0-25.27) with P8 node modoc.

Po-Hsu Lin (cypressyew)
tags: added: sru-20200106
Po-Hsu Lin (cypressyew) wrote :

Found on Eoan (5.3.0-48.41) with ARM64 "appleton"

Steve Langasek (vorlon)
Changed in linux (Ubuntu Disco):
status: New → Won't Fix
Po-Hsu Lin (cypressyew)
tags: added: sru-20200831
Po-Hsu Lin (cypressyew)
tags: added: sru-20200921
Po-Hsu Lin (cypressyew) wrote :

On F-oem-5.6 with node glameow:
cpuhotplug03 1 TFAIL: No cpuhotplug_do_spin_loop processes found on CPU1

tags: added: 5.6 focal oem
Kelsey Steele (kelsey-steele) wrote :

spotted on Focal aws : 5.4.0-1026.26 : amd64

cpuhotplug03 1 TFAIL: No cpuhotplug_do_spin_loop processes found on CPU1

tags: added: 5.4 aws
Kelsey Steele (kelsey-steele) wrote :

Spotted on Focal/azure : 5.4.0-1029.29 : amd64

tags: added: azure
Kelsey Steele (kelsey-steele) wrote :

failed on Bionic/azure-4.15 : 4.15.0-1097.107 : amd64

tags: added: 4.15 bionic
Sean Feole (sfeole)
no longer affects: linux (Ubuntu Disco)
tags: added: affects.dgx2 sru-20201109
Po-Hsu Lin (cypressyew) wrote :

Spotted on B-5.4 P9 node baltar

tags: added: sru-20210104
tags: added: sru-20210315
tags: added: hwe
tags: added: 5.8 groovy
tags: added: sru-20210412
Krzysztof Kozlowski (krzk) wrote :

The offline error:
"sh: echo: I/O error
cpuhotplug03 1 TBROK: CPU1 cannot be offlined"
seems to be reproducible only on Azure, not on AWS. If AWS was affected by this bug, it could be a different issue.

Changed in ubuntu-kernel-tests:
assignee: nobody → Krzysztof Kozlowski (krzk)
Krzysztof Kozlowski (krzk) wrote :

For the Azure failures, I sent a patch upstream:
https://lists.linux.it/pipermail/ltp/2021-June/023084.html

Changed in ubuntu-kernel-tests:
status: New → In Progress
Po-Hsu Lin (cypressyew) wrote :

Found on 5.11.0-1005.5 - intel

Failed on node spitfire, passed on node bavor.

cpuhotplug03 1 TFAIL: No cpuhotplug_do_spin_loop processes found on CPU1

tags: added: 5.11 sru-20210531
Krzysztof Kozlowski (krzk) wrote :

Not reproducible on AWS c5.metal (focal 5.4.0-1045-aws, 5.8.0-1035-aws) so dropping AWS from here. Could be a temporary glitch in the test.

There are two different failures in cpuhotplug tests:
1. Offline of CPU1 fails on Azure, because Hyper-V does not support it. This was also reported as lp:1923191.

2. Offline of CPU1 succeeds, but the test later fails with:
utils:0153| [stdout] 49 /bin/sh /opt/ltp/testcases/bin/cpuhotplug_do_spin_loop
utils:0153| [stdout] 82 /bin/sh /opt/ltp/testcases/bin/cpuhotplug_do_spin_loop
utils:0153| [stdout] 18 /bin/sh /opt/ltp/testcases/bin/cpuhotplug_do_spin_loop
utils:0153| [stdout] cpuhotplug03 1 TFAIL: No cpuhotplug_do_spin_loop processes found on CPU1
utils:0153| [stdout] tag=cpuhotplug03 stime=1623210736 dur=3 exit=exited stat=1 core=no cu=22260 cs=78
This case was moved to lp:1931390.

=====================
IMPORTANT:
If you see a cpuhotplug failure but offline/online of CPU1 succeeds, please go to lp:1931390.
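
For triage, the two failure modes can be told apart by the message in the test log. A rough sketch that matches only the exact strings quoted in this report; the function name `classify_cpuhotplug_log` and its output labels are hypothetical.

```shell
# Hypothetical triage helper: read a cpuhotplug03 log on stdin and print
# which failure mode it shows.
#   offline-failed    -> CPU1 could not be offlined (this bug, lp:1923191)
#   no-tasks-on-cpu1  -> offline/online worked, no tasks landed on CPU1
#                        (tracked in lp:1931390)
classify_cpuhotplug_log() {
    log=$(cat)
    case "$log" in
        *"TBROK: CPU1 cannot be offlined"*)
            echo "offline-failed" ;;
        *"TFAIL: No cpuhotplug_do_spin_loop processes found on CPU1"*)
            echo "no-tasks-on-cpu1" ;;
        *)
            echo "unknown" ;;
    esac
}
```

A saved log could be piped in with `classify_cpuhotplug_log < cpuhotplug03.log`, where the file name is again just an example.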

tags: removed: aws hwe oem
Changed in ubuntu-kernel-tests:
importance: Undecided → Medium
Krzysztof Kozlowski (krzk) wrote :

Fix committed upstream (LTP).

Changed in ubuntu-kernel-tests:
status: In Progress → Fix Released
Changed in linux (Ubuntu):
status: Confirmed → Invalid