ubuntu_qrt_apparmor test will hang on azure nodes

Bug #1763002 reported by Po-Hsu Lin
This bug affects 1 person

Affects               Status  Importance  Assigned to  Milestone
ubuntu-kernel-tests   New     High        Unassigned
linux-azure (Ubuntu)  New     Undecided   Unassigned

Bug Description

The ubuntu_qrt_apparmor test hangs on the following Azure testing nodes:
 * DS11_v2_promo
 * DS4
 * D4s_v3
 * E4s_v3
 * GS2
 * GS4-4
 * GS5-8
 * L8s
 * DS14_v2_promo
 * DS13_v2
 * B1s
 * D32s_v3
 * E4s
 * E32-8s
 * E64-16s

This is reproducible on every cycle, so I don't think it is a regression.

From the job output, the test itself appears to have finished:
"test_regression_testsuite" timed out and was terminated. Log here:
https://pastebin.ubuntu.com/p/dSNprxDwdJ/

However, the job stays in this state until the tester cancels it. Checking on the node shows a zombie process:
  $ ps aux | grep Z
  USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
  azure 2984 0.0 0.0 0 0 ? Z Apr10 0:00 [sh] <defunct>

  $ pstree -p -s 2984
  systemd(1)───sshd(1247)───sshd(2559)───sshd(2634)───python3(2635)───sh(2984)

  $ ps aux | grep 2635
  azure 2635 0.0 0.0 124748 14188 ? Ssl Apr10 0:00 python3 ckct/runner --cloud azure ubuntu_qrt_apparmor
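
The manual check above can also be scripted. The following is a minimal Python sketch, not part of the test suite: it assumes the procps form `ps -eo pid,ppid,stat,comm`, and the function names find_zombies and list_system_zombies are hypothetical. It reports each zombie together with its parent PID, which is the process that has to reap it (or be killed).

```python
import subprocess


def find_zombies(ps_output):
    """Parse `ps -eo pid,ppid,stat,comm` text and return a list of
    (pid, ppid, command) tuples for processes whose state starts with Z."""
    zombies = []
    for line in ps_output.splitlines()[1:]:  # skip the header row
        fields = line.strip().split(None, 3)
        if len(fields) < 4:
            continue  # blank or malformed line
        pid, ppid, stat, comm = fields
        if stat.startswith("Z"):
            zombies.append((int(pid), int(ppid), comm))
    return zombies


def list_system_zombies():
    """Run ps on the local node and return its zombies (hypothetical helper)."""
    out = subprocess.run(
        ["ps", "-eo", "pid,ppid,stat,comm"],
        capture_output=True, text=True, check=True,
    ).stdout
    return find_zombies(out)
```

On the node shown above, this would be expected to report PID 2984 with parent 2635 (the python3 runner).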

dmesg: http://paste.ubuntu.com/p/Qdx8KThG9W/

ProblemType: Bug
DistroRelease: Ubuntu 18.04
Package: linux-image-4.15.0-1004-azure 4.15.0-1004.4
ProcVersionSignature: Ubuntu 4.15.0-1004.4-azure 4.15.15
Uname: Linux 4.15.0-1004-azure x86_64
ApportVersion: 2.20.9-0ubuntu4
Architecture: amd64
Date: Wed Apr 11 11:34:24 2018
ProcEnviron:
 TERM=xterm-256color
 PATH=(custom, no user)
 XDG_RUNTIME_DIR=<set>
 LANG=C.UTF-8
 SHELL=/bin/bash
SourcePackage: linux-azure
UpgradeStatus: No upgrade log present (probably fresh install)

Po-Hsu Lin (cypressyew) wrote :

Giving this high priority as it will affect the Jenkins automation.

The tester needs to ssh into the system and kill the parent of the zombie process.
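
Killing the parent works because a zombie entry is only removed once its parent collects the exit status with wait(), or the parent exits and the zombie is re-parented and reaped. As an illustration only, here is a minimal Python sketch of running a command with a timeout and always reaping it afterwards. This is not how ckct/runner is known to be implemented; the function name run_with_timeout is hypothetical, and whether a missing wait() is the actual cause of this bug is an assumption that has not been verified.

```python
import os
import signal
import subprocess


def run_with_timeout(cmd, timeout):
    """Run a shell command; on timeout, kill its whole process group and
    reap it so no <defunct> entry is left behind. Returns the exit status
    (negative signal number if the command was killed)."""
    # start_new_session=True puts the child in its own session and process
    # group, so the group ID equals the child's PID.
    proc = subprocess.Popen(cmd, shell=True, start_new_session=True)
    try:
        return proc.wait(timeout=timeout)
    except subprocess.TimeoutExpired:
        os.killpg(proc.pid, signal.SIGKILL)  # kill the shell and its children
        return proc.wait()  # reap the child so it does not stay a zombie
```

For example, run_with_timeout("sleep 30", 0.5) returns after about half a second with the status of a process killed by SIGKILL, and the sh process does not linger as a zombie.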

Changed in ubuntu-kernel-tests:
importance: Undecided → High
Po-Hsu Lin (cypressyew) wrote :

In the most recent cycle with the Bionic kernel, it is failing on:
* Standard_GS5-8
* Standard_DS13_v2
* Standard_DS13_v2
* Standard_DS4_v2
* Standard_B4ms

Some nodes listed in the bug description have passed in this cycle, so I think it has something to do with the hardware that gets allocated.

These need to be tested manually.
