test_regression_testsuite from ubuntu_qrt_apparmor failed on Focal zVM / B-GCP-5.4

Bug #1876697 reported by Po-Hsu Lin
This bug affects 1 person
Affects                  Status         Importance   Assigned to   Milestone
QA Regression Testing    Invalid        Undecided    Unassigned
ubuntu-kernel-tests      Fix Released   Undecided    Unassigned
linux (Ubuntu)           Fix Released   Undecided    Unassigned

Bug Description

Issue found on zVM "kernel04" with 5.4.0-29.33

 ======================================================================
 FAIL: test_regression_testsuite (__main__.ApparmorTestsuites)
 Run kernel regression tests
 ----------------------------------------------------------------------
 Traceback (most recent call last):
   File "./test-apparmor.py", line 1746, in test_regression_testsuite
     self.assertEqual(expected, rc, result + report)
 AssertionError: Got exit code 2, expected 0

 running aa_exec

 running access
 xfail: ACCESS file rx (r)
 xfail: ACCESS file rwx (r)
 xfail: ACCESS file r (wx)
 xfail: ACCESS file rx (wx)
 xfail: ACCESS file rwx (wx)
 xfail: ACCESS dir rwx (r)
 xfail: ACCESS dir r (wx)
 xfail: ACCESS dir rx (wx)
 xfail: ACCESS dir rwx (wx)

 running at_secure

 running introspect

 running capabilities
         (ptrace)
         (sethostname)
         (setdomainname)
         (setpriority)
         (setscheduler)
 Error: syscall_setscheduler failed. Test 'syscall_setscheduler -- unconfined' was expected to 'pass'. Reason for failure 'FAIL: Can't set SCHED_RR: Operation not permitted'
 Error: syscall_setscheduler failed. Test 'syscall_setscheduler -- all caps' was expected to 'pass'. Reason for failure 'FAIL: Can't set SCHED_RR: Operation not permitted'
   preparing apparmor_2.13.3-7ubuntu5.dsc... done
 Error: syscall_setscheduler failed. Test 'syscall_setscheduler -- capability sys_nice' was expected to 'pass'. Reason for failure 'FAIL: Can't set SCHED_RR: Operation not permitted'
 Error: changehat_wrapper failed. Test 'syscall_setscheduler changehat -- all caps' was expected to 'pass'. Reason for failure 'FAIL: Can't set SCHED_RR: Operation not permitted'
 Error: changehat_wrapper failed. Test 'syscall_setscheduler changehat -- capability sys_nice' was expected to 'pass'. Reason for failure 'FAIL: Can't set SCHED_RR: Operation not permitted'
         (reboot)
         (chroot)
         (mlockall)
         (net_raw)

Please find attachment for the complete test log.

Po-Hsu Lin (cypressyew) wrote :

The qrt test suite has not yet been tested on other s390x instances (LPAR / zKVM).

tags: added: s390x sru-20200427 ubuntu-qrt-apparmor
Po-Hsu Lin (cypressyew) wrote :

The qrt test suite passed on ARM64 / AMD64 / PowerPC (P8) with this kernel.

P9 has not been tested yet either.

Ubuntu Kernel Bot (ubuntu-kernel-bot) wrote : Missing required logs.

This bug is missing log files that will aid in diagnosing the problem. While running an Ubuntu kernel (not a mainline or third-party kernel) please enter the following command in a terminal window:

apport-collect 1876697

and then change the status of the bug to 'Confirmed'.

If, due to the nature of the issue you have encountered, you are unable to run this command, please add a comment stating that fact and change the bug status to 'Confirmed'.

This change has been made by an automated script, maintained by the Ubuntu Kernel Team.

Changed in linux (Ubuntu):
status: New → Incomplete
tags: added: focal
Steve Beattie (sbeattie) wrote : Re: test_regression_testsuite from ubuntu_qrt_apparmor failed on Focal zVM

I have seen a similar failure with that specific test when running the tests under virtualbox on x86, though I have not tried it in several years.

If this is the expected behavior going forward on s390x, we can address it in qa-regression-testing.

Thanks.

Thadeu Lima de Souza Cascardo (cascardo) wrote :

My guess is that the test might be hitting the following code from kernel/sched/core.c:

#ifdef CONFIG_RT_GROUP_SCHED
                /*
                 * Do not allow realtime tasks into groups that have no runtime
                 * assigned.
                 */
                if (rt_bandwidth_enabled() && rt_policy(policy) &&
                                task_group(p)->rt_bandwidth.rt_runtime == 0 &&
                                !task_group_is_autogroup(task_group(p))) {
                        retval = -EPERM;
                        goto unlock;
                }
#endif

Which is not really about the capabilities. That is just a guess.
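
One way to check this guess from user space is to attempt the same policy change while also confirming that CAP_SYS_NICE is in the effective capability set. The following is only a minimal sketch, not code from the test suite: the helper names `has_cap_sys_nice` and `probe_sched_rr` are hypothetical, and it assumes a Linux system where `/proc/<pid>/status` carries a `CapEff:` line. An EPERM result while the capability is present would be consistent with the CONFIG_RT_GROUP_SCHED check above rather than with a capability problem.

```python
import errno
import os

CAP_SYS_NICE = 23  # bit number from linux/capability.h


def has_cap_sys_nice(status_text):
    """Return True if the CapEff mask in /proc/<pid>/status text has CAP_SYS_NICE."""
    for line in status_text.splitlines():
        if line.startswith("CapEff:"):
            mask = int(line.split()[1], 16)
            return bool((mask >> CAP_SYS_NICE) & 1)
    return False


def probe_sched_rr(priority=1):
    """Try to switch this process to SCHED_RR, restoring the old policy on success.

    Returns None on success, otherwise the errno name (for example "EPERM").
    """
    if not hasattr(os, "sched_setscheduler"):
        return "UNSUPPORTED"  # platform without the Linux scheduler calls
    try:
        old_policy = os.sched_getscheduler(0)
        old_param = os.sched_getparam(0)
        os.sched_setscheduler(0, os.SCHED_RR, os.sched_param(priority))
        # Put the previous policy back so the probe has no lasting effect.
        os.sched_setscheduler(0, old_policy, old_param)
    except OSError as exc:
        return errno.errorcode.get(exc.errno, str(exc.errno))
    return None
```

Feeding the contents of `/proc/self/status` to `has_cap_sys_nice` and then calling `probe_sched_rr()` as root on an affected machine should show whether the failure happens even with the capability present.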

Po-Hsu Lin (cypressyew) wrote :

This issue also occurs on older Focal kernels. The first failure was found on zVM with 5.4.0-24.28; zKVM passed and LPAR was not tested.

With this kernel (5.4.0-29.33), it can be found on LPAR as well (zKVM not tested),
but not with other kernels.

Stefan Bader (smb) wrote :

Reading about SCHED_RR: we had some reports in Focal about problems setting real-time priority for processes, which appeared to be caused by enabling CONFIG_RT_GROUP_SCHED (a request from docker). That happened before release and after 5.4.0-21 (see bug #1873315). And Thadeu's comment #6 points to code which is active only when that config is turned on.

So this is something that we released with, and the question is whether it should be "fixed" by turning this off again or by properly handling the special cases.

Steve Beattie (sbeattie) wrote :

All that about CONFIG_RT_GROUP_SCHED seems sensible, but then I am confused as to why it is only showing up in s390x environments.

The test is trying to exercise CAP_SYS_NICE, and doing so by calling

  setpriority(PRIO_PROCESS, 0, -5)

Does the test need to be put into a cgroup with rt allocations if CONFIG_RT_GROUP_SCHED is set?
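
To make the cgroup question concrete: under cgroup v1, a kernel built with CONFIG_RT_GROUP_SCHED exposes a `cpu.rt_runtime_us` file in each cpu cgroup, and a value of 0 there corresponds to the `rt_runtime == 0` condition in the kernel snippet quoted earlier in this thread. The sketch below only reads that value for a given process. It assumes cgroup v1 with the cpu controller mounted at `/sys/fs/cgroup/cpu`, and the helper names are hypothetical.

```python
import os


def cpu_cgroup_path(proc_cgroup_text):
    """Return the cgroup path of the v1 'cpu' controller, or None if absent.

    Each line of /proc/<pid>/cgroup is 'hierarchy-ID:controller-list:path'.
    """
    for line in proc_cgroup_text.splitlines():
        parts = line.split(":", 2)
        if len(parts) == 3 and "cpu" in parts[1].split(","):
            return parts[2]
    return None


def read_rt_runtime_us(cgroup_path, mount="/sys/fs/cgroup/cpu"):
    """Read cpu.rt_runtime_us for a cgroup; None if the file is missing.

    The mount point is an assumption; some systems use
    /sys/fs/cgroup/cpu,cpuacct instead.
    """
    path = os.path.join(mount, cgroup_path.lstrip("/"), "cpu.rt_runtime_us")
    try:
        with open(path) as f:
            return int(f.read().strip())
    except (OSError, ValueError):
        return None
```

If the value read for the test's cgroup is 0, that would match the EPERM path; a missing file suggests the kernel was built without CONFIG_RT_GROUP_SCHED or that the assumed mount point is wrong.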

Seth Arnold (seth-arnold) wrote :

This feels related to https://bugs.launchpad.net/ubuntu/+source/rtkit/+bug/1875665 which was filed by amd64 users.

Po-Hsu Lin (cypressyew)
tags: added: 5.4
Po-Hsu Lin (cypressyew) wrote :

This issue can be found on B-GCP-5.4
5.4.0-1011.11~18.04.1

tags: added: kqa
tags: added: kqa-blocker
removed: kqa sru-20200427
tags: added: sru-20200518
summary: - test_regression_testsuite from ubuntu_qrt_apparmor failed on Focal zVM
+ test_regression_testsuite from ubuntu_qrt_apparmor failed on Focal zVM /
+ B-GCP-5.4
Thadeu Lima de Souza Cascardo (cascardo) wrote :

So, B-GCP-5.4 as well as F-GCP 5.4.0-1011 have RT_GROUP_SCHED on, and it has been turned off in later versions, such as F-GCP 5.4.0-1012 and B-GCP-5.4 5.4.0-1016.

@cypressyew, can you confirm that it doesn't happen on such kernels?

Thanks.
Cascardo.

Po-Hsu Lin (cypressyew) wrote :

Hello Thadeu,

The ubuntu_qrt_apparmor test has passed on B-GCP-5.4 (5.4.0-1021.21~18.04.1) and F-GCP 5.4.0-1016.16.

Thanks.
Sam

Po-Hsu Lin (cypressyew) wrote :

Also, it has passed on all arches, including zVM / LPAR, with Focal 5.4.0-44.48.

On Focal kernel config:
# CONFIG_RT_GROUP_SCHED is not set
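
The same check can be scripted for any kernel. This is a rough sketch that parses config text in the usual kconfig format; the helper name `kconfig_state` is hypothetical, and `/boot/config-<release>` is an assumption about where the config file is installed.

```python
import os


def kconfig_state(config_text, option):
    """Return the value of option ('y', 'm', ...), 'n' if it is explicitly
    marked as not set, or None if it does not appear at all."""
    for line in config_text.splitlines():
        line = line.strip()
        if line == f"# {option} is not set":
            return "n"
        if line.startswith(option + "="):
            return line.split("=", 1)[1]
    return None


def running_kernel_config_path():
    """Assumed location of the running kernel's config file."""
    return "/boot/config-" + os.uname().release
```

Calling `kconfig_state(open(running_kernel_config_path()).read(), "CONFIG_RT_GROUP_SCHED")` should return "n" for a config like the one quoted above.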

Commit from Seth:
b7ac4514cc56ec
UBUNTU: [Config] Turn off CONFIG_RT_GROUP_SCHED everywhere

I think we can close this bug now.
Thanks

Changed in linux (Ubuntu):
status: Incomplete → Fix Released
Changed in ubuntu-kernel-tests:
status: New → Fix Released
Changed in qa-regression-testing:
status: New → Invalid