clock test in monotonic_time will fail on Azure Instances

Bug #1774959 reported by Po-Hsu Lin on 2018-06-04
This bug affects 1 person
Affects                Importance   Assigned to
ubuntu-kernel-tests    Undecided    Unassigned
linux-azure (Ubuntu)   Undecided    Unassigned

Bug Description

Among all of the Azure nodes, this test has been seen to fail on the following instance types:

Basic_A2
Standard_D2_v3
Standard_D2s_v3
Standard_E4s_v3
Standard_E64-16s
Standard_F2s_v2

Steps to reproduce:
 1. git clone --depth=1 git://kernel.ubuntu.com/ubuntu/autotest-client-tests
 2. make -C autotest-client-tests/monotonic_time/src
 3. sudo autotest-client-tests/monotonic_time/src/time_test --duration 300 tsc
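For readers unfamiliar with the failure mode, the idea behind a "worst warp" check can be sketched in a few lines of Python. This is only an illustration and not the time_test implementation: it reads a single clock from one thread, and the names worst_warp and sample_clock are hypothetical. A negative result means a later reading was smaller than an earlier one, i.e. the clock appeared to go backwards.

```python
import time


def worst_warp(samples):
    """Return the most negative step between consecutive samples (0 if none)."""
    worst = 0
    for prev, cur in zip(samples, samples[1:]):
        delta = cur - prev
        if delta < worst:
            worst = delta
    return worst


def sample_clock(count=100_000):
    """Collect `count` back-to-back readings of the monotonic clock, in ns."""
    return [time.monotonic_ns() for _ in range(count)]


if __name__ == "__main__":
    # Hypothetical single-threaded check; it does not pin threads to CPUs
    # or read the TSC directly.
    warp = worst_warp(sample_clock())
    print("PASS" if warp == 0 else f"FAIL: worst-warp={warp}")
```

A single-threaded loop like this is unlikely to reproduce the bug by itself; it only shows what a negative warp value represents.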

Results:
$ sudo autotest-client-tests/monotonic_time/src/time_test --duration 300 tsc

 Running 'make '
 cc -O -std=gnu99 -Wall -c -o time_test.o time_test.c
 cc -O -std=gnu99 -Wall -c -o cpuset.o cpuset.c
 cc -O -std=gnu99 -Wall -c -o threads.o threads.c
 cc -O -std=gnu99 -Wall -c -o logging.o logging.c
 cc -o time_test time_test.o cpuset.o threads.o logging.o -lpthread -lrt
 Running '/home/azure/autotest/client/tmp/monotonic_time/src/time_test --duration 300 clock'
 Time test command exit status: 1
 Exception escaping from test:
 Traceback (most recent call last):
 File "/home/azure/autotest/client/shared/test.py", line 411, in _exec
 _call_test_function(self.execute, *p_args, **p_dargs)
 File "/home/azure/autotest/client/shared/test.py", line 823, in _call_test_function
 return func(*args, **dargs)
 File "/home/azure/autotest/client/shared/test.py", line 291, in execute
 postprocess_profiled_run, args, dargs)
 File "/home/azure/autotest/client/shared/test.py", line 212, in _call_run_once
 self.run_once(*args, **dargs)
 File "/home/azure/autotest/client/tests/monotonic_time/monotonic_time.py", line 53, in run_once
 raise error.TestFail(line)
 TestFail: FAIL: clock-worst-warp=-100

Sean Feole (sfeole) on 2018-11-22
summary: clock test in monotonic_time will fail on Azure Standard_E4s_v3 /
- Standard_E64-16s
+ Standard_E64-16s / AWS m5a.large

Added logs from an AWS instance failure. This test only appears to fail on instance types backed by AMD EPYC 7000 series processors with an all-core turbo clock speed of 2.5 GHz. This may or may not be relevant.

11/22 00:04:50 INFO |monotonic_:0048| Time test command exit status: 1
11/22 00:04:50 ERROR| test:0414| Exception escaping from test:
Traceback (most recent call last):
  File "/home/ubuntu/autotest/client/shared/test.py", line 411, in _exec
    _call_test_function(self.execute, *p_args, **p_dargs)
  File "/home/ubuntu/autotest/client/shared/test.py", line 823, in _call_test_function
    return func(*args, **dargs)
  File "/home/ubuntu/autotest/client/shared/test.py", line 291, in execute
    postprocess_profiled_run, args, dargs)
  File "/home/ubuntu/autotest/client/shared/test.py", line 212, in _call_run_once
    self.run_once(*args, **dargs)
  File "/home/ubuntu/autotest/client/tests/monotonic_time/monotonic_time.py", line 54, in run_once
    raise error.TestFail(line)
TestFail: FAIL: tsc-worst-warp=-44

Sean Feole (sfeole) wrote :

We still see random failures on Azure instances; they do not appear to be isolated to a particular instance.

03:19:44 DEBUG| Running '/home/azure/autotest/client/tmp/monotonic_time/src/time_test --duration 300 tsc_lfence'
03:24:44 INFO | Time test command exit status: 1
03:24:44 ERROR| Exception escaping from test:
Traceback (most recent call last):
  File "/home/azure/autotest/client/shared/test.py", line 411, in _exec
    _call_test_function(self.execute, *p_args, **p_dargs)
  File "/home/azure/autotest/client/shared/test.py", line 823, in _call_test_function
    return func(*args, **dargs)
  File "/home/azure/autotest/client/shared/test.py", line 291, in execute
    postprocess_profiled_run, args, dargs)
  File "/home/azure/autotest/client/shared/test.py", line 212, in _call_run_once
    self.run_once(*args, **dargs)
  File "/home/azure/autotest/client/tests/monotonic_time/monotonic_time.py", line 54, in run_once
    raise error.TestFail(line)
TestFail: FAIL: tsc_lfence-worst-warp=-30
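The failure lines quoted in this report all share the form "FAIL: <mode>-worst-warp=<value>", with clock, tsc and tsc_lfence seen as modes so far. When comparing many runs, a small helper along these lines could pull those pairs out of a saved log. It is a sketch only: parse_warps and WARP_RE are hypothetical names, and the pattern is inferred from the lines quoted in this report rather than from the test source.

```python
import re

# Pattern inferred from lines such as "TestFail: FAIL: tsc_lfence-worst-warp=-30".
WARP_RE = re.compile(r"FAIL: (?P<mode>\w+)-worst-warp=(?P<warp>-?\d+)")


def parse_warps(log_text):
    """Return the distinct (mode, warp) pairs in log_text, in first-seen order."""
    seen = []
    for match in WARP_RE.finditer(log_text):
        pair = (match.group("mode"), int(match.group("warp")))
        if pair not in seen:  # the same failure is often logged more than once
            seen.append(pair)
    return seen
```

Duplicates are dropped because autotest repeats the same failure line in the traceback, the child process log and the job summary.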

Changed in ubuntu-kernel-tests:
status: New → Confirmed
Changed in linux-azure (Ubuntu):
status: New → Confirmed
description: updated
summary: - clock test in monotonic_time will fail on Azure Standard_E4s_v3 /
- Standard_E64-16s / AWS m5a.large
+ clock test in monotonic_time will fail on Azure Instances
Sean Feole (sfeole) wrote :

Failures on Azure still crop up from time to time even after fixes have been applied. Latest batch is on Standard_E2s_v3 instances.

04/26 21:05:17 DEBUG| utils:0116| Running '/home/azure/autotest/client/tmp/monotonic_time/src/time_test --duration 300 tsc_lfence'
04/26 21:10:17 INFO |monotonic_:0048| Time test command exit status: 1
04/26 21:10:17 ERROR| test:0414| Exception escaping from test:
Traceback (most recent call last):
  File "/home/azure/autotest/client/shared/test.py", line 411, in _exec
    _call_test_function(self.execute, *p_args, **p_dargs)
  File "/home/azure/autotest/client/shared/test.py", line 823, in _call_test_function
    return func(*args, **dargs)
  File "/home/azure/autotest/client/shared/test.py", line 291, in execute
    postprocess_profiled_run, args, dargs)
  File "/home/azure/autotest/client/shared/test.py", line 212, in _call_run_once
    self.run_once(*args, **dargs)
  File "/home/azure/autotest/client/tests/monotonic_time/monotonic_time.py", line 54, in run_once
    raise error.TestFail(line)
TestFail: FAIL: tsc_lfence-worst-warp=-3
04/26 21:10:17 ERROR| parallel:0033| child process failed
04/26 21:10:17 DEBUG| parallel:0037| Traceback (most recent call last):
04/26 21:10:17 DEBUG| parallel:0037| File "/home/azure/autotest/client/parallel.py", line 25, in fork_start
04/26 21:10:17 DEBUG| parallel:0037| l()
04/26 21:10:17 DEBUG| parallel:0037| File "/home/azure/autotest/client/job.py", line 505, in <lambda>
04/26 21:10:17 DEBUG| parallel:0037| l = lambda: test.runtest(self, url, tag, args, dargs)
04/26 21:10:17 DEBUG| parallel:0037| File "/home/azure/autotest/client/test.py", line 125, in runtest
04/26 21:10:17 DEBUG| parallel:0037| job.sysinfo.log_after_each_iteration)
04/26 21:10:17 DEBUG| parallel:0037| File "/home/azure/autotest/client/shared/test.py", line 913, in runtest
04/26 21:10:17 DEBUG| parallel:0037| mytest._exec(args, dargs)
04/26 21:10:17 DEBUG| parallel:0037| File "/home/azure/autotest/client/shared/test.py", line 411, in _exec
04/26 21:10:17 DEBUG| parallel:0037| _call_test_function(self.execute, *p_args, **p_dargs)
04/26 21:10:17 DEBUG| parallel:0037| File "/home/azure/autotest/client/shared/test.py", line 823, in _call_test_function
04/26 21:10:17 DEBUG| parallel:0037| return func(*args, **dargs)
04/26 21:10:17 DEBUG| parallel:0037| File "/home/azure/autotest/client/shared/test.py", line 291, in execute
04/26 21:10:17 DEBUG| parallel:0037| postprocess_profiled_run, args, dargs)
04/26 21:10:17 DEBUG| parallel:0037| File "/home/azure/autotest/client/shared/test.py", line 212, in _call_run_once
04/26 21:10:17 DEBUG| parallel:0037| self.run_once(*args, **dargs)
04/26 21:10:17 DEBUG| parallel:0037| File "/home/azure/autotest/client/tests/monotonic_time/monotonic_time.py", line 54, in run_once
04/26 21:10:17 DEBUG| parallel:0037| raise error.TestFail(line)
04/26 21:10:17 DEBUG| parallel:0037| TestFail: FAIL: tsc_lfence-worst-warp=-3
04/26 21:10:17 INFO | job:0215| FAIL monotonic_time.tsc monotonic_time.tsc timestamp=1556313017 localtime=Apr 26 21:10:17 FAIL: tsc_lfence-worst-warp=-3
04/26 21:10:17 INFO | ...


Sean Feole (sfeole) wrote :

Stats for Comment #3
Kernel: 5.0.0-1005.5-azure
Cloud: Azure
Series: Disco
