Max jitter for 24 hours cyclictest in guest is larger than expected

Bug #1803615 reported by Ghada Khalil
Affects: StarlingX
Status: Fix Released
Importance: High
Assigned to: Alexander Kozyrev

Bug Description

Brief Description
-----------------
Max jitter for 24 hours cyclictest in guest is larger than expected

Severity
--------
Major

Steps to Reproduce
------------------
1. Create a flavor with "hw:cpu_realtime_mask": "^15,0-14":
nova flavor-show f.extra
+----------------------------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Property | Value |
+----------------------------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| OS-FLV-DISABLED:disabled | False |
| OS-FLV-EXT-DATA:ephemeral | 0 |
| disk | 8 |
| extra_specs | {"hw:cpu_policy": "dedicated", "hw:mem_page_size": "1048576", "hw:cpu_sockets": "1", "hw:cpu_model": "Passthrough", "hw:cpu_realtime_mask": "^15,0-14", "aggregate_instance_extra_specs:storage": "local_image", "hw:cpu_realtime": "yes"} |
| id | 712f30a9-b6ea-487b-b764-55c97af9ccda |
| name | f.extra |
| os-flavor-access:is_public | True |
| ram | 8192 |
| rxtx_factor | 1.0 |
| swap | |
| vcpus | 16 |
+----------------------------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
2. Start the VM with the above flavor.

3. Update the guest kernel command line:
rw root=LABEL=wrs_guest clocksource=pit console=tty0 console=ttyS0 biosdevname=0 net.ifnames=0 no_timer_check audit=0 cgroup_disable=memory isolcpus=1-14 irqaffinity=0 nmi_watchdog=0 softlockup_panic=0 intel_idle.max_cstate=0 processor.max_cstate=1 idle=poll mce=ignore_ce rcu_nocbs=1-14 kthread_cpus=0 nohz_full=1-14 initrd=initramfs.img BOOT_IMAGE=vmlinuz
4. Load the ptp_kvm module in the guest to synchronize its time with the host clock:
$ modprobe ptp_kvm
You can use the command "lsmod | grep ptp_kvm" to check whether the ptp_kvm module is already loaded.
You can also create a ptp_kvm.conf file so that the module is loaded automatically at boot:
$ echo ptp_kvm > /etc/modules-load.d/ptp_kvm.conf
Run the commands below to keep the guest time synchronized with the host time:
$ echo "refclock PHC /dev/ptp0 poll 3 dpoll -2 offset 0" >> /etc/chrony.conf
$ systemctl restart chronyd
$ chronyc sources | grep PHC0

5. Run cyclictest for 24 hours in the guest:
./cyclictest -S -p99 -n -m -d0 -A ffff
# /dev/cpu_dma_latency set to 0us
policy: fifo: loadavg: 0.00 0.01 0.05 1/300 5743

T: 0 ( 3749) P:99 I:1000 C:153698899 Min: 3 Act: 4 Avg: 4 Max: 349
T: 1 ( 3750) P:99 I:1000 C:153698900 Min: 4 Act: 7 Avg: 6 Max: 104
T: 2 ( 3751) P:99 I:1000 C:153698886 Min: 4 Act: 7 Avg: 6 Max: 85
T: 3 ( 3752) P:99 I:1000 C:153698887 Min: 4 Act: 7 Avg: 6 Max: 100
T: 4 ( 3753) P:99 I:1000 C:153698882 Min: 4 Act: 6 Avg: 6 Max: 93
T: 5 ( 3754) P:99 I:1000 C:153698879 Min: 4 Act: 7 Avg: 6 Max: 94
T: 6 ( 3755) P:99 I:1000 C:153698876 Min: 5 Act: 7 Avg: 6 Max: 94
T: 7 ( 3756) P:99 I:1000 C:153698873 Min: 4 Act: 7 Avg: 6 Max: 99
T: 8 ( 3757) P:99 I:1000 C:153698869 Min: 5 Act: 7 Avg: 7 Max: 99
T: 9 ( 3758) P:99 I:1000 C:153698866 Min: 4 Act: 7 Avg: 6 Max: 103
T:10 ( 3759) P:99 I:1000 C:153698863 Min: 5 Act: 7 Avg: 7 Max: 107
T:11 ( 3760) P:99 I:1000 C:153698860 Min: 4 Act: 7 Avg: 6 Max: 112
T:12 ( 3761) P:99 I:1000 C:153698858 Min: 4 Act: 6 Avg: 6 Max: 102
T:13 ( 3762) P:99 I:1000 C:153698858 Min: 4 Act: 7 Avg: 6 Max: 101
T:14 ( 3763) P:99 I:1000 C:153698858 Min: 4 Act: 7 Avg: 6 Max: 100
T:15 ( 3764) P:99 I:1000 C:153698858 Min: 4 Act: 7 Avg: 6 Max: 103
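
The guest command line in step 3 passes the same CPU list (1-14) to isolcpus, rcu_nocbs and nohz_full, leaving CPU 0 for housekeeping. A minimal Python sketch for checking that these lists agree is shown below. The helper names `parse_cpu_list` and `cpu_params` are hypothetical, and the sketch assumes plain CPU lists such as "1-14" or "0,2-3" without the optional flag prefixes that newer kernels accept.

```python
def parse_cpu_list(spec):
    """Expand a kernel-style CPU list such as "1-14" or "0,2-3" into a set of ints."""
    cpus = set()
    for part in spec.split(","):
        if "-" in part:
            lo, hi = part.split("-", 1)
            cpus.update(range(int(lo), int(hi) + 1))
        else:
            cpus.add(int(part))
    return cpus


def cpu_params(cmdline, names=("isolcpus", "rcu_nocbs", "nohz_full")):
    """Return {parameter: set of CPUs} for the CPU-list parameters found in cmdline."""
    found = {}
    for token in cmdline.split():
        key, sep, value = token.partition("=")
        if sep and key in names:
            found[key] = parse_cpu_list(value)
    return found


# Excerpt of the command line from step 3 (only the CPU-related parameters).
cmdline = "isolcpus=1-14 irqaffinity=0 rcu_nocbs=1-14 kthread_cpus=0 nohz_full=1-14"
params = cpu_params(cmdline)

# The lists are consistent when every parameter names the same CPU set.
consistent = len({frozenset(v) for v in params.values()}) == 1
print(sorted(params["isolcpus"]), consistent)
```

On a running guest the full command line could be read from /proc/cmdline instead of the hard-coded excerpt.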

Expected Behavior
------------------
Max jitter should be less than 20us.

Actual Behavior
----------------
Max jitter is around or above 100us on every thread (85us to 349us in the run above).
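
The comparison between the expected and actual values can be automated. The sketch below is one possible way to do it, not part of the original test procedure: it parses per-thread cyclictest lines in the format shown in step 5 and reports threads whose Max exceeds a threshold. The function name `threads_over` and the regular expression are assumptions; the 20us default comes from the expected behavior stated above.

```python
import re

# Matches cyclictest per-thread lines such as:
# T: 0 ( 3749) P:99 I:1000 C:153698899 Min: 3 Act: 4 Avg: 4 Max: 349
LINE_RE = re.compile(r"^T:\s*(\d+)\s*\(\s*\d+\)\s.*\bMax:\s*(\d+)")


def threads_over(output, threshold_us=20):
    """Return {thread index: max latency in us} for threads above threshold_us."""
    over = {}
    for line in output.splitlines():
        match = LINE_RE.match(line.strip())
        if match and int(match.group(2)) > threshold_us:
            over[int(match.group(1))] = int(match.group(2))
    return over


# Three lines copied from the cyclictest output in step 5.
sample = """\
T: 0 ( 3749) P:99 I:1000 C:153698899 Min: 3 Act: 4 Avg: 4 Max: 349
T: 2 ( 3751) P:99 I:1000 C:153698886 Min: 4 Act: 7 Avg: 6 Max: 85
T:10 ( 3759) P:99 I:1000 C:153698863 Min: 5 Act: 7 Avg: 7 Max: 107
"""
print(threads_over(sample))
```

Lines that do not match the per-thread format (the policy and /dev/cpu_dma_latency lines, for example) are ignored.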

Reproducibility
---------------
Reproducible

System Configuration
--------------------
Hardware/Software Configuration for compute node
Server model: Skylake-SP (Wolf Pass)
CPU model: Intel® Xeon® Gold 6148 CPU @ 2.40GHz
Socket number: 2
Cores per socket: 20
NUMA nodes: 2
NUMA node0 CPUs: 0-19
NUMA node1 CPUs: 20-39
Memory: DDR4 2666MHz 16Gx6

Pre-condition: disable SMI and set the Maximum Performance profile in the BIOS.

VM Guest OS: 3.10.0-693.21.1.rt56.639.el7.tis.42.x86_64

Branch/Pull Time/Commit
-----------------------
Any StarlingX build

Ghada Khalil (gkhalil)
Changed in starlingx:
assignee: nobody → Alex Kozyrev (akozyrev)
status: New → Triaged
importance: Undecided → High
tags: added: stx.2019.03 stx.config
Changed in starlingx:
status: Triaged → In Progress
OpenStack Infra (hudson-openstack) wrote : Fix merged to stx-config (master)

Reviewed: https://review.openstack.org/618308
Committed: https://git.openstack.org/cgit/openstack/stx-config/commit/?id=c3564823d4146fb0cabebb8d368b6f25218f3572
Submitter: Zuul
Branch: master

commit c3564823d4146fb0cabebb8d368b6f25218f3572
Author: Alex Kozyrev <email address hidden>
Date: Thu Nov 15 14:04:24 2018 -0500

    Disable QEMU memory balloon usage statistics.

    Set mem_stats_period_seconds in nova.conf to 0.
    It will disable QEMU memory balloon usage statistics.
    There are 2 reasons of doing that in StarlingX:
    1. StarlingX doesn’t support memory overcommit and adding
    QEMU memory balloon device doesn’t make any difference.
    2. QEMU memory balloon usage statistics interrupts a VM run
    and causes unacceptable jitter in cyclictest once in a while.

    Closes-bug: 1803615
    Change-Id: Iaea1962601755736688f2deb61730ab1d548b8b1
    Signed-off-by: Alex Kozyrev <email address hidden>
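
To confirm that the setting described in this commit is in effect on a compute node, the nova.conf value can be inspected. The sketch below is an illustration only: the function name `balloon_stats_disabled` is hypothetical, and it assumes the option lives in the [libvirt] section and that an unset option falls back to a period of 10 seconds (believed to be nova's default; check the nova configuration reference for your release).

```python
import configparser


def balloon_stats_disabled(conf_text):
    """Return True if mem_stats_period_seconds in [libvirt] is zero or negative."""
    parser = configparser.ConfigParser(interpolation=None, strict=False)
    parser.read_string(conf_text)
    # Assumption: an unset option behaves like a 10-second period.
    period = parser.getint("libvirt", "mem_stats_period_seconds", fallback=10)
    return period <= 0


# Illustrative nova.conf fragment with the value this commit sets.
sample_conf = """\
[libvirt]
mem_stats_period_seconds = 0
"""
print(balloon_stats_disabled(sample_conf))
```

In practice the text would be read from the nova.conf used by nova-compute rather than from an inline string.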

Changed in starlingx:
status: In Progress → Fix Released
OpenStack Infra (hudson-openstack) wrote : Fix proposed to stx-integ (master)

Fix proposed to branch: master
Review: https://review.openstack.org/624461

Ghada Khalil (gkhalil)
Changed in starlingx:
status: Fix Released → In Progress
OpenStack Infra (hudson-openstack) wrote : Fix merged to stx-integ (master)

Reviewed: https://review.openstack.org/624461
Committed: https://git.openstack.org/cgit/openstack/stx-integ/commit/?id=52bef031ac6e52c73a0a6a680b0ef31b99baac71
Submitter: Zuul
Branch: master

commit 52bef031ac6e52c73a0a6a680b0ef31b99baac71
Author: Alex Kozyrev <email address hidden>
Date: Tue Dec 11 13:42:07 2018 -0500

    Provide a way to set mem_stats_period_seconds in puppet-nova.

    There is no support of mem_stats_period_seconds in puppet-nova now.
    We need to add a way to set it to 0 to disable QEMU memory balloon statistics.
    The intention is to help with cyclictest spikes due to stats collection.

    Depends-On: Iaea1962601755736688f2deb61730ab1d548b8b1
    Change-Id: I1fe3dfede1a5a07ddb5adaff1095206ffe5f6340
    Closes-bug: 1803615
    Signed-off-by: Alex Kozyrev <email address hidden>

Changed in starlingx:
status: In Progress → Fix Released
OpenStack Infra (hudson-openstack) wrote : Fix proposed to stx-integ (f/centos76)

Fix proposed to branch: f/centos76
Review: https://review.openstack.org/625068

OpenStack Infra (hudson-openstack) wrote : Fix merged to stx-integ (f/centos76)

Reviewed: https://review.openstack.org/625068
Committed: https://git.openstack.org/cgit/openstack/stx-integ/commit/?id=e25c1acc9b5c77f1bab00288d74ca6df0a8640c3
Submitter: Zuul
Branch: f/centos76

commit 920fcb818c3dd8b0945e6d7bd2371dfb71790f60
Author: zhipengl <email address hidden>
Date: Wed Dec 12 19:42:40 2018 +0800

    Remove last patch of iscsi-initiator-utils

    As we see in the patch, it changes %dir to %ghost to avoid RPM audit.
    If we move the config file mod change to config package and use RPM
    instead of SRPM, we have no audit issue anymore and can ignore related
    change.
    Deployment test pass and related file check pass!

    Story: 2003768
    Task: 28459
    Depends-on: https://review.openstack.org/#/c/624584/

    Change-Id: Ic23ccd740520e1942b3118a84cb03aef5f388332
    Signed-off-by: zhipengl <email address hidden>

commit 52bef031ac6e52c73a0a6a680b0ef31b99baac71
Author: Alex Kozyrev <email address hidden>
Date: Tue Dec 11 13:42:07 2018 -0500

    Provide a way to set mem_stats_period_seconds in puppet-nova.

    There is no support of mem_stats_period_seconds in puppet-nova now.
    We need to add a way to set it to 0 to disable QEMU memory balloon statistics.
    The intention is to help with cyclictest spikes due to stats collection.

    Depends-On: Iaea1962601755736688f2deb61730ab1d548b8b1
    Change-Id: I1fe3dfede1a5a07ddb5adaff1095206ffe5f6340
    Closes-bug: 1803615
    Signed-off-by: Alex Kozyrev <email address hidden>

commit 01f5fdd274ac0bc02528b4630dacaf3ca10eb27a
Author: Steven Webster <email address hidden>
Date: Wed Dec 5 15:29:33 2018 -0500

    Traffic control: fix TC filters for vlan sub-interface

    Sometime after kernel 3.10.0-514.16.1.X, tc filter commands no longer
    match 802.1q packets when the filter protocol is set to 'ip'.

    This poses a problem for a consolidated (eg. infra w/ vlan over
    management) interface configuration.

    The tc filter will operate properly on the vlan interface, but all
    traffic will go to the default qdisc (low priority) when it arrives
    with a vlan tag at the sub-interface.

    This commit sets the filter protocol to '802.1q' in the case of a
    subinterface with a vlan tagged interface ontop of it.

    Some bashate cleanup has also been done on this file.

    Closes-Bug: #1807055
    Change-Id: I457faa2b56bbd270c104cc0313ffe3cc1bfd4db3
    Signed-off-by: Steven Webster <email address hidden>

commit 2ec4482fc766bd583df422c2df5939a2707c7996
Author: zhipengl <email address hidden>
Date: Tue Dec 11 22:51:33 2018 +0800

    Refactor meta patch for facter package

    Merge 2 meta patches as the first meta patch is just overwritted by
    second one.
    Build pass!

    Story: 2003768
    Task: 28458

    Change-Id: I02ccadafa5381c82bcace340f6c399af38aeecc7
    Signed-off-by: zhipengl <email address hidden>

commit 11a4f7a6964bd96f22a02f3394fc2d62447480fa
Author: Eric MacDonald <email address hidden>
Date: Mon Dec 10 19:02:18 2018 -0500

    Package log_functions.sh into platform-util

    The log_functions.sh script file wa...


tags: added: in-f-centos76
Ken Young (kenyis)
tags: added: stx.2019.05
removed: stx.2019.03
Ken Young (kenyis)
tags: added: stx.2.0
removed: stx.2019.05