Ingress bw-limit with DPDK does not work

Bug #1936839 reported by Bence Romsics
6
This bug affects 1 person
Affects Status Importance Assigned to Milestone
neutron
Fix Released
Medium
Bence Romsics

Bug Description

A colleague of mine working downstream found the following bug (his report follows with minor redactions of company-internal details). I'm going to push his proposed fix in a minute too.

In short, the inbound bandwidth limitation on vHost user ports doesn't seem to work. The value set with OpenStack QoS commands on the port isn't configured properly. The problem exists in OVS backend.

Creating a 1 Mbit/s limit rule:

openstack network qos rule create max_1_Mbps --type bandwidth-limit --max-kbps 1000 --max-burst-kbits 1000 --ingress

After applying to the port, you can query it on the compute:

compute-0-5:/home/ceeinfra # ovs-vsctl list qos
_uuid : c326ed8b-24ef-4f1f-a5b0-b20f3ca3297d
external_ids : {id=vhu84edf6c2-f0}
other_config : {cbs="125000.0", cir="125000.0"}
queues : {}
type : egress-policer

Note: the traffic is ingress from the VM point of view, and egress from OVS.

The values are not integers, they have a .0 at the end. In /var/log/openvswitch/ovs-vswitchd.log you can see that it is not accepted:

2021-07-15T12:36:23.121Z|00208|netdev_dpdk|ERR|Could not create rte meter for egress policer
2021-07-15T12:36:23.121Z|00209|netdev_dpdk|ERR|Failed to set QoS type egress-policer on port vhu84edf6c2-f0: Invalid argument
2021-07-15T12:36:23.126Z|00210|netdev_dpdk|ERR|Could not create rte meter for egress policer
2021-07-15T12:36:23.126Z|00211|netdev_dpdk|ERR|Failed to set QoS type egress-policer on port vhu84edf6c2-f0: Invalid argument

If you create a traffic between two VMs, the downloading one having the limitation applied on its port reports this:

root@bwtest2:~# nc 192.168.1.201 5555 | dd of=/dev/null status=progress
816316928 bytes (816 MB, 779 MiB) copied, 5 s, 163 MB/s^C
1863705+71 records in
1863738+0 records out
954233856 bytes (954 MB, 910 MiB) copied, 8.23046 s, 116 MB/s

The bandwidth is higher than the set 1 Mb/s.

It is possible to modify the OVS agent so it applies the bandwidth limit correctly. You have to find out where the Python scripts of the neutron_openvswitch_agent container are stored on the compute host. In our environment the file to modify is:

/var/lib/docker/overlay2/68653008fca0a6434adb3985b021b2329680b71b49859c3a028f951deed59df3/merged/usr/lib/python3.6/site-packages/neutron/agent/common/ovs_lib.py

In the _update_ingress_bw_limit_for_dpdk_port function, the original code is:

        # cir and cbs should be set in bytes instead of bits
        qos_other_config = {
            'cir': str(max_bw_in_bits / 8),
            'cbs': str(max_burst_in_bits / 8)
        }

If you modify the code to this:

        # cir and cbs should be set in bytes instead of bits
        qos_other_config = {
            'cir': str(int(max_bw_in_bits / 8)),
            'cbs': str(int(max_burst_in_bits / 8))
        }

the values passed to OVS will be integers. You can see the difference querying the new values after applying the limit on the ports again:

compute-0-5:/home/ceeinfra # ovs-vsctl list qos
_uuid : b93b1165-e839-4378-a6b7-b75c13ad0d41
external_ids : {id=vhu84edf6c2-f0}
other_config : {cbs="125000", cir="125000"}
queues : {}
type : egress-policer

They don't have the .0 at the and anymore, and OVS doesn't complain in the logs about invalid arguments. The bandwidth limitation between the computes now works:

root@bwtest2:~# nc 192.168.1.201 5555 | dd of=/dev/null status=progress
4095488 bytes (4.1 MB, 3.9 MiB) copied, 33 s, 123 kB/s^C
7274+1382 records in
8051+0 records out
4122112 bytes (4.1 MB, 3.9 MiB) copied, 33.4033 s, 123 kB/s

125 kB/s translates to 1 Mb/s that we have applied with the rule, so it works now.

My guess is that this problem comes from the different behaviour between the division in Python2 and Python3:

user@debian:~$ python2
Python 2.7.16 (default, Oct 10 2019, 22:02:15)
[GCC 8.3.0] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> 4/3
1
>>>
user@debian:~$ python3
Python 3.7.3 (default, Jan 22 2021, 20:04:44)
[GCC 8.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> 4/3
1.3333333333333333
>>>

Python3 doesn't round to integers, and OVS doesn't seem to accept floating point numbers.

I have seen this in multiple versions.

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to neutron (master)

Fix proposed to branch: master
Review: https://review.opendev.org/c/openstack/neutron/+/801310

Changed in neutron:
status: New → In Progress
Changed in neutron:
importance: Undecided → Medium
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to neutron (master)

Reviewed: https://review.opendev.org/c/openstack/neutron/+/801310
Committed: https://opendev.org/openstack/neutron/commit/8261b67b6e6248c5cc1a4426544188e5bd9bfa03
Submitter: "Zuul (22348)"
Branch: master

commit 8261b67b6e6248c5cc1a4426544188e5bd9bfa03
Author: Bence Romsics <email address hidden>
Date: Mon Jul 19 15:47:51 2021 +0200

    bw-limit: Pass int parameters to Open vSwitch

    Make sure we pass integer values to ovs when configuring bandwidth
    limit. This was likely working properly with Python2, and we may have
    missed this when migrating to Python3:

    https://www.python.org/dev/peps/pep-0238/

    Change-Id: I2f8d974d6644657aea95302d94ca0095d70a7e62
    Closes-Bug: #1936839
    Co-Authored-By: Tamás Trásy <email address hidden>

Changed in neutron:
status: In Progress → Fix Released
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/neutron 19.0.0.0rc1

This issue was fixed in the openstack/neutron 19.0.0.0rc1 release candidate.

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to neutron (stable/wallaby)

Fix proposed to branch: stable/wallaby
Review: https://review.opendev.org/c/openstack/neutron/+/820612

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to neutron (stable/wallaby)

Reviewed: https://review.opendev.org/c/openstack/neutron/+/820612
Committed: https://opendev.org/openstack/neutron/commit/38551777e027847e67e00b0bbcc920be98ab5d57
Submitter: "Zuul (22348)"
Branch: stable/wallaby

commit 38551777e027847e67e00b0bbcc920be98ab5d57
Author: Bence Romsics <email address hidden>
Date: Mon Jul 19 15:47:51 2021 +0200

    bw-limit: Pass int parameters to Open vSwitch

    Make sure we pass integer values to ovs when configuring bandwidth
    limit. This was likely working properly with Python2, and we may have
    missed this when migrating to Python3:

    https://www.python.org/dev/peps/pep-0238/

    Change-Id: I2f8d974d6644657aea95302d94ca0095d70a7e62
    Closes-Bug: #1936839
    Co-Authored-By: Tamás Trásy <email address hidden>
    (cherry picked from commit 8261b67b6e6248c5cc1a4426544188e5bd9bfa03)

tags: added: in-stable-wallaby
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to neutron (stable/victoria)

Fix proposed to branch: stable/victoria
Review: https://review.opendev.org/c/openstack/neutron/+/821895

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to neutron (stable/victoria)

Reviewed: https://review.opendev.org/c/openstack/neutron/+/821895
Committed: https://opendev.org/openstack/neutron/commit/82e2a3d3620feda0804bdf1af42c977f3d981435
Submitter: "Zuul (22348)"
Branch: stable/victoria

commit 82e2a3d3620feda0804bdf1af42c977f3d981435
Author: Bence Romsics <email address hidden>
Date: Mon Jul 19 15:47:51 2021 +0200

    bw-limit: Pass int parameters to Open vSwitch

    Make sure we pass integer values to ovs when configuring bandwidth
    limit. This was likely working properly with Python2, and we may have
    missed this when migrating to Python3:

    https://www.python.org/dev/peps/pep-0238/

    Change-Id: I2f8d974d6644657aea95302d94ca0095d70a7e62
    Closes-Bug: #1936839
    Co-Authored-By: Tamás Trásy <email address hidden>
    (cherry picked from commit 8261b67b6e6248c5cc1a4426544188e5bd9bfa03)
    (cherry picked from commit 38551777e027847e67e00b0bbcc920be98ab5d57)

tags: added: in-stable-victoria
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to neutron (stable/ussuri)

Fix proposed to branch: stable/ussuri
Review: https://review.opendev.org/c/openstack/neutron/+/822413

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to neutron (stable/train)

Fix proposed to branch: stable/train
Review: https://review.opendev.org/c/openstack/neutron/+/822934

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to neutron (stable/ussuri)

Reviewed: https://review.opendev.org/c/openstack/neutron/+/822413
Committed: https://opendev.org/openstack/neutron/commit/7687cf128030539f054b5719643f12dbf152e3b2
Submitter: "Zuul (22348)"
Branch: stable/ussuri

commit 7687cf128030539f054b5719643f12dbf152e3b2
Author: Bence Romsics <email address hidden>
Date: Mon Jul 19 15:47:51 2021 +0200

    bw-limit: Pass int parameters to Open vSwitch

    Make sure we pass integer values to ovs when configuring bandwidth
    limit. This was likely working properly with Python2, and we may have
    missed this when migrating to Python3:

    https://www.python.org/dev/peps/pep-0238/

    Change-Id: I2f8d974d6644657aea95302d94ca0095d70a7e62
    Closes-Bug: #1936839
    Co-Authored-By: Tamás Trásy <email address hidden>
    (cherry picked from commit 8261b67b6e6248c5cc1a4426544188e5bd9bfa03)
    (cherry picked from commit 38551777e027847e67e00b0bbcc920be98ab5d57)
    (cherry picked from commit 82e2a3d3620feda0804bdf1af42c977f3d981435)

tags: added: in-stable-ussuri
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to neutron (stable/train)

Reviewed: https://review.opendev.org/c/openstack/neutron/+/822934
Committed: https://opendev.org/openstack/neutron/commit/3d77a0596e28ee2bd3176279662331e44eb713cb
Submitter: "Zuul (22348)"
Branch: stable/train

commit 3d77a0596e28ee2bd3176279662331e44eb713cb
Author: Bence Romsics <email address hidden>
Date: Mon Jul 19 15:47:51 2021 +0200

    bw-limit: Pass int parameters to Open vSwitch

    Make sure we pass integer values to ovs when configuring bandwidth
    limit. This was likely working properly with Python2, and we may have
    missed this when migrating to Python3:

    https://www.python.org/dev/peps/pep-0238/

    Change-Id: I2f8d974d6644657aea95302d94ca0095d70a7e62
    Closes-Bug: #1936839
    Co-Authored-By: Tamás Trásy <email address hidden>
    (cherry picked from commit 8261b67b6e6248c5cc1a4426544188e5bd9bfa03)
    (cherry picked from commit 38551777e027847e67e00b0bbcc920be98ab5d57)
    (cherry picked from commit 82e2a3d3620feda0804bdf1af42c977f3d981435)
    (cherry picked from commit 7687cf128030539f054b5719643f12dbf152e3b2)

tags: added: in-stable-train
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/neutron 17.3.0

This issue was fixed in the openstack/neutron 17.3.0 release.

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/neutron 18.2.0

This issue was fixed in the openstack/neutron 18.2.0 release.

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/neutron train-eol

This issue was fixed in the openstack/neutron train-eol release.

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/neutron ussuri-eol

This issue was fixed in the openstack/neutron ussuri-eol release.

To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.