Race condition in metering agent when creating iptable managers for router namespaces

Bug #1807153 reported by Alexandru Sorodoc on 2018-12-06
This bug affects 1 person
Affects: neutron
Importance: Low
Assigned to: Alexandru Sorodoc

Bug Description

Sometimes the metering agent fails to send meter information. When this happens and I SSH into the network node and run `ip netns exec qrouter-<router-id> iptables -nvL`, I see no metering rules in the output. Restarting the metering agent fixes this.

I suspect that this is a race condition which happens when the metering agent is notified of a router before the L3 agent creates the namespaces for it. This causes the metering agent to not create an IptablesManager for the qrouter namespace and not add the metering rules (this happens in `neutron/services/metering/drivers/iptables/iptables_driver.py` in `RouterWithMetering.__init__`).
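
The suspected ordering problem can be illustrated with a small, self-contained sketch. This is not Neutron code: `existing_namespaces`, `FakeIptablesManager`, `RouterWithMeteringSketch` and `add_metering_rule` are hypothetical stand-ins, and the only behaviour assumed from the driver is that the manager is created once, at construction time, and only if the namespace already exists.

```python
# Hypothetical stand-ins; none of these names are Neutron APIs.
existing_namespaces = set()


class FakeIptablesManager:
    """Placeholder for an object that holds the rules of one namespace."""

    def __init__(self, namespace):
        self.namespace = namespace
        self.rules = []


class RouterWithMeteringSketch:
    def __init__(self, router):
        self.id = router['id']
        self.ns_name = 'qrouter-' + self.id
        self.iptables_manager = None
        # Assumed behaviour: the manager is created only if the namespace
        # already exists when the router is first seen, and never again.
        if self.ns_name in existing_namespaces:
            self.iptables_manager = FakeIptablesManager(self.ns_name)


def add_metering_rule(rm, rule):
    """Return True if the rule was added, False if it was silently skipped."""
    if rm.iptables_manager is None:
        return False
    rm.iptables_manager.rules.append(rule)
    return True


# The metering agent is notified of the router first ...
early = RouterWithMeteringSketch({'id': 'r1'})
# ... and only afterwards does the (simulated) L3 agent create the namespace.
existing_namespaces.add('qrouter-r1')
early_added = add_metering_rule(early, 'ingress 0.0.0.0/0')

# A router object built after the namespace exists behaves as expected,
# which is what a restart of the metering agent would amount to.
late = RouterWithMeteringSketch({'id': 'r1'})
late_added = add_metering_rule(late, 'ingress 0.0.0.0/0')
```

In this sketch `early_added` is `False` even though the namespace exists by the time the rule is added, while `late_added` is `True`, which matches the observation that restarting the metering agent restores the rules.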

I tested this in the following manner:
1. Have a public router and a metering label with at least one rule attached. In my case I have two public routers (one distributed and the other centralized), a metering label with the ingress rule 0.0.0.0/0 and another metering label with the egress rule 0.0.0.0/0.
2. Have a single network node. This makes the test easier to control.
3. On the network node manually edit iptables_driver.py and add the following 2 if statements at the end of `RouterWithMetering.__init__`:

if not self.iptables_manager:
    LOG.debug('Router %s has no iptables manager', router['name'])

if not self.snat_iptables_manager:
    LOG.debug('Router %s has no snat iptables manager', router['name'])

4. Set debug=True in /etc/neutron/metering_agent.ini on the network node.
5. Reboot the network node.
6. After it boots up, check the iptables rules in the qrouter namespace and, if the metering rules are missing, run `grep 'iptables manager' /var/log/neutron/metering-agent.log`. I get the following output:

2018-12-06 07:55:26.103 1632 DEBUG neutron.services.metering.drivers.iptables.iptables_driver [req-87aa420f-8bdb-48b8-be38-91e407a56a61 - - - - -] Router public_router1 has no iptables manager __init__ /usr/lib/python2.7/site-packages/neutron/services/metering/drivers/iptables/iptables_driver.py:104
2018-12-06 07:55:26.104 1632 DEBUG neutron.services.metering.drivers.iptables.iptables_driver [req-87aa420f-8bdb-48b8-be38-91e407a56a61 - - - - -] Router public_router1 has no snat iptables manager __init__ /usr/lib/python2.7/site-packages/neutron/services/metering/drivers/iptables/iptables_driver.py:107
2018-12-06 07:55:26.158 1632 DEBUG neutron.services.metering.drivers.iptables.iptables_driver [req-87aa420f-8bdb-48b8-be38-91e407a56a61 - - - - -] Router public_router2 has no iptables manager __init__ /usr/lib/python2.7/site-packages/neutron/services/metering/drivers/iptables/iptables_driver.py:104
2018-12-06 07:55:26.159 1632 DEBUG neutron.services.metering.drivers.iptables.iptables_driver [req-87aa420f-8bdb-48b8-be38-91e407a56a61 - - - - -] Router public_router2 has no snat iptables manager __init__ /usr/lib/python2.7/site-packages/neutron/services/metering/drivers/iptables/iptables_driver.py:107

This confirms that, at the time the metering agent started, the router namespaces didn't exist, so the metering rules weren't applied.
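
For a node with many routers, the debug lines from step 6 can be summarised per router. The following is a rough sketch that assumes the two debug messages added in step 3 and router names without whitespace; `routers_missing_managers` is a hypothetical helper, and the sample lines are shortened copies of the messages above.

```python
import re

# Matches the two debug messages added in step 3. Assumes router names
# contain no whitespace.
PATTERN = re.compile(r'Router (\S+) has no (snat )?iptables manager')


def routers_missing_managers(lines):
    """Map each router name to the set of manager kinds it is missing."""
    missing = {}
    for line in lines:
        match = PATTERN.search(line)
        if match:
            kind = 'snat' if match.group(2) else 'router'
            missing.setdefault(match.group(1), set()).add(kind)
    return missing


# Shortened copies of the log messages shown above.
sample = [
    'Router public_router1 has no iptables manager __init__',
    'Router public_router1 has no snat iptables manager __init__',
    'Router public_router2 has no iptables manager __init__',
    'Router public_router2 has no snat iptables manager __init__',
]
result = routers_missing_managers(sample)
```

To run it against the real log, pass an open file object for the metering agent log instead of `sample`.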

We are running a multi-node OpenStack Pike deployment on CentOS 7.4.1708 servers.
The network node runs the Neutron L3, DHCP, VPN, Metering, Metadata, and Open vSwitch agents, plus other Open vSwitch services.

Alexandru Sorodoc (bno1) on 2018-12-06
tags: added: metering
Alexandru Sorodoc (bno1) wrote :

I included a fix for this in my change to fix metering for DVR routers: https://review.openstack.org/#/c/621165/

tags: added: l3-dvr-backlog
Changed in neutron:
status: New → Confirmed
importance: Undecided → Low

Fix proposed to branch: master
Review: https://review.opendev.org/666970

Changed in neutron:
assignee: nobody → Alexandru Sorodoc (bno1)
status: Confirmed → In Progress

Reviewed: https://review.opendev.org/666970
Committed: https://git.openstack.org/cgit/openstack/neutron/commit/?id=9f541521bbbcf36325bfc250e3ce27a138ddef3c
Submitter: Zuul
Branch: master

commit 9f541521bbbcf36325bfc250e3ce27a138ddef3c
Author: bno1 <email address hidden>
Date: Sun Jun 23 00:51:02 2019 +0300

    Retry creating iptables managers and adding metering rules

    This change makes the metering agent retry creating the iptables
    managers for each router and applying the metering rules.
    This is needed in case the metering agent starts before some or all of
    the namespaces are created.

    Change-Id: Ifc565feb98c7f02df5c2831a3607c3e526a2e703
    Closes-Bug: #1807153
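
The retry idea described in the commit message can be sketched roughly as follows. This is not the merged patch (see the review linked above for the actual change); `RetryingRouterSketch`, `namespace_exists` and `sync` are hypothetical names, and the sketch only assumes that manager creation and rule application are attempted again on later passes instead of once at construction time.

```python
class RetryingRouterSketch:
    """Hypothetical model of a router whose manager is created lazily."""

    def __init__(self, router_id, namespace_exists):
        self.ns_name = 'qrouter-' + router_id
        # Callable that reports whether a namespace exists (an assumption
        # of this sketch, injected so the example is self-contained).
        self._namespace_exists = namespace_exists
        self.manager = None
        self.pending_rules = []
        self.applied_rules = []

    def _ensure_manager(self):
        # Retry creation every time instead of giving up after the first try.
        if self.manager is None and self._namespace_exists(self.ns_name):
            self.manager = object()  # stand-in for a real iptables manager
        return self.manager is not None

    def add_rule(self, rule):
        self.pending_rules.append(rule)
        self.sync()

    def sync(self):
        """Apply pending rules if possible; meant to be called repeatedly."""
        if not self._ensure_manager():
            return False
        self.applied_rules.extend(self.pending_rules)
        self.pending_rules = []
        return True


namespaces = set()
router = RetryingRouterSketch('r1', lambda ns: ns in namespaces)

# The rule arrives before the namespace exists, so it stays pending.
router.add_rule('egress 0.0.0.0/0')
applied_before = list(router.applied_rules)

# Once the namespace appears, the next pass creates the manager and
# applies the rule that was waiting.
namespaces.add('qrouter-r1')
retried = router.sync()
applied_after = list(router.applied_rules)
```

Keeping unapplied rules in a pending list means a later pass can catch up without the agent being restarted.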

Changed in neutron:
status: In Progress → Fix Released

This issue was fixed in the openstack/neutron 15.0.0.0b1 development milestone.

Reviewed: https://review.opendev.org/693183
Committed: https://git.openstack.org/cgit/openstack/neutron/commit/?id=ff07bacc9080ab5d904f4b82f5d16da4fd61df6e
Submitter: Zuul
Branch: stable/stein

commit ff07bacc9080ab5d904f4b82f5d16da4fd61df6e
Author: bno1 <email address hidden>
Date: Sun Jun 23 00:51:02 2019 +0300

    Retry creating iptables managers and adding metering rules

    This change makes the metering agent retry creating the iptables
    managers for each router and applying the metering rules.
    This is needed in case the metering agent starts before some or all of
    the namespaces are created.

    Change-Id: Ifc565feb98c7f02df5c2831a3607c3e526a2e703
    Closes-Bug: #1807153
    (cherry picked from commit 9f541521bbbcf36325bfc250e3ce27a138ddef3c)

tags: added: in-stable-stein

Reviewed: https://review.opendev.org/692910
Committed: https://git.openstack.org/cgit/openstack/neutron/commit/?id=689c1125efcf586daea8c2bbe0c8913f465d22c8
Submitter: Zuul
Branch: stable/rocky

commit 689c1125efcf586daea8c2bbe0c8913f465d22c8
Author: bno1 <email address hidden>
Date: Sun Jun 23 00:51:02 2019 +0300

    Retry creating iptables managers and adding metering rules

    This change makes the metering agent retry creating the iptables
    managers for each router and applying the metering rules.
    This is needed in case the metering agent starts before some or all of
    the namespaces are created.

    Change-Id: Ifc565feb98c7f02df5c2831a3607c3e526a2e703
    Closes-Bug: #1807153
    (cherry picked from commit 9f541521bbbcf36325bfc250e3ce27a138ddef3c)

tags: added: in-stable-rocky

This issue was fixed in the openstack/neutron 13.0.6 release.

This issue was fixed in the openstack/neutron 14.0.4 release.
