Race condition in metering agent when creating iptable managers for router namespaces
Affects | Status | Importance | Assigned to | Milestone |
---|---|---|---|---|
neutron | Fix Released | Low | Alexandru Sorodoc | |
Bug Description
Sometimes the metering agent fails to send meter information. When this happens and I SSH into the network node and run `ip netns exec qrouter-<router-id> iptables -nvL`, I see no metering rules in the output. Restarting the metering agent fixes this.
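The manual check above could be scripted roughly as follows. This is only a sketch: the helper names are made up for illustration, and the `neutron-meter` marker is an assumption about how the metering chains are named, so adjust it to whatever appears in your own `iptables -nvL` output. Running `check_router` needs root on the network node.

```python
import subprocess

# Assumption: metering chains carry this prefix in their names; adjust it
# if your deployment names them differently.
METERING_MARKER = "neutron-meter"


def build_check_command(router_id):
    """Return the argv that lists iptables rules inside a router namespace."""
    return ["ip", "netns", "exec", "qrouter-" + router_id, "iptables", "-nvL"]


def has_metering_rules(iptables_output, marker=METERING_MARKER):
    """Return True if any line of the iptables listing mentions the marker."""
    return any(marker in line for line in iptables_output.splitlines())


def check_router(router_id):
    """Run the listing (needs root on the network node) and inspect it."""
    result = subprocess.run(
        build_check_command(router_id),
        capture_output=True,
        text=True,
        check=True,
    )
    return has_metering_rules(result.stdout)
```

Calling `check_router("<router-id>")` after a reboot would return `False` in the failure case described here, assuming the marker matches the chain names on your node.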
I suspect that this is a race condition which happens when the metering agent is notified of a router before the L3 agent creates the namespaces for it. This causes the metering agent to not create an IptablesManager for the qrouter namespace and not add the metering rules (this happens in `neutron/
I tested this in the following manner:
1. Have a public router and a metering label with at least one rule attached. In my case I have two public routers (one distributed and the other centralized), a metering label with the ingress rule 0.0.0.0/0 and another metering label with the egress rule 0.0.0.0/0.
2. Have a single network node. This makes the test easier to control.
3. On the network node manually edit iptables_driver.py and add the following 2 if statements at the end of `RouterWithMete
if not self.iptables_
LOG.
if not self.snat_
LOG.
4. Set debug=True in /etc/neutron/
5. Reboot the network node.
6. After it boots up check the iptables on the qrouter namespace and if the metering rules are missing run `grep 'iptables manager' /var/log/
2018-12-06 07:55:26.103 1632 DEBUG neutron.
2018-12-06 07:55:26.104 1632 DEBUG neutron.
2018-12-06 07:55:26.158 1632 DEBUG neutron.
2018-12-06 07:55:26.159 1632 DEBUG neutron.
This confirms that when the metering agent started, the router namespaces did not exist yet, so the metering rules were not applied.
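To make the suspected ordering problem concrete, here is a minimal, self-contained simulation. It does not use Neutron code: `FakeHost` and `RouterMeteringState` are hypothetical names, and the lazy `ensure_manager` retry is just one possible way to tolerate a late namespace, not necessarily what the actual fix does.

```python
class FakeHost:
    """Stand-in for the network node: tracks which namespaces exist."""

    def __init__(self):
        self.namespaces = set()

    def namespace_exists(self, name):
        return name in self.namespaces


class RouterMeteringState:
    """Hypothetical per-router state; not Neutron's actual class."""

    def __init__(self, host, router_id):
        self.host = host
        self.ns_name = "qrouter-" + router_id
        self.manager = None
        self.applied = []
        # At this point the namespace may not exist yet (the race).
        self.ensure_manager()

    def ensure_manager(self):
        """Create the manager lazily; return True once it exists."""
        if self.manager is None and self.host.namespace_exists(self.ns_name):
            # Placeholder for a real iptables manager object.
            self.manager = "iptables-manager:" + self.ns_name
        return self.manager is not None

    def apply_rules(self, rules):
        """Apply rules if a manager is available; report success."""
        if not self.ensure_manager():
            return False  # namespace still missing; caller should retry later
        self.applied = list(rules)
        return True
```

If the state object is created before the namespace is added to `FakeHost.namespaces`, the first `apply_rules` call returns `False`; once the namespace appears, the next call creates the manager and applies the rules without restarting anything.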
We run a multi-node OpenStack Pike deployment on CentOS 7.4.1708 servers.
The network node runs the Neutron L3, DHCP, VPN, Metering, Metadata, and Open vSwitch agents, along with other Open vSwitch services.
tags: | added: metering |
tags: | added: l3-dvr-backlog |
Changed in neutron: | |
status: | New → Confirmed |
importance: | Undecided → Low |
I included a fix for this in my change to fix metering for DVR routers: https://review.openstack.org/#/c/621165/