L3 agent notify error after HA router deletion

Bug #1480042 reported by LIU Yulong
This bug affects 3 people
Affects   Status    Importance   Assigned to   Milestone
neutron   Expired   Medium       Unassigned

Bug Description

$ uname -a
Linux compute02 3.10.0-229.el7.x86_64 #1 SMP Fri Mar 6 11:36:42 UTC 2015 x86_64 x86_64 x86_64 GNU/Linux

$ neutron-server --version
neutron-server 2015.1.0

Steps to reproduce:
1. Deploy two network nodes running the L3, DHCP, and metadata agents.
2. Use ML2 with OVS and GRE.
3. Enable neutron L3 HA.
4. Create an HA router.
5. Set the router gateway.
6. Add a router interface to a subnet.
7. Create an instance.
8. Associate a floating IP with the instance.
9. Disassociate and release the floating IP.
10. Clear the router gateway and remove the router interface from the subnet.
11. Delete the router.

Assume the master vRouter is running on host A. The second network node, host B, which holds the standby VRRP qrouter namespace, then logs the following errors:

l3-agent.log

2015-07-31 12:10:01.886 7017 ERROR neutron.agent.linux.utils [-]
Command: ['sudo', 'neutron-rootwrap', '/etc/neutron/rootwrap.conf', 'ip', 'netns', 'exec', 'qrouter-0678efdd-e6e7-4f4e-be5c-ac975929d2c1', 'iptables-save', '-c']
Exit code: 1
Stdin:
Stdout:
Stderr: Cannot open network namespace "qrouter-0678efdd-e6e7-4f4e-be5c-ac975929d2c1": No such file or directory

2015-07-31 12:10:01.887 7017 DEBUG oslo_concurrency.lockutils [-] Releasing file lock "/var/lib/neutron/lock/neutron-iptables-qrouter-0678efdd-e6e7-4f4e-be5c-ac975929d2c1" after holding it for 0.074s release /usr/lib/python2.7/site-packages/oslo_concurrency/lockutils.py:227
2015-07-31 12:10:01.887 7017 DEBUG oslo_concurrency.lockutils [-] Releasing semaphore "iptables-qrouter-0678efdd-e6e7-4f4e-be5c-ac975929d2c1" lock /usr/lib/python2.7/site-packages/oslo_concurrency/lockutils.py:404
2015-07-31 12:10:01.887 7017 DEBUG neutron.agent.linux.iptables_manager [-] Semaphore / lock released "iptables-qrouter-0678efdd-e6e7-4f4e-be5c-ac975929d2c1" _apply /usr/lib/python2.7/site-packages/neutron/agent/linux/iptables_manager.py:419
2015-07-31 12:10:01.888 7017 ERROR neutron.callbacks.manager [-] Error during notification for neutron.agent.metadata.driver.before_router_removed router, before_delete
2015-07-31 12:10:01.888 7017 TRACE neutron.callbacks.manager Traceback (most recent call last):
2015-07-31 12:10:01.888 7017 TRACE neutron.callbacks.manager File "/usr/lib/python2.7/site-packages/neutron/callbacks/manager.py", line 143, in _notify_loop
2015-07-31 12:10:01.888 7017 TRACE neutron.callbacks.manager callback(resource, event, trigger, **kwargs)
2015-07-31 12:10:01.888 7017 TRACE neutron.callbacks.manager File "/usr/lib/python2.7/site-packages/neutron/agent/metadata/driver.py", line 171, in before_router_removed
2015-07-31 12:10:01.888 7017 TRACE neutron.callbacks.manager router.iptables_manager.apply()
2015-07-31 12:10:01.888 7017 TRACE neutron.callbacks.manager File "/usr/lib/python2.7/site-packages/neutron/agent/linux/iptables_manager.py", line 407, in apply
2015-07-31 12:10:01.888 7017 TRACE neutron.callbacks.manager self._apply()
2015-07-31 12:10:01.888 7017 TRACE neutron.callbacks.manager File "/usr/lib/python2.7/site-packages/neutron/agent/linux/iptables_manager.py", line 417, in _apply
2015-07-31 12:10:01.888 7017 TRACE neutron.callbacks.manager return self._apply_synchronized()
2015-07-31 12:10:01.888 7017 TRACE neutron.callbacks.manager File "/usr/lib/python2.7/site-packages/neutron/agent/linux/iptables_manager.py", line 437, in _apply_synchronized
2015-07-31 12:10:01.888 7017 TRACE neutron.callbacks.manager all_tables = self.execute(args, run_as_root=True)
2015-07-31 12:10:01.888 7017 TRACE neutron.callbacks.manager File "/usr/lib/python2.7/site-packages/neutron/agent/linux/utils.py", line 137, in execute
2015-07-31 12:10:01.888 7017 TRACE neutron.callbacks.manager raise RuntimeError(m)
2015-07-31 12:10:01.888 7017 TRACE neutron.callbacks.manager RuntimeError:
2015-07-31 12:10:01.888 7017 TRACE neutron.callbacks.manager Command: ['sudo', 'neutron-rootwrap', '/etc/neutron/rootwrap.conf', 'ip', 'netns', 'exec', 'qrouter-0678efdd-e6e7-4f4e-be5c-ac975929d2c1', 'iptables-save', '-c']
2015-07-31 12:10:01.888 7017 TRACE neutron.callbacks.manager Exit code: 1
2015-07-31 12:10:01.888 7017 TRACE neutron.callbacks.manager Stdin:
2015-07-31 12:10:01.888 7017 TRACE neutron.callbacks.manager Stdout:
2015-07-31 12:10:01.888 7017 TRACE neutron.callbacks.manager Stderr: Cannot open network namespace "qrouter-0678efdd-e6e7-4f4e-be5c-ac975929d2c1": No such file or directory
2015-07-31 12:10:01.888 7017 TRACE neutron.callbacks.manager
2015-07-31 12:10:01.888 7017 TRACE neutron.callbacks.manager
2015-07-31 12:10:01.889 7017 INFO neutron.callbacks.manager [-] Notify callbacks for router, abort_delete
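
The stderr in the log above shows `ip netns exec` failing because the qrouter namespace no longer exists on the standby node. A minimal sketch for checking this by hand, assuming the usual iproute2 convention of keeping named namespaces under /var/run/netns. The helper name `namespace_exists` is hypothetical and is not part of neutron:

```python
import os

# Assumption: iproute2 keeps named network namespaces in this directory.
NETNS_DIR = "/var/run/netns"


def namespace_exists(name, netns_dir=NETNS_DIR):
    """Return True if an entry for the named namespace exists under netns_dir."""
    return os.path.exists(os.path.join(netns_dir, name))


if __name__ == "__main__":
    # Namespace name taken from the log above.
    ns = "qrouter-0678efdd-e6e7-4f4e-be5c-ac975929d2c1"
    print(ns, "exists" if namespace_exists(ns) else "is missing")
```

Running this on host B right after step 11 would indicate whether the namespace had already been removed when the callback fired.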

This message:
"Error during notification for neutron.agent.metadata.driver.before_router_removed router, before_delete"
came from the code:

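from neutron.callbacks import events
from neutron.callbacks import registry
from neutron.callbacks import resources


class MetadataDriver(object):

    def __init__(self, l3_agent):
        self.metadata_port = l3_agent.conf.metadata_port
        self.metadata_access_mark = l3_agent.conf.metadata_access_mark
        # Run after_router_added once a router has been created.
        registry.subscribe(
            after_router_added, resources.ROUTER, events.AFTER_CREATE)
        # Run before_router_removed just before a router is deleted.
        registry.subscribe(
            before_router_removed, resources.ROUTER, events.BEFORE_DELETE)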

The router resource was deleted, but it seems that the metadata driver was not notified.
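
To illustrate how a traceback like the one above could arise, here is a toy model of a callback registry whose BEFORE_DELETE subscriber touches a namespace that is already gone. This is only a sketch under stated assumptions: `CallbackManager`, `EXISTING_NAMESPACES`, `apply_iptables`, and `guarded_before_router_removed` are hypothetical names, and none of this is neutron's actual implementation or a confirmed fix.

```python
import collections


class CallbackManager:
    """Toy stand-in for a callback registry (not neutron's code)."""

    def __init__(self):
        self._callbacks = collections.defaultdict(list)

    def subscribe(self, callback, resource, event):
        self._callbacks[(resource, event)].append(callback)

    def notify(self, resource, event, trigger, **kwargs):
        """Call every subscriber and collect the errors instead of raising."""
        errors = []
        for callback in self._callbacks[(resource, event)]:
            try:
                callback(resource, event, trigger, **kwargs)
            except Exception as exc:
                errors.append((callback.__name__, exc))
        return errors


# Hypothetical state on the standby node: only this namespace still exists.
EXISTING_NAMESPACES = {"qrouter-present"}


def apply_iptables(namespace):
    """Pretend to run iptables-save inside the namespace."""
    if namespace not in EXISTING_NAMESPACES:
        raise RuntimeError(
            'Cannot open network namespace "%s"' % namespace)


def before_router_removed(resource, event, trigger, **kwargs):
    # Fails when the namespace was removed before the callback ran.
    apply_iptables(kwargs["namespace"])


def guarded_before_router_removed(resource, event, trigger, **kwargs):
    # One possible mitigation: skip the cleanup if the namespace is gone.
    namespace = kwargs["namespace"]
    if namespace not in EXISTING_NAMESPACES:
        return
    apply_iptables(namespace)


if __name__ == "__main__":
    manager = CallbackManager()
    manager.subscribe(before_router_removed, "router", "before_delete")
    errors = manager.notify("router", "before_delete", None,
                            namespace="qrouter-gone")
    for name, exc in errors:
        print("Error during notification for %s: %s" % (name, exc))
```

In this model the unguarded callback produces one collected error for a missing namespace, while the guarded variant returns without touching it. Whether a similar ordering problem exists in the real agent is not established by this report.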

Tags: l3-ha
LIU Yulong (dragon889)
summary: - l3 notify error afger ha router delete
+ l3 notify error after ha router delete
summary: - l3 notify error after ha router delete
+ L3 agent notify error after HA router deletion
LIU Yulong (dragon889)
description: updated
Matt Fischer (mfisch)
Changed in neutron:
status: New → Confirmed
Assaf Muller (amuller)
tags: added: l3-ha
Changed in neutron:
importance: Undecided → Medium
Assaf Muller (amuller) wrote :

Sorry, I couldn't reproduce this on master. I checked, and the relevant code doesn't seem to have changed much from Kilo to Liberty or current master. I also switched to stable/kilo and ran a functional test that created and immediately deleted an HA router (while it was still standby, not master), and I didn't see any traces.

Changed in neutron:
status: Confirmed → Incomplete
Manjeet Singh Bhatia (manjeet-s-bhatia) wrote :

I tried this on a two-node devstack. I was able to delete the router without getting an error.
I don't think it is a valid bug.

Launchpad Janitor (janitor) wrote :

[Expired for neutron because there has been no activity for 60 days.]

Changed in neutron:
status: Incomplete → Expired