L3 agent failure to setup floating IPs

Bug #1640202 reported by Nicolas Vila
This bug affects 5 people
Affects   Status      Importance   Assigned to   Milestone
neutron   Won't Fix   Undecided    Unassigned

Bug Description

Hello,

We have a Mitaka deployment with several projects running, with 3 controller nodes and 4 compute nodes. After restarting the L3 agent, the floating IPs associated with instances in one specific tenant are not actually being set up, and the L3 agent throws the following error:

2016-11-08 08:56:02.061 12466 ERROR neutron.agent.l3.router_info [-] L3 agent failure to setup floating IPs
2016-11-08 08:56:02.061 12466 ERROR neutron.agent.l3.router_info Traceback (most recent call last):
2016-11-08 08:56:02.061 12466 ERROR neutron.agent.l3.router_info File "/usr/lib/python2.7/dist-packages/neutron/agent/l3/router_info.py", line 342, in configure_fip_addresses
2016-11-08 08:56:02.061 12466 ERROR neutron.agent.l3.router_info return self.process_floating_ip_addresses(interface_name)
2016-11-08 08:56:02.061 12466 ERROR neutron.agent.l3.router_info File "/usr/lib/python2.7/dist-packages/neutron/agent/l3/router_info.py", line 314, in process_floating_ip_addresses
2016-11-08 08:56:02.061 12466 ERROR neutron.agent.l3.router_info fip, interface_name, device)
2016-11-08 08:56:02.061 12466 ERROR neutron.agent.l3.router_info File "/usr/lib/python2.7/dist-packages/neutron/agent/l3/dvr_local_router.py", line 157, in add_floating_ip
2016-11-08 08:56:02.061 12466 ERROR neutron.agent.l3.router_info self.floating_ip_added_dist(fip, ip_cidr)
2016-11-08 08:56:02.061 12466 ERROR neutron.agent.l3.router_info File "/usr/lib/python2.7/dist-packages/neutron/agent/l3/dvr_local_router.py", line 88, in floating_ip_added_dist
2016-11-08 08:56:02.061 12466 ERROR neutron.agent.l3.router_info device.route.add_route(fip_cidr, str(rtr_2_fip.ip))
2016-11-08 08:56:02.061 12466 ERROR neutron.agent.l3.router_info File "/usr/lib/python2.7/dist-packages/neutron/agent/linux/ip_lib.py", line 841, in add_route
2016-11-08 08:56:02.061 12466 ERROR neutron.agent.l3.router_info self._run_as_root_detect_device_not_found([ip_version], tuple(args))
2016-11-08 08:56:02.061 12466 ERROR neutron.agent.l3.router_info File "/usr/lib/python2.7/dist-packages/neutron/agent/linux/ip_lib.py", line 699, in _run_as_root_detect_device_not_found
2016-11-08 08:56:02.061 12466 ERROR neutron.agent.l3.router_info raise exceptions.DeviceNotFoundError(device_name=self.name)
2016-11-08 08:56:02.061 12466 ERROR neutron.agent.l3.router_info File "/usr/lib/python2.7/dist-packages/oslo_utils/excutils.py", line 220, in __exit__
2016-11-08 08:56:02.061 12466 ERROR neutron.agent.l3.router_info self.force_reraise()
2016-11-08 08:56:02.061 12466 ERROR neutron.agent.l3.router_info File "/usr/lib/python2.7/dist-packages/oslo_utils/excutils.py", line 196, in force_reraise
2016-11-08 08:56:02.061 12466 ERROR neutron.agent.l3.router_info six.reraise(self.type_, self.value, self.tb)
2016-11-08 08:56:02.061 12466 ERROR neutron.agent.l3.router_info File "/usr/lib/python2.7/dist-packages/neutron/agent/linux/ip_lib.py", line 694, in _run_as_root_detect_device_not_found
2016-11-08 08:56:02.061 12466 ERROR neutron.agent.l3.router_info return self._as_root(*args, **kwargs)
2016-11-08 08:56:02.061 12466 ERROR neutron.agent.l3.router_info File "/usr/lib/python2.7/dist-packages/neutron/agent/linux/ip_lib.py", line 361, in _as_root
2016-11-08 08:56:02.061 12466 ERROR neutron.agent.l3.router_info use_root_namespace=use_root_namespace)
2016-11-08 08:56:02.061 12466 ERROR neutron.agent.l3.router_info File "/usr/lib/python2.7/dist-packages/neutron/agent/linux/ip_lib.py", line 94, in _as_root
2016-11-08 08:56:02.061 12466 ERROR neutron.agent.l3.router_info log_fail_as_error=self.log_fail_as_error)
2016-11-08 08:56:02.061 12466 ERROR neutron.agent.l3.router_info File "/usr/lib/python2.7/dist-packages/neutron/agent/linux/ip_lib.py", line 103, in _execute
2016-11-08 08:56:02.061 12466 ERROR neutron.agent.l3.router_info log_fail_as_error=log_fail_as_error)
2016-11-08 08:56:02.061 12466 ERROR neutron.agent.l3.router_info File "/usr/lib/python2.7/dist-packages/neutron/agent/linux/utils.py", line 140, in execute
2016-11-08 08:56:02.061 12466 ERROR neutron.agent.l3.router_info raise RuntimeError(msg)
2016-11-08 08:56:02.061 12466 ERROR neutron.agent.l3.router_info RuntimeError: Exit code: 2; Stdin: ; Stdout: ; Stderr: RTNETLINK answers: Network is unreachable
2016-11-08 08:56:02.061 12466 ERROR neutron.agent.l3.router_info
2016-11-08 08:56:02.061 12466 ERROR neutron.agent.l3.router_info
2016-11-08 08:56:02.063 12466 ERROR neutron.agent.l3.router_info [-] Failed to process floating IPs.
2016-11-08 08:56:02.063 12466 ERROR neutron.agent.l3.router_info Traceback (most recent call last):
2016-11-08 08:56:02.063 12466 ERROR neutron.agent.l3.router_info File "/usr/lib/python2.7/dist-packages/neutron/agent/l3/router_info.py", line 815, in process_external
2016-11-08 08:56:02.063 12466 ERROR neutron.agent.l3.router_info fip_statuses = self.configure_fip_addresses(interface_name)
2016-11-08 08:56:02.063 12466 ERROR neutron.agent.l3.router_info File "/usr/lib/python2.7/dist-packages/neutron/agent/l3/router_info.py", line 347, in configure_fip_addresses
2016-11-08 08:56:02.063 12466 ERROR neutron.agent.l3.router_info raise n_exc.FloatingIpSetupException(msg)
2016-11-08 08:56:02.063 12466 ERROR neutron.agent.l3.router_info FloatingIpSetupException: L3 agent failure to setup floating IPs
2016-11-08 08:56:02.063 12466 ERROR neutron.agent.l3.router_info

As of now, every time we try to attach/detach the floating IP on an instance the L3 agent shows the same error. Neutron package versions are:

ii neutron-common 2:8.2.0-0ubuntu1~cloud0
ii neutron-dhcp-agent 2:8.2.0-0ubuntu1~cloud0
ii neutron-l3-agent 2:8.2.0-0ubuntu1~cloud0
ii neutron-metadata-agent 2:8.2.0-0ubuntu1~cloud0
ii neutron-openvswitch-agent 2:8.2.0-0ubuntu1~cloud0
ii neutron-plugin-ml2 2:8.2.0-0ubuntu1~cloud0

OS: Ubuntu 14.04, uname: Linux compute01 3.16.0-77-generic #99~14.04.1-Ubuntu SMP Tue Jun 28 19:17:10 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux

Tags: l3-ipam-dhcp
Nicolas Vila (nvlan) wrote :

To follow up with a little more detail: we have two external networks, and there are working floating IPs on the project. The problem is in the FIP namespace. When I attach a floating IP, the route that sends traffic for that floating IP to the corresponding fpr interface is not created.

The following paste, http://paste.openstack.org/show/588391/, shows the FIP namespace with its fpr interfaces and the existing routes for the currently working floating IPs, as well as the rfp interface on the router. Everything that existed before the L3 agent restart still works, but the new route is not being created (if we add the route by hand in the namespace, traffic DOES work).
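A minimal sketch of how the missing host routes could be spotted, assuming the output of `ip -4 route` inside the FIP namespace has been captured as text. The function name `missing_fip_routes` and the sample route line (including the address 203.0.113.10 and its next hop) are hypothetical and are not taken from the paste; only 177.54.158.167 comes from this report.

```python
import ipaddress
from typing import List


def missing_fip_routes(route_output: str, floating_ips: List[str]) -> List[str]:
    """Return the floating IPs that have no host (/32) route in route_output."""
    routed = set()
    for line in route_output.splitlines():
        fields = line.split()
        if not fields:
            continue
        try:
            # `ip route` prints host routes without a prefix length,
            # so a bare address parses as a /32 network here.
            dest = ipaddress.ip_network(fields[0], strict=False)
        except ValueError:
            # Skips "default", "unreachable", and other non-address keywords.
            continue
        if dest.prefixlen == 32:
            routed.add(str(dest.network_address))
    return [ip for ip in floating_ips if ip not in routed]


# Hypothetical sample of FIP-namespace routes (not copied from the paste).
SAMPLE_ROUTES = """\
203.0.113.10 via 169.254.122.146 dev fpr-94e0d76c-3
169.254.122.146/31 dev fpr-94e0d76c-3 proto kernel scope link src 169.254.122.147
"""

print(missing_fip_routes(SAMPLE_ROUTES, ["203.0.113.10", "177.54.158.167"]))
```

With this hypothetical sample, only 177.54.158.167 is reported as lacking a route, which mirrors the symptom described above.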

Brian Haley (brian-haley) wrote :

So I see you're using the Ubuntu cloud packages. Is it possible to upgrade to the latest ones? I believe stable/mitaka is at 8.3.0 and we might have already fixed this bug.

I am curious which device it thinks is missing. The "device not found" error seems to imply that one of the fpr devices is missing, but it's in your paste. You might have to enable debug=True so that more info is printed, which might also include the command being run.

Nicolas Vila (nvlan) wrote :

Hello Brian,

As per https://wiki.ubuntu.com/OpenStack/CloudArchive, adding the Mitaka repository with "add-apt-repository cloud-archive:mitaka" only allows installing 8.2.0, not 8.3.0. Am I doing something wrong? As for the debug logs, I'll collect them and upload them in a bit. The command I'm executing is simply a 'neutron floatingip-associate'.

Thanks, regards.

Nicolas Vila (nvlan) wrote :

Hello Brian,

Here's the L3 log with debug=True: http://paste.openstack.org/show/588427/
The floating IP I'm trying to attach is 177.54.158.167, and the command that is failing is 'sudo ip netns exec fip-4e4c1d8d-9030-4239-9361-1736b259bb2c ip -4 route replace 177.54.158.167/32 via 169.254.109.46 dev fpr-94e0d76c-3'. However, the fpr-94e0d76c-3 interface is on a different network:

root@compute01:~# ifconfig fpr-94e0d76c-3
fpr-94e0d76c-3 Link encap:Ethernet HWaddr 42:ea:8b:6e:96:46
          inet addr:169.254.122.147 Bcast:0.0.0.0 Mask:255.255.255.254
          inet6 addr: fe80::40ea:8bff:fe6e:9646/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
          RX packets:103092 errors:0 dropped:0 overruns:0 frame:0
          TX packets:112038 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:16390657 (16.3 MB) TX bytes:31697960 (31.6 MB)

That network 169.254.109.46/31 is on the other FIP namespace:

root@compute01:~# ip net exec fip-60dade73-f046-434b-8232-9c5f430333a7 ifconfig fpr-38efd0e0-2
fpr-38efd0e0-2 Link encap:Ethernet HWaddr 6e:09:b4:3d:bc:db
          inet addr:169.254.109.47 Bcast:0.0.0.0 Mask:255.255.255.254
          inet6 addr: fe80::6c09:b4ff:fe3d:bcdb/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
          RX packets:16885711 errors:0 dropped:0 overruns:0 frame:0
          TX packets:17075289 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:22248001183 (22.2 GB) TX bytes:22412370525 (22.4 GB)

Thanks, regards.
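The mismatch described above would plausibly explain the "Network is unreachable" error, since the kernel normally rejects a route whose next hop is not on a network directly connected to the named device. A minimal sketch of that on-link check, using the addresses from the ifconfig output above (the helper name `next_hop_is_on_link` is hypothetical, and the /31 prefix is derived from the reported mask 255.255.255.254):

```python
import ipaddress


def next_hop_is_on_link(device_cidr: str, next_hop: str) -> bool:
    """Return True if next_hop lies inside the subnet configured on the device."""
    network = ipaddress.ip_interface(device_cidr).network
    return ipaddress.ip_address(next_hop) in network


# fpr-94e0d76c-3 carries 169.254.122.147/31, but the route uses 169.254.109.46.
print(next_hop_is_on_link("169.254.122.147/31", "169.254.109.46"))  # False

# fpr-38efd0e0-2 in the other FIP namespace carries 169.254.109.47/31.
print(next_hop_is_on_link("169.254.109.47/31", "169.254.109.46"))  # True
```

The next hop the agent is using belongs to the link in the other FIP namespace, which is consistent with the route being computed against the wrong external network.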

tags: added: l3-ipam-dhcp
Brian Haley (brian-haley) wrote :

Hello Nicolas,

Thanks for the info about the multiple external networks; that is most likely the problem. As a workaround, can you put both subnets on the same network in your configuration? I'm just trying to get things working while we investigate further.

Nicolas Vila (nvlan) wrote :

Hello Brian,

I'm afraid that won't be possible. This issue is happening on a production customer deployment that had this two-network setup working until we had to restart the l3-agent. Correct me if I'm mistaken, but changing the subnet now would require removing all ports from it first (thus terminating several instances). I will make a note that, if we need to create a second external network in a different deployment, we should create both subnets on the same network.

Please let me know if I can assist with any testing.

Thanks, kind regards.

Changed in neutron:
status: New → Confirmed
Piotr Parczewski (pparczewski) wrote :

Hi,

I have exactly the same traceback in a slightly different environment: only 1 external network and Cloud Archive neutron 9.0.0-0ubuntu1~cloud0 packages. The floating IP goes into ERROR status after being associated with an LBaaS VIP port. The same IP works when attached to an instance.

Rodolfo Alonso (rodolfo-alonso-hernandez) wrote :

Bug closed due to lack of activity, please feel free to reopen if needed.

Changed in neutron:
status: Confirmed → Won't Fix