Tempest floatingip scenario tests failing on DVR Multinode setup with HA

Bug #1717302 reported by Swaminathan Vasudevan
This bug affects 4 people
Affects: neutron
Status: Fix Released
Importance: High
Assigned to: Slawek Kaplonski

Bug Description

neutron.tests.tempest.scenario.test_floatingip.FloatingIpSameNetwork and
neutron.tests.tempest.scenario.test_floatingip.FloatingIpSeparateNetwork are failing on every patch.

This trace is seen on the node-2 l3-agent.

Sep 13 07:16:43.404250 ubuntu-xenial-2-node-rax-dfw-10909819-895688 neutron-keepalived-state-change[5461]: ERROR neutron.agent.linux.ip_lib [-] Failed sending gratuitous ARP to 172.24.5.3 on qg-bf79c157-e2 in namespace qrouter-796b8715-ca01-43ad-bc08-f81a0b4db8cc: Exit code: 2; Stdin: ; Stdout: ; Stderr: bind: Cannot assign requested address
: ProcessExecutionError: Exit code: 2; Stdin: ; Stdout: ; Stderr: bind: Cannot assign requested address
ERROR neutron.agent.linux.ip_lib Traceback (most recent call last):
ERROR neutron.agent.linux.ip_lib File "/opt/stack/new/neutron/neutron/agent/linux/ip_lib.py", line 1082, in _arping
ERROR neutron.agent.linux.ip_lib ip_wrapper.netns.execute(arping_cmd, extra_ok_codes=[1])
ERROR neutron.agent.linux.ip_lib File "/opt/stack/new/neutron/neutron/agent/linux/ip_lib.py", line 901, in execute
ERROR neutron.agent.linux.ip_lib log_fail_as_error=log_fail_as_error, **kwargs)
ERROR neutron.agent.linux.ip_lib File "/opt/stack/new/neutron/neutron/agent/linux/utils.py", line 151, in execute
ERROR neutron.agent.linux.ip_lib raise ProcessExecutionError(msg, returncode=returncode)
ERROR neutron.agent.linux.ip_lib ProcessExecutionError: Exit code: 2; Stdin: ; Stdout: ; Stderr: bind: Cannot assign requested address
ERROR neutron.agent.linux.ip_lib
ERROR neutron.agent.linux.ip_lib

If this is a DVR router, then the GARP should not go through the qg interface for the floatingIP.

More information can be seen here.

http://logs.openstack.org/43/500143/5/check/gate-tempest-dsvm-neutron-dvr-multinode-scenario-ubuntu-xenial-nv/0a58fce/logs/subnode-2/screen-q-l3.txt.gz?level=TRACE#_Sep_13_07_16_47_864052

summary: - Tempest floatingip scenario tests failing on DVR Multinode setup
+ Tempest floatingip scenario tests failing on DVR Multinode setup with HA
Swaminathan Vasudevan (swaminathan-vasudevan) wrote :

http://logs.openstack.org/30/503530/6/check/gate-tempest-dsvm-neutron-dvr-multinode-scenario-ubuntu-xenial-nv/ba5131c/logs/subnode-2/screen-q-l3.txt.gz?level=DEBUG#_Sep_14_18_31_35_136440

This trace is also seen in the debug logs on Node 2.

Sep 14 20:26:41.262749 ubuntu-xenial-2-node-rax-dfw-10937231-899019 neutron-l3-agent[8011]: Traceback (most recent call last):
Sep 14 20:26:41.262909 ubuntu-xenial-2-node-rax-dfw-10937231-899019 neutron-l3-agent[8011]: File "/usr/local/lib/python2.7/dist-packages/eventlet/hubs/hub.py", line 457, in fire_timers
Sep 14 20:26:41.263056 ubuntu-xenial-2-node-rax-dfw-10937231-899019 neutron-l3-agent[8011]: timer()
Sep 14 20:26:41.263191 ubuntu-xenial-2-node-rax-dfw-10937231-899019 neutron-l3-agent[8011]: File "/usr/local/lib/python2.7/dist-packages/eventlet/hubs/timer.py", line 58, in __call__
Sep 14 20:26:41.263333 ubuntu-xenial-2-node-rax-dfw-10937231-899019 neutron-l3-agent[8011]: cb(*args, **kw)
Sep 14 20:26:41.263468 ubuntu-xenial-2-node-rax-dfw-10937231-899019 neutron-l3-agent[8011]: File "/opt/stack/new/neutron/neutron/agent/linux/ip_lib.py", line 1124, in arping
Sep 14 20:26:41.263599 ubuntu-xenial-2-node-rax-dfw-10937231-899019 neutron-l3-agent[8011]: _arping(ns_name, iface_name, address, count, log_exception)
Sep 14 20:26:41.263731 ubuntu-xenial-2-node-rax-dfw-10937231-899019 neutron-l3-agent[8011]: File "/opt/stack/new/neutron/neutron/agent/linux/ip_lib.py", line 1060, in _arping
Sep 14 20:26:41.263870 ubuntu-xenial-2-node-rax-dfw-10937231-899019 neutron-l3-agent[8011]: for i in range(count):
Sep 14 20:26:41.264015 ubuntu-xenial-2-node-rax-dfw-10937231-899019 neutron-l3-agent[8011]: TypeError: range() integer end argument expected, got ConfigOpts.
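
The TypeError above is what Python raises when something other than an integer ends up in the count position of range(). The following is a minimal sketch of that class of error and of one defensive check. FakeConfigOpts, arping_unchecked and arping_checked are hypothetical stand-ins and not neutron code; the log does not show how the config object actually reached the count argument.

```python
class FakeConfigOpts(object):
    """Hypothetical stand-in for a configuration object (not oslo.config)."""


def arping_unchecked(address, count):
    # Mirrors the failing pattern from the traceback: range() needs an integer.
    return ["arping %s #%d" % (address, i) for i in range(count)]


def arping_checked(address, count):
    # Defensive variant: reject non-integers early and name the offending type.
    if isinstance(count, bool) or not isinstance(count, int):
        raise TypeError("count must be an int, got %s" % type(count).__name__)
    return arping_unchecked(address, count)


if __name__ == "__main__":
    try:
        arping_unchecked("172.24.5.3", FakeConfigOpts())
    except TypeError as exc:
        print("unchecked call failed: %s" % exc)
    print(arping_checked("172.24.5.3", 3))
```

A check like this would not fix a mis-ordered call, but it would make the log point at the wrong argument directly instead of at range().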

Swaminathan Vasudevan (swaminathan-vasudevan) wrote :

Sep 13 07:16:43.404250 ubuntu-xenial-2-node-rax-dfw-10909819-895688 neutron-keepalived-state-change[5461]: ERROR neutron.agent.linux.ip_lib [-] Failed sending gratuitous ARP to 172.24.5.3 on qg-bf79c157-e2 in namespace qrouter-796b8715-ca01-43ad-bc08-f81a0b4db8cc: Exit code: 2; Stdin: ; Stdout: ; Stderr: bind: Cannot assign requested address

The above trace is seen after a DVR router is migrated to a legacy HA router (that is my understanding). Maybe @Anilvenkata can comment on this.

Swaminathan Vasudevan (swaminathan-vasudevan) wrote :

@anil-venkata can you comment on this bug?
I am trying to understand this scenario test case.
We are basically using a two-node setup, with one DVR node and one DVR_SNAT node.
With DVR, HA should not be enabled, since we only have one DVR_SNAT node.
If it can be enabled, then how are we testing the master/slave snat_namespace here?

Also, the issue I am seeing consistently is 'ip address 172.24.5.4/32 dev qg-c4da18e0-db, no longer exist'.

And this is trying to send the ARP message: 'Failed sending gratuitous ARP to 172.24.5.4 on qg-c4da18e0-db in namespace qrouter-617bfb93-834e-4cf8-9f9d-521279f4f580'

DVR routers do not create a qg- interface in the qrouter namespace.
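
To check how often failed GARPs target a qg- device inside a qrouter namespace, the address, device and namespace can be pulled out of the log message. This is a rough sketch; the regular expression is built only from the message text quoted in this report, and both function names are hypothetical.

```python
import re

# Pattern derived from the "Failed sending gratuitous ARP" message above.
GARP_FAILURE = re.compile(
    r"Failed sending gratuitous ARP to (?P<address>\S+) "
    r"on (?P<device>\S+) in namespace (?P<namespace>[\w-]+)")


def parse_garp_failure(line):
    """Return a dict with address, device and namespace, or None."""
    match = GARP_FAILURE.search(line)
    return match.groupdict() if match else None


def is_qg_in_qrouter(info):
    """True when the GARP was attempted on a qg- device in a qrouter namespace."""
    return (info["device"].startswith("qg-")
            and info["namespace"].startswith("qrouter-"))


if __name__ == "__main__":
    sample = ("Failed sending gratuitous ARP to 172.24.5.4 on qg-c4da18e0-db "
              "in namespace qrouter-617bfb93-834e-4cf8-9f9d-521279f4f580")
    info = parse_garp_failure(sample)
    print(info, is_qg_in_qrouter(info))
```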

Changed in neutron:
status: New → Confirmed
importance: Undecided → High
Brian Haley (brian-haley) wrote :

I think I have a patch for this bug, maybe I didn't tag it correctly with the BZ #.

Brian Haley (brian-haley) wrote :

https://bugs.launchpad.net/neutron/+bug/1696893 is the bug I've been tracking the arping fix under. It is mostly just a cosmetic error since it's asynchronous.

Brian Haley (brian-haley) wrote :

Even with the above patches (bad arping arguments, arping error), we still have a failure:

http://logs.openstack.org/84/500384/18/check/gate-tempest-dsvm-neutron-dvr-multinode-scenario-ubuntu-xenial-nv/a427ec7/logs/testr_results.html.gz

Traceback (most recent call last):
  File "/opt/stack/new/neutron/neutron/tests/tempest/scenario/test_floatingip.py", line 139, in test_east_west
    self._test_east_west()
  File "/opt/stack/new/neutron/neutron/tests/tempest/scenario/test_floatingip.py", line 119, in _test_east_west
    dest_server['port']['fixed_ips'][0]['ip_address'])
  File "/opt/stack/new/neutron/neutron/tests/tempest/scenario/base.py", line 279, in check_remote_connectivity
    source, dest, should_succeed, nic, mtu, fragmentation))
  File "/opt/stack/new/neutron/neutron/tests/tempest/scenario/base.py", line 274, in _check_remote_connectivity
    1)
  File "tempest/lib/common/utils/test_utils.py", line 103, in call_until_true
    if func():
  File "/opt/stack/new/neutron/neutron/tests/tempest/scenario/base.py", line 259, in ping_remote
    fragmentation=fragmentation)
  File "/opt/stack/new/neutron/neutron/tests/tempest/scenario/base.py", line 254, in ping_host
    return source.exec_command(cmd)
  File "tempest/lib/common/ssh.py", line 151, in exec_command
    ssh = self._get_ssh_connection()
  File "tempest/lib/common/ssh.py", line 121, in _get_ssh_connection
    password=self.password)
tempest.lib.exceptions.SSHTimeout: Connection to the 172.24.5.6 via SSH timed out.
User: ubuntu, Password: None

Still need to track this down.

tags: added: gate-failure
Swaminathan Vasudevan (swaminathan-vasudevan) wrote :

I was able to reproduce this issue locally.

These tests fail randomly. On further debugging, here is what I could see in the two-node setup:

On Node 1 (Ubuntu-controller) there is one VM.
On Node 2 (Ubuntu-compute-new) there are two VMs.

Both VMs on Node 2 have a floating IP configured.
Here are the iptables NAT rules in the router namespace:

stack@ubuntu-compute-new:~/devstack$ sudo ip netns exec qrouter-6f01678c-64d6-4197-b09d-3285c46207ef bash
root@ubuntu-compute-new:~/devstack# iptables -t nat -S
-P PREROUTING ACCEPT
-P INPUT ACCEPT
-P OUTPUT ACCEPT
-P POSTROUTING ACCEPT
-N neutron-l3-agent-OUTPUT
-N neutron-l3-agent-POSTROUTING
-N neutron-l3-agent-PREROUTING
-N neutron-l3-agent-float-snat
-N neutron-l3-agent-snat
-N neutron-postrouting-bottom
-A PREROUTING -j neutron-l3-agent-PREROUTING
-A OUTPUT -j neutron-l3-agent-OUTPUT
-A POSTROUTING -j neutron-l3-agent-POSTROUTING
-A POSTROUTING -j neutron-postrouting-bottom
-A neutron-l3-agent-POSTROUTING ! -i rfp-6f01678c-6 ! -o rfp-6f01678c-6 -m conntrack ! --ctstate DNAT -j ACCEPT
-A neutron-l3-agent-PREROUTING -d 169.254.169.254/32 -i qr-+ -p tcp -m tcp --dport 80 -j REDIRECT --to-ports 9697
-A neutron-l3-agent-PREROUTING -d 192.168.100.100/32 -i rfp-6f01678c-6 -j DNAT --to-destination 10.0.0.13
-A neutron-l3-agent-PREROUTING -d 192.168.100.114/32 -i rfp-6f01678c-6 -j DNAT --to-destination 10.0.0.14
-A neutron-l3-agent-float-snat -s 10.0.0.13/32 -j SNAT --to-source 192.168.100.100
-A neutron-l3-agent-float-snat -s 10.0.0.14/32 -j SNAT --to-source 192.168.100.114
-A neutron-l3-agent-snat -j neutron-l3-agent-float-snat
-A neutron-postrouting-bottom -m comment --comment "Perform source NAT on outgoing traffic." -j neutron-l3-agent-snat
root@ubuntu-compute-new:~/devstack#

But what I see in the FIP namespace is that the fixed IP 10.0.0.13 shows up there as the source address when responding to a floating IP.

stack@ubuntu-compute-new:~$ sudo ip netns exec fip-5c94b420-0b1f-4025-864a-9209d8e7211f tcpdump -i any icmp
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on any, link-type LINUX_SLL (Linux cooked), capture size 262144 bytes
 ^C19:50:32.073635 IP 10.0.0.13 > 192.168.100.109: ICMP echo reply, id 54785, seq 0, length 64
19:50:35.578246 IP 10.0.0.13 > 192.168.100.109: ICMP echo reply, id 55553, seq 0, length 64
19:50:39.153168 IP 10.0.0.13 > 192.168.100.109: ICMP echo reply, id 56321, seq 0, length 64
19:50:42.790410 IP 10.0.0.13 > 192.168.100.109: ICMP echo reply, id 57089, seq 0, length 64
19:50:46.368505 IP 10.0.0.13 > 192.168.100.109: ICMP echo reply, id 57857, seq 0, length 64
19:50:49.982396 IP 10.0.0.13 > 192.168.100.109: ICMP echo reply, id 58625, seq 0, length 64
19:50:53.553890 IP 10.0.0.13 > 192.168.100.109: ICMP echo reply, id 59393, seq 0, length 64
19:50:57.005240 IP 10.0.0.13 > 192.168.100.109: ICMP echo reply, id 60161, seq 0, length 64
19:51:00.557693 IP 10.0.0.13 > 192.168.100.109: ICMP echo reply, id 60929, seq 0, length 64
19:51:04.045430 IP 10.0.0.13 > 192.168.100.109: ICMP echo reply, id 61697, seq 0, length 64
19:51:07.579294 IP 10.0.0.13 > 192.168.100.109: ICMP echo reply, id 62465, seq 0, length 64
19:51:11.229360 IP 10.0.0...

Swaminathan Vasudevan (swaminathan-vasudevan) wrote :

Also on the 'Node1', floatingIP is configured but the DNAT rule is missing in the router namespace.

stack@ubuntu-controller:~/devstack$ neutron floatingip-list
neutron CLI is deprecated and will be removed in the future. Use openstack CLI instead.
+--------------------------------------+----------------------------------+------------------+---------------------+--------------------------------------+
| id | tenant_id | fixed_ip_address | floating_ip_address | port_id |
+--------------------------------------+----------------------------------+------------------+---------------------+--------------------------------------+
| 0fd57315-51d7-4277-9835-e3aee82a5773 | 948bc6fadbbc4ca4ad4d223dcc76b9f1 | 10.0.0.3 | 192.168.100.104 | 9187cca2-a96f-495f-abf4-041de154fc95 |
| 5ad5be80-f720-47b0-a05e-4b309d192daf | 948bc6fadbbc4ca4ad4d223dcc76b9f1 | 10.0.0.13 | 192.168.100.100 | 95e78c3c-21a2-4d62-9fc9-ad5451ef73cd |
| 6fc89fb9-ffc7-438d-8320-23c44de2ab09 | 948bc6fadbbc4ca4ad4d223dcc76b9f1 | 10.0.0.14 | 192.168.100.114 | e4b5e14e-6625-4bbb-884c-36f94dbc609d |
+--------------------------------------+----------------------------------+------------------+---------------------+--------------------------------------+

root@ubuntu-controller:~/devstack# iptables -t nat -S
-P PREROUTING ACCEPT
-P INPUT ACCEPT
-P OUTPUT ACCEPT
-P POSTROUTING ACCEPT
-N neutron-l3-agent-OUTPUT
-N neutron-l3-agent-POSTROUTING
-N neutron-l3-agent-PREROUTING
-N neutron-l3-agent-float-snat
-N neutron-l3-agent-snat
-N neutron-postrouting-bottom
-A PREROUTING -j neutron-l3-agent-PREROUTING
-A OUTPUT -j neutron-l3-agent-OUTPUT
-A POSTROUTING -j neutron-l3-agent-POSTROUTING
-A POSTROUTING -j neutron-postrouting-bottom
-A neutron-l3-agent-POSTROUTING ! -i rfp-6f01678c-6 ! -o rfp-6f01678c-6 -m conntrack ! --ctstate DNAT -j ACCEPT
-A neutron-l3-agent-PREROUTING -d 169.254.169.254/32 -i qr-+ -p tcp -m tcp --dport 80 -j REDIRECT --to-ports 9697
-A neutron-l3-agent-snat -j neutron-l3-agent-float-snat
-A neutron-postrouting-bottom -m comment --comment "Perform source NAT on outgoing traffic." -j neutron-l3-agent-snat

There are no DNAT rules in the router namespace.
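
The comparison done by eye above (floating IP list versus DNAT rules) can be sketched as a small check. This is only an illustration: the function names are hypothetical, the rule format is taken from the iptables -t nat -S output in this report, and the caller is assumed to know which floating IPs should be hosted on the node being inspected.

```python
import re

# Matches rules such as:
# -A neutron-l3-agent-PREROUTING -d 192.168.100.100/32 -i rfp-... -j DNAT --to-destination 10.0.0.13
DNAT_RULE = re.compile(
    r"-d (?P<fip>\d+\.\d+\.\d+\.\d+)/32 .*-j DNAT "
    r"--to-destination (?P<fixed>\d+\.\d+\.\d+\.\d+)")


def dnat_map(iptables_lines):
    """Return {floating_ip: fixed_ip} for every DNAT rule found."""
    rules = {}
    for line in iptables_lines:
        match = DNAT_RULE.search(line)
        if match:
            rules[match.group("fip")] = match.group("fixed")
    return rules


def missing_dnat(expected, iptables_lines):
    """expected: {floating_ip: fixed_ip} that this node should translate."""
    present = dnat_map(iptables_lines)
    return sorted(fip for fip, fixed in expected.items()
                  if present.get(fip) != fixed)


if __name__ == "__main__":
    node1_rules = [
        "-A neutron-l3-agent-PREROUTING -d 169.254.169.254/32 -i qr-+ "
        "-p tcp -m tcp --dport 80 -j REDIRECT --to-ports 9697",
        "-A neutron-l3-agent-snat -j neutron-l3-agent-float-snat",
    ]
    print(missing_dnat({"192.168.100.104": "10.0.0.3"}, node1_rules))
```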

But the IP rule '54170: from 10.0.0.3 lookup 16' is defined:

root@ubuntu-controller:~/devstack# ip rule
0: from all lookup local
32766: from all lookup main
32767: from all lookup default
54170: from 10.0.0.3 lookup 16
167772161: from 10.0.0.1/28 lookup 167772161
root@ubuntu-controller:~/devstack#

The FIP namespace also has the routes required to route the traffic for the floating IP (192.168.100.104).

stack@ubuntu-controller:~/devstack$ sudo ip netns exec fip-5c94b420-0b1f-4025-864a-9209d8e7211f bash
root@ubuntu-controller:~/devstack# ifconfig
fg-687a771e-78 Link encap:Ethernet HWaddr fa:16:3e:81:97:2e
          inet addr:192.168.100.105 Bcast:192.168.100.255 Mask:255.255.255.0
          inet6 addr: fe80::f816:3eff:fe81:972e/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
          RX packets:290 errors:0 dropped:0 overruns:0 frame:0
          TX packets:114 er...

Swaminathan Vasudevan (swaminathan-vasudevan) wrote :

The 'odd' behavior here is

We do see that the DNAT rule is in place for the incoming packets.
-A neutron-l3-agent-PREROUTING -d 192.168.100.100/32 -i rfp-6f01678c-6 -j DNAT --to-destination 10.0.0.13

We do see that the float-Snat rule is in place for the outgoing packets.
-A neutron-l3-agent-float-snat -s 10.0.0.13/32 -j SNAT --to-source 192.168.100.100

But what I see in the FIP namespace is that the fixed IP 10.0.0.13 shows up there as the source address when responding to a floating IP. In theory, the neutron-l3-agent-float-snat rule above should have translated the source address 10.0.0.13 to 192.168.100.100, but that did not happen.

I am not sure why.

stack@ubuntu-compute-new:~$ sudo ip netns exec fip-5c94b420-0b1f-4025-864a-9209d8e7211f tcpdump -i any icmp
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on any, link-type LINUX_SLL (Linux cooked), capture size 262144 bytes
 ^C19:50:32.073635 IP 10.0.0.13 > 192.168.100.109: ICMP echo reply, id 54785, seq 0, length 64
19:50:35.578246 IP 10.0.0.13 > 192.168.100.109: ICMP echo reply, id 55553, seq 0, length 64
19:50:39.153168 IP 10.0.0.13 > 192.168.100.109: ICMP echo reply, id 56321, seq 0, length 64
19:50:42.790410 IP 10.0.0.13 > 192.168.100.109: ICMP echo reply, id 57089, seq 0, length 64
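
Spotting untranslated sources in a capture like this can be automated. The sketch below flags any packet whose source address still belongs to the tenant network. The function name is hypothetical, and the tenant CIDR 10.0.0.0/24 is an assumption chosen to cover the fixed IPs shown in this report.

```python
import ipaddress
import re

# Matches the "IP <src> > <dst>:" part of a tcpdump line.
PACKET = re.compile(
    r"IP (?P<src>\d+\.\d+\.\d+\.\d+) > (?P<dst>\d+\.\d+\.\d+\.\d+):")


def untranslated_sources(tcpdump_lines, tenant_cidr):
    """Return fixed IPs seen as packet sources, i.e. traffic that was not SNATed."""
    network = ipaddress.ip_network(tenant_cidr)
    leaked = set()
    for line in tcpdump_lines:
        match = PACKET.search(line)
        if match and ipaddress.ip_address(match.group("src")) in network:
            leaked.add(match.group("src"))
    return sorted(leaked)


if __name__ == "__main__":
    capture = [
        "19:50:35.578246 IP 10.0.0.13 > 192.168.100.109: ICMP echo reply, "
        "id 55553, seq 0, length 64",
    ]
    # 10.0.0.0/24 is an assumed tenant CIDR, not taken from the report.
    print(untranslated_sources(capture, "10.0.0.0/24"))
```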

Swaminathan Vasudevan (swaminathan-vasudevan) wrote :

I am running out of ideas on this bug.
Can anyone else take a look at it?

Changed in neutron:
assignee: nobody → Brian Haley (brian-haley)
Miguel Lavalle (minsel)
Changed in neutron:
assignee: Brian Haley (brian-haley) → Miguel Lavalle (minsel)
LIU Yulong (dragon889) wrote :

I think this is the root cause:
https://bugs.centos.org/view.php?id=11238
It's a kernel bug.

OpenStack Infra (hudson-openstack) wrote : Related fix merged to neutron (stable/pike)

Reviewed: https://review.openstack.org/600197
Committed: https://git.openstack.org/cgit/openstack/neutron/commit/?id=dbd6dbb5e19a269c17601a4038eae8eb050d182c
Submitter: Zuul
Branch: stable/pike

commit dbd6dbb5e19a269c17601a4038eae8eb050d182c
Author: Jakub Libosvar <email address hidden>
Date: Tue Oct 24 13:11:14 2017 +0000

    tests: Add decorator to mark unstable tests

    As it was agreed on Neutron CI meeting, we're going to mark unstable
    tests in fullstack suite with this decorator while working in paralel on
    stabilization of such tests.

    Mark the DVR east-west tests as unstable to prove it works.

    Conflicts:
        neutron/tests/tempest/scenario/base.py

    NOTE: This is a squash of the unstable decorator change and another
          to the neutron-tempest-plugin repo during the Queens cycle.

    Related-bug: #1717302

    Change-Id: I3beb6e7a4d96da778378e9d979cb8c6261f6036b
    (cherry picked from commit bdda46ade7f1f8a2742bcba6ea7556e3f059031f)
    (cherry picked from commit ba80045aabbdf5bbf66e39ed5aecad72eb3d86ef)

tags: added: in-stable-pike
Miguel Lavalle (minsel) wrote :

Note that in related bug https://bugs.launchpad.net/neutron/+bug/1793118, the submitter reports that:

1) Despite the error in the log file, data plane works correctly.
2) After executing sysctl -w net.ipv4.ip_nonlocal_bind=1 in the router name space, the error messages go away. I wonder if this patch is related to the issue: https://review.openstack.org/#/c/393886/
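
The workaround quoted in item 2 runs sysctl inside the router namespace. Below is a small sketch of how that command could be assembled and run from Python. The helper names and the injectable runner are hypothetical, running it for real requires root, and the namespace name in the example is taken from the first trace in this report.

```python
import subprocess


def nonlocal_bind_cmd(namespace, enable=True):
    """Build the 'ip netns exec <ns> sysctl -w ...' command as an argv list."""
    return ["ip", "netns", "exec", namespace,
            "sysctl", "-w", "net.ipv4.ip_nonlocal_bind=%d" % int(enable)]


def set_nonlocal_bind(namespace, enable=True, runner=subprocess.check_call):
    """Run the command; 'runner' is injectable so the call can be dry-run."""
    return runner(nonlocal_bind_cmd(namespace, enable))


if __name__ == "__main__":
    # Dry run: print the command instead of executing it.
    set_nonlocal_bind("qrouter-796b8715-ca01-43ad-bc08-f81a0b4db8cc",
                      runner=lambda argv: print(" ".join(argv)))
```

Note that the related bug only reports that this setting makes the error messages go away; it is not presented there as a fix for the failing tests.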

Miguel Lavalle (minsel) wrote :

I will bring this up in the next L3 sub-team meeting.

Gökhan (skylightcoder) wrote :

I think I found the problem: keepalived 1:1.2.24-1ubuntu0.16.04.1 is broken. If you downgrade it, it works properly. I ran sudo apt-get install --allow-downgrades keepalived=1:1.2.19-1 and it worked. We need to check the neutron side for keepalived 1:1.2.24-1ubuntu0.16.04.1.

Slawek Kaplonski (slaweq) wrote :

Hi Gökhan,

Speaking about keepalived we already had problems with this version in functional tests, see https://bugs.launchpad.net/neutron/+bug/1788185

I even reported a bug for keepalived: https://bugs.launchpad.net/ubuntu/+source/keepalived/+bug/1789045. Maybe you can update it with your findings as well?

I know that newer versions of keepalived also worked fine for the functional tests. The problem is only with the one specific version that you pointed out.

Gökhan (skylightcoder) wrote :

Hi Slawek,
Thanks for your explanation. You are right: the problem is only with this specific keepalived version, but unfortunately it is the stable version on Ubuntu Xenial. I will share my findings at https://bugs.launchpad.net/ubuntu/+source/keepalived/+bug/1789045

Changed in neutron:
assignee: Miguel Lavalle (minsel) → Slawek Kaplonski (slaweq)
Federico Ressi (fressi-redhat) wrote :

What about this bug? Is it safe to ignore floating IP tests results because of this bug? Shouldn't we skip these tests only in the configuration type we know this problem can happen?

OpenStack Infra (hudson-openstack) wrote : Related fix proposed to neutron-tempest-plugin (master)

Related fix proposed to branch: master
Review: https://review.openstack.org/618557

Slawek Kaplonski (slaweq) wrote :

I don't think this issue is related to keepalived, as keepalived is not used in the scenario where those tests are failing.
I deployed an environment similar to the dvr-multinode scenario job used in the gate and ran those tests.
Very often, one of the FIPs configured during the test does not respond to ping. What I found is that no route to such a FIP is configured in the fip-xxx namespace on the host where it should be. Even after restarting the L3 agent it is not there, so I think something is wrong on the server side rather than a race in the L3 agent.
Also, when I detached the broken FIP and attached it to the instance again, I saw something like this in the L3 agent's logs:

Nov 16 22:19:20 devstack-ubuntu-allinone neutron-l3-agent[25687]: ERROR neutron.agent.l3.router_info [-] L3 agent failure to setup floating IPs: ProcessExecutionError: Exit code: 1; Stdin: ; Stdout: ; Stderr: Cannot open network namespace "snat-d623a0c3-feed-4d29-8644-5b992d301eca": No such file or directory
Nov 16 22:19:20 devstack-ubuntu-allinone neutron-l3-agent[25687]: ERROR neutron.agent.l3.router_info Traceback (most recent call last):
Nov 16 22:19:20 devstack-ubuntu-allinone neutron-l3-agent[25687]: ERROR neutron.agent.l3.router_info File "/opt/stack/neutron/neutron/agent/l3/router_info.py", line 406, in configure_fip_addresses
Nov 16 22:19:20 devstack-ubuntu-allinone neutron-l3-agent[25687]: ERROR neutron.agent.l3.router_info return self.process_floating_ip_addresses(interface_name)
Nov 16 22:19:20 devstack-ubuntu-allinone neutron-l3-agent[25687]: ERROR neutron.agent.l3.router_info File "/opt/stack/neutron/neutron/agent/l3/router_info.py", line 349, in process_floating_ip_addresses
Nov 16 22:19:20 devstack-ubuntu-allinone neutron-l3-agent[25687]: ERROR neutron.agent.l3.router_info centralized_fip_cidrs = self.get_centralized_fip_cidr_set()
Nov 16 22:19:20 devstack-ubuntu-allinone neutron-l3-agent[25687]: ERROR neutron.agent.l3.router_info File "/opt/stack/neutron/neutron/agent/l3/dvr_edge_router.py", line 279, in get_centralized_fip_cidr_set
Nov 16 22:19:20 devstack-ubuntu-allinone neutron-l3-agent[25687]: ERROR neutron.agent.l3.router_info return set([addr['cidr'] for addr in device.addr.list()])
Nov 16 22:19:20 devstack-ubuntu-allinone neutron-l3-agent[25687]: ERROR neutron.agent.l3.router_info File "/opt/stack/neutron/neutron/agent/linux/ip_lib.py", line 672, in list
Nov 16 22:19:20 devstack-ubuntu-allinone neutron-l3-agent[25687]: ERROR neutron.agent.l3.router_info self.name, scope, to, filters, ip_version)
Nov 16 22:19:20 devstack-ubuntu-allinone neutron-l3-agent[25687]: ERROR neutron.agent.l3.router_info File "/opt/stack/neutron/neutron/agent/linux/ip_lib.py", line 637, in get_devices_with_ip
Nov 16 22:19:20 devstack-ubuntu-allinone neutron-l3-agent[25687]: ERROR neutron.agent.l3.router_info for line in self._run(options, tuple(args)).split('\n'):
Nov 16 22:19:20 devstack-ubuntu-allinone neutron-l3-agent[25687]: ERROR neutron.agent.l3.router_info File "/opt/stack/neutron/neutron/agent/linux/ip_lib.py", line 372, in _run
Nov 16 22:19:20 devstack-ubuntu-allinone neutron-l3-agent[25687]: ERROR neutron.agent.l3.router_info return self._parent._run(options, self.COMMAND, args)...

OpenStack Infra (hudson-openstack) wrote : Fix proposed to neutron (master)

Fix proposed to branch: master
Review: https://review.openstack.org/618750

Changed in neutron:
status: Confirmed → In Progress
OpenStack Infra (hudson-openstack) wrote : Related fix merged to neutron (master)

Reviewed: https://review.openstack.org/618750
Committed: https://git.openstack.org/cgit/openstack/neutron/commit/?id=7d0e1ccd34a473c511c9c2357421825539965e41
Submitter: Zuul
Branch: master

commit 7d0e1ccd34a473c511c9c2357421825539965e41
Author: Slawek Kaplonski <email address hidden>
Date: Mon Nov 19 14:31:17 2018 +0100

    Get centralized FIP only on router's snat host

    It may happen that L3 agent works in dvr_snat mode but
    it handles some router as "normal" dvr router because
    snat for this router is handled on other node.
    In such case we shouldn't try to get floating IPs cidrs
    from snat namespace as it doesn't exists on host.

    Change-Id: Ib27dc223fcca56030ebb528625cc927fc60553e1
    Related-Bug: #1717302
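
The guard described in this commit message can be illustrated with a simplified model. This is not the merged patch: EdgeRouterSketch and all of its attributes are hypothetical, and the snat namespace is modeled as a plain set of CIDRs.

```python
class EdgeRouterSketch(object):
    """Hypothetical model of a router handled by a dvr_snat L3 agent."""

    def __init__(self, agent_host, snat_host, snat_namespace_cidrs):
        self.agent_host = agent_host      # host this agent runs on
        self.snat_host = snat_host        # host that holds the snat namespace
        self._cidrs = set(snat_namespace_cidrs)

    def is_snat_host(self):
        return self.agent_host == self.snat_host

    def _list_snat_namespace_cidrs(self):
        # Stands in for listing addresses inside the snat namespace, which
        # fails when the namespace does not exist on this host.
        if not self.is_snat_host():
            raise RuntimeError("snat namespace does not exist on this host")
        return set(self._cidrs)

    def get_centralized_fip_cidr_set(self):
        # The guard: only look into the snat namespace where it lives.
        if not self.is_snat_host():
            return set()
        return self._list_snat_namespace_cidrs()


if __name__ == "__main__":
    remote = EdgeRouterSketch("node-1", "node-2", ["192.168.100.104/32"])
    local = EdgeRouterSketch("node-2", "node-2", ["192.168.100.104/32"])
    print(remote.get_centralized_fip_cidr_set())
    print(local.get_centralized_fip_cidr_set())
```

Without the guard, the first call would hit the same kind of "Cannot open network namespace" failure shown in the L3 agent log earlier in this report.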

OpenStack Infra (hudson-openstack) wrote : Related fix proposed to neutron (stable/rocky)

Related fix proposed to branch: stable/rocky
Review: https://review.openstack.org/620803

OpenStack Infra (hudson-openstack) wrote : Related fix proposed to neutron (stable/queens)

Related fix proposed to branch: stable/queens
Review: https://review.openstack.org/620804

OpenStack Infra (hudson-openstack) wrote : Related fix proposed to neutron (stable/pike)

Related fix proposed to branch: stable/pike
Review: https://review.openstack.org/620805

OpenStack Infra (hudson-openstack) wrote : Related fix merged to neutron (stable/queens)

Reviewed: https://review.openstack.org/620804
Committed: https://git.openstack.org/cgit/openstack/neutron/commit/?id=968dba2aaa8c89a57e0fd3f78a2e174bca1a9aa3
Submitter: Zuul
Branch: stable/queens

commit 968dba2aaa8c89a57e0fd3f78a2e174bca1a9aa3
Author: Slawek Kaplonski <email address hidden>
Date: Mon Nov 19 14:31:17 2018 +0100

    Get centralized FIP only on router's snat host

    It may happen that L3 agent works in dvr_snat mode but
    it handles some router as "normal" dvr router because
    snat for this router is handled on other node.
    In such case we shouldn't try to get floating IPs cidrs
    from snat namespace as it doesn't exists on host.

    Change-Id: Ib27dc223fcca56030ebb528625cc927fc60553e1
    Related-Bug: #1717302
    (cherry picked from commit 7d0e1ccd34a473c511c9c2357421825539965e41)

tags: added: in-stable-queens
OpenStack Infra (hudson-openstack) wrote : Related fix merged to neutron (stable/rocky)

Reviewed: https://review.openstack.org/620803
Committed: https://git.openstack.org/cgit/openstack/neutron/commit/?id=115a9f55583b1b762954b046947394b5a3032d52
Submitter: Zuul
Branch: stable/rocky

commit 115a9f55583b1b762954b046947394b5a3032d52
Author: Slawek Kaplonski <email address hidden>
Date: Mon Nov 19 14:31:17 2018 +0100

    Get centralized FIP only on router's snat host

    It may happen that L3 agent works in dvr_snat mode but
    it handles some router as "normal" dvr router because
    snat for this router is handled on other node.
    In such case we shouldn't try to get floating IPs cidrs
    from snat namespace as it doesn't exists on host.

    Change-Id: Ib27dc223fcca56030ebb528625cc927fc60553e1
    Related-Bug: #1717302
    (cherry picked from commit 7d0e1ccd34a473c511c9c2357421825539965e41)

tags: added: in-stable-rocky
tags: added: neutron-proactive-backport-potential
OpenStack Infra (hudson-openstack) wrote : Change abandoned on neutron-tempest-plugin (master)

Change abandoned by Slawek Kaplonski (<email address hidden>) on branch: master
Review: https://review.openstack.org/618557
Reason: It's not completely fixed yet for sure :(

OpenStack Infra (hudson-openstack) wrote : Related fix merged to neutron (stable/pike)

Reviewed: https://review.openstack.org/620805
Committed: https://git.openstack.org/cgit/openstack/neutron/commit/?id=6e3102b095413f05488868f7b0d3327bc70a5554
Submitter: Zuul
Branch: stable/pike

commit 6e3102b095413f05488868f7b0d3327bc70a5554
Author: Slawek Kaplonski <email address hidden>
Date: Mon Nov 19 14:31:17 2018 +0100

    Get centralized FIP only on router's snat host

    It may happen that L3 agent works in dvr_snat mode but
    it handles some router as "normal" dvr router because
    snat for this router is handled on other node.
    In such case we shouldn't try to get floating IPs cidrs
    from snat namespace as it doesn't exists on host.

    Change-Id: Ib27dc223fcca56030ebb528625cc927fc60553e1
    Related-Bug: #1717302
    (cherry picked from commit 7d0e1ccd34a473c511c9c2357421825539965e41)

Miguel Lavalle (minsel)
Changed in neutron:
status: In Progress → Fix Committed
Changed in neutron:
status: Fix Committed → Fix Released