Floating IP status failed to transition to DOWN in neutron-tempest-plugin-scenario-linuxbridge

Bug #1815585 reported by Slawek Kaplonski on 2019-02-12
10
This bug affects 1 person
Affects Status Importance Assigned to Milestone
neutron
High
Unassigned

Bug Description

From time to time in neutron-tempest-plugin-scenario-linuxbridge job test neutron_tempest_plugin.scenario.test_floatingip.FloatingIPPortDetailsTest.test_floatingip_port_details is failing because of error like:

2019-02-12 09:19:02.673764 | controller | Captured traceback:
2019-02-12 09:19:02.673899 | controller | ~~~~~~~~~~~~~~~~~~~
2019-02-12 09:19:02.674023 | controller | b'Traceback (most recent call last):'
2019-02-12 09:19:02.674267 | controller | b' File "/opt/stack/neutron-tempest-plugin/neutron_tempest_plugin/scenario/test_floatingip.py", line 247, in test_floatingip_port_details'
2019-02-12 09:19:02.674429 | controller | b" fip = self._wait_for_fip_port_down(fip['id'])"
2019-02-12 09:19:02.674664 | controller | b' File "/opt/stack/neutron-tempest-plugin/neutron_tempest_plugin/scenario/test_floatingip.py", line 317, in _wait_for_fip_port_down'
2019-02-12 09:19:02.674814 | controller | b' raise exceptions.TimeoutException(message)'
2019-02-12 09:19:02.674975 | controller | b'tempest.lib.exceptions.TimeoutException: Request timed out'
2019-02-12 09:19:02.675289 | controller | b'Details: Floating IP 5547d655-01e0-4f14-a375-fd805a689299 attached port status failed to transition to DOWN (current status ACTIVE) within the required time (120 s).'
2019-02-12 09:19:02.675390 | controller | b''

Example of failure: http://logs.openstack.org/91/635691/1/gate/neutron-tempest-plugin-scenario-linuxbridge/3cc34c4/job-output.txt.gz#_2019-02-12_09_19_02_673764

Logstash query: http://logstash.openstack.org/#dashboard/file/logstash.json?query=message%3A%5C%22attached%20port%20status%20failed%20to%20transition%20to%20DOWN%20(current%20status%20ACTIVE)%20within%20the%20required%20time%5C%22

Brian Haley (brian-haley) wrote :

The change in https://review.openstack.org/#/c/602694/ made this a little better by waiting, but I can confirm that this test still does fail occasionally.

Slawek Kaplonski (slaweq) wrote :

I did some initial investigation on this. It looks that nova-compute received information to detach interface and told Libvirt to detach it:
http://logs.openstack.org/91/635691/1/gate/neutron-tempest-plugin-scenario-linuxbridge/3cc34c4/controller/logs/screen-n-cpu.txt.gz#_Feb_12_09_16_39_511365

So neutron-linuxbridge-agent found this event and removed port here: http://logs.openstack.org/91/635691/1/gate/neutron-tempest-plugin-scenario-linuxbridge/3cc34c4/controller/logs/screen-q-agt.txt.gz#_Feb_12_09_16_39_951803

And there is also matched even in neutron-server to switch port to DOWN:
http://logs.openstack.org/91/635691/1/gate/neutron-tempest-plugin-scenario-linuxbridge/3cc34c4/controller/logs/screen-q-svc.txt.gz#_Feb_12_09_16_39_973353

According to that it looks that all was fine and port should be set to DOWN but for some reason port status in response from neutron looked like:

{"port":{"id":"7b6ccd25-cc09-41a3-ba4b-b23169819c11","name":"","network_id":"0976b2ae-c8c4-4d3c-a1df-bbbded993f28","tenant_id":"5ecab117503f46b7af1a7e6c3707c2e7","mac_address":"fa:16:3e:60:df:15","admin_state_up":true,"status":"ACTIVE","fixed_ips":[{"subnet_id":"ad2dc4a4-1b02-45f9-9042-60b5788cd2d9","ip_address":"10.1.0.12"}],"device_id":"","device_owner":"","allowed_address_pairs":[],"extra_dhcp_opts":[],"security_groups":["fd9de87a-7f94-4def-ae63-871b5f220abc"],"description":"","binding:vnic_type":"normal","port_security_enabled":true,"propagate_uplink_status":false,"qos_policy_id":null,"dns_name":"","dns_assignment":[{"ip_address":"10.1.0.12","hostname":"host-10-1-0-12","fqdn":"host-10-1-0-12.openstackgate.local."}],"dns_domain":"","ip_allocation":"immediate","tags":[],"created_at":"2019-02-12T09:15:16Z","updated_at":"2019-02-12T09:16:47Z","revision_number":9,"project_id":"5ecab117503f46b7af1a7e6c3707c2e7"}}

It is here:
http://logs.openstack.org/91/635691/1/gate/neutron-tempest-plugin-scenario-linuxbridge/3cc34c4/controller/logs/tempest_log.txt.gz#_2019-02-12_09_16_50_122

Please note that port has empty device_id but status ACTIVE

Slawek Kaplonski (slaweq) wrote :

According to last comment. I also checked that port status was actually changed from "ACTIVE to DOWN" because there was event to nova send here:
http://logs.openstack.org/91/635691/1/gate/neutron-tempest-plugin-scenario-linuxbridge/3cc34c4/controller/logs/screen-q-svc.txt.gz#_Feb_12_09_16_41_263557

And such event can be send only if port status was changed from ACTIVE to DOWN: https://github.com/openstack/neutron/blob/master/neutron/notifiers/nova.py#L194

So now question is, what switched this port to ACTIVE again?

Reviewed: https://review.openstack.org/639067
Committed: https://git.openstack.org/cgit/openstack/neutron-tempest-plugin/commit/?id=7451ad7fe0fede24c15e39ecd15e46dd6fbf4d86
Submitter: Zuul
Branch: master

commit 7451ad7fe0fede24c15e39ecd15e46dd6fbf4d86
Author: Slawek Kaplonski <email address hidden>
Date: Mon Feb 25 12:02:49 2019 +0100

    Log port details when FloatingIP port details test fails

    In case when port status is not transitioned to DOWN in
    test_floatingip.FloatingIPPortDetailsTest tests, log of
    all port info can be useful for further debugging.

    Change-Id: I0ba1efea68e6e53bd7df9bf84ecb3d73e95925ca
    Related-bug: #1815585

Slawek Kaplonski (slaweq) wrote :

It happened again. And indeed it confirmed what I wrote in my previous comments here. Port is ACTIVE but it's "unbound" and device_id is empty.

Now we need to find out why it's active in such case.

Slawek Kaplonski (slaweq) wrote :

Maybe it is somehow related to https://bugs.launchpad.net/neutron/+bug/1819446 but I will have to investigate that.

Reviewed: https://review.openstack.org/647816
Committed: https://git.openstack.org/cgit/openstack/neutron-tempest-plugin/commit/?id=5ddcf33322f1e257322369fff6416aa48c4214b9
Submitter: Zuul
Branch: master

commit 5ddcf33322f1e257322369fff6416aa48c4214b9
Author: Slawek Kaplonski <email address hidden>
Date: Tue Mar 26 18:00:58 2019 +0100

    Mark test_floatingip_port_details test as unstable

    Change-Id: I52f5552807c889c63a8583b2bfc124e9c89fca6f
    Related-Bug: #1815585

To post a comment you must log in.
This report contains Public information  Edit
Everyone can see this information.

Other bug subscribers