TestServerBasicOps: Test fails when deleting server and floating ip almost at the same time

Bug #1586931 reported by Monica Tian
14
This bug affects 3 people
Affects Status Importance Assigned to Milestone
OpenStack Compute (nova)
Fix Released
Medium
Ken'ichi Ohmichi
Liberty
Fix Released
Medium
Ken'ichi Ohmichi
Mitaka
Fix Released
Medium
Ken'ichi Ohmichi
OpenStack-Ansible
Fix Released
Undecided
Unassigned
neutron
Fix Released
Medium
Rossella Sblendido
tempest
Won't Fix
Undecided
Unassigned

Bug Description

In tempest.scenario.test_server_basic_ops.TestServerBasicOps.test_server_basic_ops, after last step: self.servers_client.delete_server(self.instance['id']) it doesn't wait for the server to be deleted, and then deletes the floating ip immediately in the clean up, this will cause faiure:

Here is the partial log:
2016-05-29 21:51:29.499 29791 INFO tempest.lib.common.rest_client [req-c3588ac4-21ca-47c3-bdb1-62088efd7a8b ] Request (TestServerBasicOps:test_server_basic_ops): 204 DELETE https://<url>:8774/v2/159886ce087a4f8fbfbcab14947d96b1/servers/6d44763b-ea79-4b5b-b57e-714191802c7c 0.465s
2016-05-29 21:51:29.499 29791 DEBUG tempest.lib.common.rest_client [req-c3588ac4-21ca-47c3-bdb1-62088efd7a8b ] Request - Headers: {'Content-Type': 'application/json', 'Accept': 'application/json', 'X-Auth-Token': '<omitted>'}
        Body: None
    Response - Headers: {'status': '204', 'content-length': '0', 'content-location': 'https://<url>:8774/v2/159886ce087a4f8fbfbcab14947d96b1/servers/6d44763b-ea79-4b5b-b57e-714191802c7c', 'date': 'Mon, 30 May 2016 02:51:29 GMT', 'x-compute-request-id': 'req-c3588ac4-21ca-47c3-bdb1-62088efd7a8b', 'content-type': 'application/json', 'connection': 'close'}
        Body: _log_request_full tempest/lib/common/rest_client.py:422
2016-05-29 21:51:30.410 29791 INFO tempest.lib.common.rest_client [req-db2323f5-3d58-4fd7-ae51-44f5525c6689 ] Request (TestServerBasicOps:_run_cleanups): 500 DELETE https://<url>:8774/v2/159886ce087a4f8fbfbcab14947d96b1/os-floating-ips/948912f6-ce03-4856-922b-59c4f16d3740 0.910s
2016-05-29 21:51:30.410 29791 DEBUG tempest.lib.common.rest_client [req-db2323f5-3d58-4fd7-ae51-44f5525c6689 ] Request - Headers: {'Content-Type': 'application/json', 'Accept': 'application/json', 'X-Auth-Token': '<omitted>'}
        Body: None
    Response - Headers: {'status': '500', 'content-length': '224', 'content-location': 'https://<url>:8774/v2/159886ce087a4f8fbfbcab14947d96b1/os-floating-ips/948912f6-ce03-4856-922b-59c4f16d3740', 'date': 'Mon, 30 May 2016 02:51:30 GMT', 'x-compute-request-id': 'req-db2323f5-3d58-4fd7-ae51-44f5525c6689', 'content-type': 'application/json; charset=UTF-8', 'connection': 'close'}
        Body: {"computeFault": {"message": "Unexpected API Error. Please report this at http://bugs.launchpad.net/nova/ and attach the Nova API log if possible.\n<class 'neutronclient.common.exceptions.PortNotFoundClient'>", "code": 500}}

Monica Tian (ytianyt)
Changed in tempest:
assignee: nobody → Monica Tian (ytianyt)
status: New → In Progress
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to tempest (master)

Fix proposed to branch: master
Review: https://review.openstack.org/322731

Monica Tian (ytianyt)
summary: - TestServerBasicOps: Test fails with redundant step of deleting server
+ TestServerBasicOps: Test fails when deleting server and floating ip
+ almost at the same time
description: updated
Revision history for this message
Ken'ichi Ohmichi (oomichi) wrote :

This seems a problem of nova side because any internal errors should not happen on nova side even if any requests are gotten from client side(tempest in this case)

Changed in tempest:
status: In Progress → Won't Fix
Changed in tempest:
status: Won't Fix → Incomplete
Changed in tempest:
status: Incomplete → In Progress
Revision history for this message
Takashi Natsume (natsume-takashi) wrote :

Need the version of Nova.
Please tell it.

Changed in nova:
status: New → Incomplete
tags: added: api neutron
Revision history for this message
Rossella Sblendido (rossella-o) wrote :
Download full text (4.1 KiB)

I hit this. I can reproduce it almost every time on my env using linuxbrige+vxlan. The nova trace is:

2016-07-02 16:20:52.670 9993 ERROR nova.api.openstack.extensions [req-bf41dac1-8fc0-4fd6-9a35-d754cea79057 9a0be6e4b8bf4cadb4a43401696fec19 48935f9a5ed84703973c70dd70859b7f - - -] Unexpected exception in API method
2016-07-02 16:20:52.670 9993 ERROR nova.api.openstack.extensions Traceback (most recent call last):
2016-07-02 16:20:52.670 9993 ERROR nova.api.openstack.extensions File "/usr/lib/python2.7/site-packages/nova/api/openstack/extensions.py", line 478, in wrapped
2016-07-02 16:20:52.670 9993 ERROR nova.api.openstack.extensions return f(*args, **kwargs)
2016-07-02 16:20:52.670 9993 ERROR nova.api.openstack.extensions File "/usr/lib/python2.7/site-packages/nova/api/openstack/compute/floating_ips.py", line 173, in delete
2016-07-02 16:20:52.670 9993 ERROR nova.api.openstack.extensions context, instance, floating_ip)
2016-07-02 16:20:52.670 9993 ERROR nova.api.openstack.extensions File "/usr/lib/python2.7/site-packages/nova/network/neutronv2/api.py", line 1527, in disassociate_and_release_floating_ip
2016-07-02 16:20:52.670 9993 ERROR nova.api.openstack.extensions raise_if_associated=False)
2016-07-02 16:20:52.670 9993 ERROR nova.api.openstack.extensions File "/usr/lib/python2.7/site-packages/nova/network/neutronv2/api.py", line 1536, in _release_floating_ip
2016-07-02 16:20:52.670 9993 ERROR nova.api.openstack.extensions client.delete_floatingip(fip['id'])
2016-07-02 16:20:52.670 9993 ERROR nova.api.openstack.extensions File "/usr/lib/python2.7/site-packages/neutronclient/v2_0/client.py", line 102, in with_params
2016-07-02 16:20:52.670 9993 ERROR nova.api.openstack.extensions ret = self.function(instance, *args, **kwargs)
2016-07-02 16:20:52.670 9993 ERROR nova.api.openstack.extensions File "/usr/lib/python2.7/site-packages/neutronclient/v2_0/client.py", line 751, in delete_floatingip
2016-07-02 16:20:52.670 9993 ERROR nova.api.openstack.extensions return self.delete(self.floatingip_path % (floatingip))
2016-07-02 16:20:52.670 9993 ERROR nova.api.openstack.extensions File "/usr/lib/python2.7/site-packages/neutronclient/v2_0/client.py", line 289, in delete
2016-07-02 16:20:52.670 9993 ERROR nova.api.openstack.extensions headers=headers, params=params)
2016-07-02 16:20:52.670 9993 ERROR nova.api.openstack.extensions File "/usr/lib/python2.7/site-packages/neutronclient/v2_0/client.py", line 270, in retry_request
2016-07-02 16:20:52.670 9993 ERROR nova.api.openstack.extensions headers=headers, params=params)
2016-07-02 16:20:52.670 9993 ERROR nova.api.openstack.extensions File "/usr/lib/python2.7/site-packages/neutronclient/v2_0/client.py", line 211, in do_request
2016-07-02 16:20:52.670 9993 ERROR nova.api.openstack.extensions self._handle_fault_response(status_code, replybody)
2016-07-02 16:20:52.670 9993 ERROR nova.api.openstack.extensions File "/usr/lib/python2.7/site-packages/neutronclient/v2_0/client.py", line 185, in _handle_fault_response
2016-07-02 16:20:52.670 9993 ERROR nova.api.openstack.extensions exception_handler_v20(status_code, des_error_body)
2016-07-02 16:20:52.6...

Read more...

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to neutron (master)

Fix proposed to branch: master
Review: https://review.openstack.org/336842

Changed in neutron:
assignee: nobody → Rossella Sblendido (rossella-o)
status: New → In Progress
Changed in neutron:
importance: Undecided → Medium
Revision history for this message
Jean-Philippe Evrard (jean-philippe-evrard) wrote :

I'd be happy if this bug fix is backported to mitaka.

tags: added: liberty-backport-potential mitaka-backport-potential
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to neutron (stable/mitaka)

Fix proposed to branch: stable/mitaka
Review: https://review.openstack.org/338184

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to neutron (stable/liberty)

Fix proposed to branch: stable/liberty
Review: https://review.openstack.org/338200

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to neutron (master)

Reviewed: https://review.openstack.org/336842
Committed: https://git.openstack.org/cgit/openstack/neutron/commit/?id=6e275e38575981386508718bf31e6f88b5a0887d
Submitter: Jenkins
Branch: master

commit 6e275e38575981386508718bf31e6f88b5a0887d
Author: rossella <email address hidden>
Date: Sat Jul 2 18:15:32 2016 +0000

    When deleting floating IP catch PortNotFound

    If we try to delete a VM and to delete the floating IP
    associated with the VM at the same time, depending
    on the order according to which these requests are processed
    Neutron might fail in the deletion of the floating IP,
    raising a PortNotFound error. This happens because Neutron
    notifies Nova of the network change event and it tries to get
    the port to which the FIP is associated. If the port is not there,
    Neutron shouldn't raise, it shouldn't send any notification.

    Change-Id: Ic72313ad1f787b3cb528e806c843f1fd01eb12f2
    Closes-bug: #1586931

Changed in neutron:
status: In Progress → Fix Released
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to neutron (stable/liberty)

Reviewed: https://review.openstack.org/338200
Committed: https://git.openstack.org/cgit/openstack/neutron/commit/?id=d39d45ab4b9e7f8974e6018ad40def0d79a9909f
Submitter: Jenkins
Branch: stable/liberty

commit d39d45ab4b9e7f8974e6018ad40def0d79a9909f
Author: rossella <email address hidden>
Date: Sat Jul 2 18:15:32 2016 +0000

    When deleting floating IP catch PortNotFound

    If we try to delete a VM and to delete the floating IP
    associated with the VM at the same time, depending
    on the order according to which these requests are processed
    Neutron might fail in the deletion of the floating IP,
    raising a PortNotFound error. This happens because Neutron
    notifies Nova of the network change event and it tries to get
    the port to which the FIP is associated. If the port is not there,
    Neutron shouldn't raise, it shouldn't send any notification.

    Conflicts:
     neutron/notifiers/nova.py
     neutron/tests/unit/notifiers/test_nova.py

    Closes-bug: #1586931
    Change-Id: Ic72313ad1f787b3cb528e806c843f1fd01eb12f2
    (cherry picked from commit 6e275e38575981386508718bf31e6f88b5a0887d)

tags: added: in-stable-liberty
tags: added: in-stable-mitaka
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to neutron (stable/mitaka)

Reviewed: https://review.openstack.org/338184
Committed: https://git.openstack.org/cgit/openstack/neutron/commit/?id=9e403b18f03362872571c1b595df9567a7c39e29
Submitter: Jenkins
Branch: stable/mitaka

commit 9e403b18f03362872571c1b595df9567a7c39e29
Author: rossella <email address hidden>
Date: Sat Jul 2 18:15:32 2016 +0000

    When deleting floating IP catch PortNotFound

    If we try to delete a VM and to delete the floating IP
    associated with the VM at the same time, depending
    on the order according to which these requests are processed
    Neutron might fail in the deletion of the floating IP,
    raising a PortNotFound error. This happens because Neutron
    notifies Nova of the network change event and it tries to get
    the port to which the FIP is associated. If the port is not there,
    Neutron shouldn't raise, it shouldn't send any notification.

    Conflicts:
     neutron/notifiers/nova.py
     neutron/tests/unit/notifiers/test_nova.py

    Closes-bug: #1586931
    Change-Id: Ic72313ad1f787b3cb528e806c843f1fd01eb12f2
    (cherry picked from commit 6e275e38575981386508718bf31e6f88b5a0887d)

Revision history for this message
Markus Zoeller (markus_z) (mzoeller) wrote :

> comment #3
tempest is always executed against master, IIUC. This means it's Nova master (Newton).

Changed in nova:
status: Incomplete → New
Revision history for this message
Doug Hellmann (doug-hellmann) wrote : Fix included in openstack/neutron 9.0.0.0b2

This issue was fixed in the openstack/neutron 9.0.0.0b2 development milestone.

Revision history for this message
Michael Gugino (gugino-michael) wrote :

We are seeing the 500 DELETE error from tempest as well.

As you can see here: http://logs.openstack.org/06/350806/4/check/gate-openstack-ansible-os_tempest-ansible-func-ubuntu-trusty/6a2c7cc/console.html#_2016-08-04_15_54_20_044407

We are deploying neutron from master.

Here is the exact error text: http://logs.openstack.org/06/350806/4/check/gate-openstack-ansible-os_tempest-ansible-func-ubuntu-trusty/6a2c7cc/console.html#_2016-08-04_16_06_59_878324

We have been seeing this issue pretty consistently for the last month+

The fix merged into neutron has had no effect.

Revision history for this message
Ken'ichi Ohmichi (oomichi) wrote :

This still seems due to Nova side because Nova should not return HTTP500 basically.

We start trying to fix this problem on nova side with the commit https://review.openstack.org/#/c/351960/

Changed in nova:
assignee: nobody → Ken'ichi Ohmichi (oomichi)
milestone: none → newton-3
Changed in nova:
status: New → In Progress
Revision history for this message
Michael Gugino (gugino-michael) wrote :

Thanks for looking into this. I think that your patch for nova is a good patch, nova should not produce 500 errors, but I think we're likely to just see the HTTPNotFound error in it's place, and that will likely cause a non-zero exit code for the tempest run. Will the HTTPNotFound error be handled by tempest?

Revision history for this message
Ken'ichi Ohmichi (oomichi) wrote :

Hi Michael,

Yeah, that is a nice point.
I am thinking this problem is due to the lack of error handling.
In cleanup of Tempest, Tempest checks the deletions of the created resources by receiving HTTP404(NotFound). However, now Nova returns HTTP500 and Tempest considers it as an error.
By handling this case rightly, I am hoping the test can pass.

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Related fix proposed to tempest (master)

Related fix proposed to branch: master
Review: https://review.openstack.org/356095

Matt Riedemann (mriedem)
Changed in nova:
importance: Undecided → Medium
Changed in tempest:
status: In Progress → Won't Fix
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to nova (master)

Reviewed: https://review.openstack.org/351960
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=e72826123bfd7c1d962b615da3f028b315ba3943
Submitter: Jenkins
Branch: master

commit e72826123bfd7c1d962b615da3f028b315ba3943
Author: Ken'ichi Ohmichi <email address hidden>
Date: Fri Aug 12 10:15:29 2016 -0700

    Return None in get_instance_id_by_floating_address

    _show_port() can raise a PortNotFound exception, but the method
    get_instance_id_by_floating_address() doesn't handle it. On the
    other hand, the method returns None if fip doesn't contain
    port_id as a normal case.
    On the caller side, "Delete a floating ip" API can use the returned
    value None to disassociate_and_release_floating_ip() and the method
    handles the None as a normal value.

    So this patch makes get_instance_id_by_floating_address return None
    if PortNotFound happens.

    Closes-Bug: #1586931

    Change-Id: I03be8100155d343eb6a4ea9eda3f1498ad3fb4cf

Changed in nova:
status: In Progress → Fix Released
Changed in openstack-ansible:
status: New → Fix Released
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to nova (stable/liberty)

Fix proposed to branch: stable/liberty
Review: https://review.openstack.org/358152

Revision history for this message
Doug Hellmann (doug-hellmann) wrote : Fix included in openstack/neutron 7.1.2

This issue was fixed in the openstack/neutron 7.1.2 release.

Revision history for this message
Doug Hellmann (doug-hellmann) wrote : Fix included in openstack/neutron 8.2.0

This issue was fixed in the openstack/neutron 8.2.0 release.

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/nova 14.0.0.0b3

This issue was fixed in the openstack/nova 14.0.0.0b3 development milestone.

Matt Riedemann (mriedem)
tags: removed: liberty-backport-potential mitaka-backport-potential
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to nova (stable/mitaka)

Reviewed: https://review.openstack.org/358153
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=daf281f95fa96aef72933280d873112b5e72807c
Submitter: Jenkins
Branch: stable/mitaka

commit daf281f95fa96aef72933280d873112b5e72807c
Author: Ken'ichi Ohmichi <email address hidden>
Date: Fri Aug 12 10:15:29 2016 -0700

    Return None in get_instance_id_by_floating_address

    _show_port() can raise a PortNotFound exception, but the method
    get_instance_id_by_floating_address() doesn't handle it. On the
    other hand, the method returns None if fip doesn't contain
    port_id as a normal case.
    On the caller side, "Delete a floating ip" API can use the returned
    value None to disassociate_and_release_floating_ip() and the method
    handles the None as a normal value.

    So this patch makes get_instance_id_by_floating_address return None
    if PortNotFound happens.

    Closes-Bug: #1586931

    Change-Id: I03be8100155d343eb6a4ea9eda3f1498ad3fb4cf
    (cherry picked from commit e72826123bfd7c1d962b615da3f028b315ba3943)

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Change abandoned on nova (stable/liberty)

Change abandoned by Ken'ichi Ohmichi (<email address hidden>) on branch: stable/liberty
Review: https://review.openstack.org/358152
Reason: Now liberty branch is closing to Phase 3(http://docs.openstack.org/project-team-guide/stable-branches.html#support-phases), so this patch is not worth to be backport-ed at this time.

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/nova 13.1.2

This issue was fixed in the openstack/nova 13.1.2 release.

Revision history for this message
chandan kumar (chkumar246) wrote :

Unassigning this bug as there is no activity from last 6 months. If you are still want to work on this bug, feel free to assign yourself.

Changed in tempest:
assignee: Monica Tian (ytianyt) → nobody
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Change abandoned on tempest (master)

Change abandoned by "Martin Kopec <email address hidden>" on branch: master
Review: https://review.opendev.org/c/openstack/tempest/+/322731
Reason: The associated bug is fixed, for tempest's side it's marked as Won't Fix and the latest comment in this review says the issue is fixed .. seems like this patch is not needed anymore, so i'm gonna abandon, feel free to restore and rebase if it's still needed

To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.