tempest fails due to neutron cache miss

Bug #1252849 reported by Bhuvan Arumugam
This bug affects 1 person
Affects                    Status         Importance  Assigned to  Milestone
OpenStack Compute (nova)   Fix Released   High        Aaron Rosen
Havana                     Fix Released   High        Aaron Rosen

Bug Description

This looks like a regression caused by the following commits:
  1. https://github.com/openstack/nova/commit/1957339df302e2da75e0dbe78b5d566194ab2c08
  2. https://github.com/openstack/nova/commit/651fac3d5d250d42e640c3ac113084bf0d2fa3b4

The above patches cause two issues:
  1. The tempest.api.compute.servers.test_server_actions tests fail, leaving the servers in ERROR state.
  2. Those servers cannot be deleted using nova delete.

Tempest and compute traceback here:
  http://pastebin.com/CVjG03eV

The patch disabled the cache refresh in the allocate_for_instance, allocate_port_for_instance and deallocate_port_for_instance methods. This breaks the nova boot process when servers are created by the above tempest test. However, we can still create servers manually through the nova boot API. The failure is likely because the cache refresh is disabled while allocating the instance.

The servers created by the above test are left in ERROR state and we are unable to delete them. This is likely because the cache refresh is disabled while deallocating the instance and/or port.

NOTE:
If we restore the @refresh_sync decorator on those methods and do not use the decorator in _get_instance_nw_info, the above tempest test succeeds. I have not tested this when deleting the VM.
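
As a rough illustration of the pattern discussed above, the following minimal sketch shows how a decorator can refresh a cache after a method runs, and what happens when a method lacks it. Everything here (refresh_after, FakeNetworkAPI, the port naming) is hypothetical and is not nova's actual implementation.

```python
import functools


def refresh_after(func):
    """Hypothetical decorator: store the wrapped method's result in self.cache."""
    @functools.wraps(func)
    def wrapper(self, instance_id, *args, **kwargs):
        result = func(self, instance_id, *args, **kwargs)
        # Refresh the cached view so later readers see the new state.
        self.cache[instance_id] = result
        return result
    return wrapper


class FakeNetworkAPI:
    """Hypothetical stand-in for a network API with a per-instance cache."""

    def __init__(self):
        self.cache = {}   # cached network info, keyed by instance id
        self.ports = {}   # the "real" backend state

    def _add_port(self, instance_id):
        ports = self.ports.setdefault(instance_id, [])
        ports.append("port-%d" % len(ports))
        return list(ports)

    @refresh_after
    def allocate_for_instance(self, instance_id):
        # Decorated: the cache is refreshed after the allocation.
        return self._add_port(instance_id)

    def allocate_without_refresh(self, instance_id):
        # Undecorated: the backend changes but the cache goes stale.
        return self._add_port(instance_id)


api = FakeNetworkAPI()
api.allocate_for_instance("vm1")
api.allocate_without_refresh("vm1")
print(api.ports["vm1"], api.cache["vm1"])
```

In this sketch the second allocation changes the backend state but leaves the cache at its earlier value, which is the kind of stale read that can surface later as a cache miss.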

Bhuvan Arumugam (bhuvan)
description: updated
Aaron Rosen (arosen)
Changed in nova:
assignee: nobody → Aaron Rosen (arosen)
Aaron Rosen (arosen) wrote :

From the nova-compute exception, the error seems to occur in python-neutronclient:

  File "/usr/local/csi/share/csi-nova.venv/lib/python2.6/site-packages/nova/network/neutronv2/api.py", line 136, in _get_available_networks
    nets = neutron.list_networks(**search_opts).get('networks', [])
  File "/usr/local/csi/share/csi-nova.venv/lib/python2.6/site-packages/neutronclient/v2_0/client.py", line 108, in with_params
    ret = self.function(instance, *args, **kwargs)
  File "/usr/local/csi/share/csi-nova.venv/lib/python2.6/site-packages/neutronclient/v2_0/client.py", line 325, in list_networks
    **_params)
  File "/usr/local/csi/share/csi-nova.venv/lib/python2.6/site-packages/neutronclient/v2_0/client.py", line 1198, in list
    res.extend(r[collection])
KeyError: 'networks'

Is there any chance we could also get the neutron-server logs? I'm still searching for what caused this though.
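
To make the failure mode in the traceback concrete, here is a minimal sketch of how indexing a response body with r[collection] raises KeyError when the body lacks the expected key, for example when the server returns an empty response. The helper names are hypothetical, and this is a simplified stand-in for the loop shown in the traceback, not the neutronclient code itself.

```python
def collect_strict(pages, collection):
    """Mirror the res.extend(r[collection]) pattern from the traceback."""
    res = []
    for r in pages:
        res.extend(r[collection])  # KeyError if the key is missing
    return res


def collect_tolerant(pages, collection):
    """Hypothetical variant that treats a missing key as an empty list."""
    res = []
    for r in pages:
        res.extend(r.get(collection, []))
    return res


empty_response = [{}]  # assumed shape of an empty response body

try:
    collect_strict(empty_response, "networks")
except KeyError as exc:
    print("strict lookup failed with KeyError:", exc)

print(collect_tolerant(empty_response, "networks"))
```

The tolerant variant only hides the symptom; whether an empty body should be treated as "no networks" or as an error is a separate design question.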

OpenStack Infra (hudson-openstack) wrote : Related fix proposed to nova (master)

Related fix proposed to branch: master
Review: https://review.openstack.org/57711

OpenStack Infra (hudson-openstack) wrote :

Related fix proposed to branch: master
Review: https://review.openstack.org/57921

Changed in nova:
status: New → In Progress
importance: Undecided → High
Bhuvan Arumugam (bhuvan) wrote :

I didn't have the old setup.

I set up this environment by installing a new csi-nova and could easily reproduce the issue. Unfortunately, the neutron log doesn't show any errors. This time the test_servers test failed. The following pastebin has the tempest error, the compute error and the neutron log (no error). It looks like the /networks.json call returns an empty response.

  http://paste.openstack.org/show/53825/

Aaron Rosen (arosen) wrote :

Hi Bhuvan,

I think we have this fixed now, but the changes haven't merged yet. If you get a chance, would you mind running:

git fetch https://review.openstack.org/openstack/nova refs/changes/11/57711/2 && git checkout FETCH_HEAD

from your nova git repo to get the series of patches that fix this issue and see if the test passes now?

OpenStack Infra (hudson-openstack) wrote : Related fix merged to nova (master)

Reviewed: https://review.openstack.org/57711
Committed: http://github.com/openstack/nova/commit/4c03383f21bc13caf3fed4db5aa1317d37758d5c
Submitter: Jenkins
Branch: master

commit 4c03383f21bc13caf3fed4db5aa1317d37758d5c
Author: Aaron Rosen <email address hidden>
Date: Thu Nov 21 07:57:44 2013 -0800

    Do not hide exception in update_instance_cache_with_nw_info

    From time to time an exception is raised in this method causing
    the nw_info cache not to be saved. If this occurs we should raise
    as this error will cause later errors to occur. For example, one
    won't be able to associate a floatingip with the instance as there
    is no nw_info found in this table. In addition, the fixed_ips on
    the instance won't be returned via the api.

    This patch also stubs out update_instance_cache_with_nw_info in a
    few tests where an exception was being raised previously but went
    unnoticed as it was not reraised but now is.

    Related-Bug: #1252849
    Related-Bug: #1249065

    Change-Id: Ic860f72210ba736e11c10df21c4cb7625e9c0928
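
The difference this commit message describes, between hiding an exception and re-raising it, can be sketched as follows. The function names and the save callable are hypothetical and only illustrate the general pattern; they are not the nova code changed by this commit.

```python
import logging

LOG = logging.getLogger(__name__)


def save_cache_hiding_errors(save, instance_id, nw_info):
    """Log a failure and carry on; the caller never learns the save failed."""
    try:
        save(instance_id, nw_info)
    except Exception:
        LOG.exception("Failed storing info cache for %s", instance_id)


def save_cache_reraising(save, instance_id, nw_info):
    """Log a failure and re-raise it so the caller can react."""
    try:
        save(instance_id, nw_info)
    except Exception:
        LOG.exception("Failed storing info cache for %s", instance_id)
        raise


def failing_save(instance_id, nw_info):
    # Hypothetical backend that always fails, to exercise both paths.
    raise RuntimeError("database unavailable")


save_cache_hiding_errors(failing_save, "vm1", [])   # returns silently

try:
    save_cache_reraising(failing_save, "vm1", [])
except RuntimeError as exc:
    print("caller sees the failure:", exc)
```

With the first variant the cache is silently left unsaved and the problem only shows up in later operations; with the second, the failure is visible at the point where it happens.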

Bhuvan Arumugam (bhuvan) wrote :

Aaron, thank you for the patch.

With the newer fixes, we no longer face this issue. The tempest run is successful, and we can create new instances.

Changed in nova:
status: In Progress → Fix Committed
Aaron Rosen (arosen) wrote :

Awesome, thanks for confirming, Bhuvan!

Changed in nova:
milestone: none → icehouse-1
Thierry Carrez (ttx)
Changed in nova:
status: Fix Committed → Fix Released
OpenStack Infra (hudson-openstack) wrote : Related fix proposed to nova (stable/havana)

Related fix proposed to branch: stable/havana
Review: https://review.openstack.org/73202

OpenStack Infra (hudson-openstack) wrote : Related fix merged to nova (stable/havana)

Reviewed: https://review.openstack.org/73202
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=53acc09fb9b3ffe7c69bdc484c8cf56032182e28
Submitter: Jenkins
Branch: stable/havana

commit 53acc09fb9b3ffe7c69bdc484c8cf56032182e28
Author: Aaron Rosen <email address hidden>
Date: Thu Nov 21 07:57:44 2013 -0800

    Do not hide exception in update_instance_cache_with_nw_info

    From time to time an exception is raised in this method causing
    the nw_info cache not to be saved. If this occurs we should raise
    as this error will cause later errors to occur. For example, one
    won't be able to associate a floatingip with the instance as there
    is no nw_info found in this table. In addition, the fixed_ips on
    the instance won't be returned via the api.

    This patch also stubs out update_instance_cache_with_nw_info in a
    few tests where an exception was being raised previously but went
    unnoticed as it was not reraised but now is.

    Related-Bug: #1252849
    Related-Bug: #1249065

    Change-Id: Ic860f72210ba736e11c10df21c4cb7625e9c0928
    (cherry picked from commit 4c03383f21bc13caf3fed4db5aa1317d37758d5c)

tags: added: in-stable-havana
Thierry Carrez (ttx)
Changed in nova:
milestone: icehouse-1 → 2014.1