Resource has ERROR status: Exceeded maximum number of retries

Bug #1598204 reported by Andrew Kalach
This bug affects 1 person
Affects: Mirantis OpenStack
Status: Invalid
Importance: High
Assigned to: Andrew Kalach
Milestone: (none listed)

Bug Description

Description
===========
During the 9.0RC2 Certification Testing Suite run, the following tasks:

  NovaSecGroup.boot_and_delete_server_with_secgroups
  NovaServers.boot_server_from_volume_and_delete

failed with error:

Resource <Server: s_rally_02cfda0f_OtKNtNS8> has ERROR status. Fault: {u'message': u'Exceeded maximum number of retries. Exceeded max scheduling attempts 3 for instance 14ccca5a-e257-46f2-b182-eb5357c109e7. Last exception: Binding failed for port 1802100d-0262-4597-ba41-f0d32e213dbd, please check neutron logs for more information.', u'code': 500, u'created': u'2016-06-28T01:58:42Z'}

Error task log (NovaSecGroup.boot_and_delete_server_with_secgroups):

2016-06-28 01:58:17.619 30929 DEBUG novaclient.v2.client [-] GET call to compute for http://172.16.184.3:8774/v2.1/servers/e3a9f1fd-d7a9-43af-a911-903297545f4e used request id req-33956e34-3b21-4f97-a0ae-fc975f9cc6b5 _log_request_id /opt/stack/.venv/lib/python2.7/site-packages/novaclient/client.py:85
2016-06-28 01:58:17.623 30929 ERROR rally.task.runner [-] Resource <Server: s_rally_02cfda0f_nOe66YVK> has ERROR status.
Fault: {u'message': u'Exceeded maximum number of retries. Exceeded max scheduling attempts 3 for instance e3a9f1fd-d7a9-43af-a911-903297545f4e. Last exception: Binding failed for port c3c6e915-3395-4d6f-90bf-02a565da4146, please check neutron logs for more information.', u'code': 500, u'created': u'2016-06-28T01:58:16Z'}
2016-06-28 01:58:17.623 30929 ERROR rally.task.runner Traceback (most recent call last):
2016-06-28 01:58:17.623 30929 ERROR rally.task.runner File "/opt/stack/.venv/lib/python2.7/site-packages/rally/task/runner.py", line 66, in _run_scenario_once
2016-06-28 01:58:17.623 30929 ERROR rally.task.runner deprecated_output = getattr(scenario_inst, method_name)(**kwargs)
2016-06-28 01:58:17.623 30929 ERROR rally.task.runner File "/opt/stack/.venv/lib/python2.7/site-packages/rally/plugins/openstack/scenarios/nova/security_group.py", line 137, in boot_and_delete_server_with_secgroups
2016-06-28 01:58:17.623 30929 ERROR rally.task.runner **kwargs)
2016-06-28 01:58:17.623 30929 ERROR rally.task.runner File "/opt/stack/.venv/lib/python2.7/site-packages/rally/task/atomic.py", line 84, in func_atomic_actions
2016-06-28 01:58:17.623 30929 ERROR rally.task.runner f = func(self, *args, **kwargs)
2016-06-28 01:58:17.623 30929 ERROR rally.task.runner File "/opt/stack/.venv/lib/python2.7/site-packages/rally/plugins/openstack/scenarios/nova/utils.py", line 147, in _boot_server
2016-06-28 01:58:17.623 30929 ERROR rally.task.runner check_interval=CONF.benchmark.nova_server_boot_poll_interval
2016-06-28 01:58:17.623 30929 ERROR rally.task.runner File "/opt/stack/.venv/lib/python2.7/site-packages/rally/common/logging.py", line 236, in wrapper
2016-06-28 01:58:17.623 30929 ERROR rally.task.runner return f(*args, **kwargs)
2016-06-28 01:58:17.623 30929 ERROR rally.task.runner File "/opt/stack/.venv/lib/python2.7/site-packages/rally/task/utils.py", line 147, in wait_for
2016-06-28 01:58:17.623 30929 ERROR rally.task.runner check_interval=check_interval)
2016-06-28 01:58:17.623 30929 ERROR rally.task.runner File "/opt/stack/.venv/lib/python2.7/site-packages/rally/task/utils.py", line 211, in wait_for_status
2016-06-28 01:58:17.623 30929 ERROR rally.task.runner resource = update_resource(resource)
2016-06-28 01:58:17.623 30929 ERROR rally.task.runner File "/opt/stack/.venv/lib/python2.7/site-packages/rally/task/utils.py", line 90, in _get_from_manager
2016-06-28 01:58:17.623 30929 ERROR rally.task.runner fault=getattr(res, "fault", "n/a"))
2016-06-28 01:58:17.623 30929 ERROR rally.task.runner GetResourceErrorStatus: Resource <Server: s_rally_02cfda0f_nOe66YVK> has ERROR status.
2016-06-28 01:58:17.623 30929 ERROR rally.task.runner Fault: {u'message': u'Exceeded maximum number of retries. Exceeded max scheduling attempts 3 for instance e3a9f1fd-d7a9-43af-a911-903297545f4e. Last exception: Binding failed for port c3c6e915-3395-4d6f-90bf-02a565da4146, please check neutron logs for more information.', u'code': 500, u'created': u'2016-06-28T01:58:16Z'}
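
For triage, the instance and port IDs can be pulled out of the fault text so they can be searched for in the Nova and Neutron logs. The following is a minimal sketch, assuming Python 3 and assuming the fault message keeps the exact wording quoted above; `parse_fault` and `FAULT_RE` are illustrative names, not part of Rally or Nova.

```python
import re

# Pattern built from the fault text quoted in this report (assumption:
# other Nova releases may word the message differently).
FAULT_RE = re.compile(
    r"Exceeded max scheduling attempts (?P<attempts>\d+) "
    r"for instance (?P<instance>[0-9a-f-]{36})\. "
    r"Last exception: Binding failed for port (?P<port>[0-9a-f-]{36})"
)


def parse_fault(message):
    """Return attempts, instance UUID and port UUID, or None if no match."""
    match = FAULT_RE.search(message)
    if match is None:
        return None
    return {
        "attempts": int(match.group("attempts")),
        "instance": match.group("instance"),
        "port": match.group("port"),
    }


# Fault message copied from the task log above.
message = (
    "Exceeded maximum number of retries. Exceeded max scheduling attempts 3 "
    "for instance e3a9f1fd-d7a9-43af-a911-903297545f4e. Last exception: "
    "Binding failed for port c3c6e915-3395-4d6f-90bf-02a565da4146, please "
    "check neutron logs for more information."
)
print(parse_fault(message))
```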

Steps to reproduce
==================
Install Rally on the master node and run the task "NovaSecGroup.boot_and_delete_server_with_secgroups":

1. cd /opt/stack
2. source .venv/bin/activate
3. rally task start /opt/stack/rally-scenarios/nova/boot_and_delete_server_with_secgroups.yaml

Expected result
===============
The task runs without errors.

Actual result
=============
The task fails with the errors shown in the logs above.

Environment
===========
Scale 200-node lab (ENV-10):
  3 controllers
  199 compute nodes
  20 ceph nodes
  DVR enabled
  VXLAN
Image RC2 build:
  fuel-9.0-mos-495-2016-06-16_18-18-00.iso

Diagnostic snapshot for more detailed information:
  http://mos-scale-share.mirantis.com/fuel-snapshot-2016-06-28_13-13-50.tar.gz

Tags: area-nova
Andrew Kalach (akndex) wrote :
description: updated
Dina Belova (dbelova)
tags: added: area-nova
Changed in mos:
assignee: nobody → MOS Nova (mos-nova)
milestone: none → 9.1
status: New → Confirmed
Roman Podoliaka (rpodolyaka) wrote :

Unfortunately, the snapshot seems to be broken :(

fuel-snapshot-2016-06-28_13-13-50/fuel.domain.tld/var/log/remote/node-96.domain.tld/nova-compute.log

gzip: stdin: unexpected end of file
tar: Unexpected EOF in archive
tar: Unexpected EOF in archive
tar: Error is not recoverable: exiting now
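
A truncated snapshot like this can be detected before it is attached to a report. The following is a rough sketch, assuming Python 3 and a gzip-compressed tar archive; `archive_is_complete` is a hypothetical helper, not a Fuel or Rally function. It reads every member to the end, so a cut-off archive fails the check.

```python
import tarfile
import zlib


def archive_is_complete(path):
    """Return True if every member of a .tar.gz can be read to the end."""
    try:
        with tarfile.open(path, "r:gz") as archive:
            for member in archive:
                if member.isfile():
                    fileobj = archive.extractfile(member)
                    # Read in 1 MiB chunks so large logs are not held in memory.
                    while fileobj.read(1 << 20):
                        pass
        return True
    except (tarfile.TarError, EOFError, OSError, zlib.error):
        # A truncated or corrupt archive surfaces as one of these errors.
        return False
```

Running it against the downloaded file, for example `archive_is_complete("fuel-snapshot-2016-06-28_13-13-50.tar.gz")`, would be expected to return False for an archive that ends unexpectedly.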

The existing logs only say that port binding has failed:

fuel-snapshot-2016-06-28_13-13-50/node-193/var/log/nova/nova-compute.log.1:2016-06-28 01:58:39.995 21569 DEBUG keystoneauth.session [req-40883a9c-af4d-48a2-8823-8e4a1a5f6c5f 54cb49fe136743ed8d22df7dc8fab797 e1350af239d24a7ebe9d3aac94faf135 - - -] REQ: curl -g -i -X DELETE http://192.168.0.2:9696/v2.0/ports/1802100d-0262-4597-ba41-f0d32e213dbd.json -H "User-Agent: python-neutronclient" -H "Accept: application/json" -H "X-Auth-Token: {SHA1}f9af27c8bb1cae2f1fd0a3624f165d62f40b552e" _http_log_request /usr/lib/python2.7/dist-packages/keystoneauth1/session.py:248
fuel-snapshot-2016-06-28_13-13-50/node-193/var/log/nova/nova-compute.log.1:2016-06-28 01:58:40.679 21569 ERROR nova.compute.manager PortBindingFailed: Binding failed for port 1802100d-0262-4597-ba41-f0d32e213dbd, please check neutron logs for more information.
fuel-snapshot-2016-06-28_13-13-50/node-193/var/log/nova/nova-compute.log.1:2016-06-28 01:58:40.680 21569 ERROR nova.compute.manager [instance: 14ccca5a-e257-46f2-b182-eb5357c109e7] PortBindingFailed: Binding failed for port 1802100d-0262-4597-ba41-f0d32e213dbd, please check neutron logs for more information.

but do not specify the reason.
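
A search of this kind across whatever part of the snapshot did unpack can be sketched as follows. This assumes Python 3 and plain-text log files; `find_lines` is an illustrative helper name, and compressed rotated logs would need to be decompressed first.

```python
import os


def find_lines(root, needle):
    """Yield (path, line) for every line under root that contains needle."""
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in sorted(filenames):
            path = os.path.join(dirpath, name)
            try:
                with open(path, "r", errors="replace") as handle:
                    for line in handle:
                        if needle in line:
                            yield path, line.rstrip("\n")
            except OSError:
                continue  # unreadable file, skip it


if __name__ == "__main__":
    # Snapshot directory and port ID taken from this report.
    snapshot = "fuel-snapshot-2016-06-28_13-13-50"
    port_id = "1802100d-0262-4597-ba41-f0d32e213dbd"
    for path, line in find_lines(snapshot, port_id):
        print(path + ":" + line)
```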

The only Neutron logs I found are for deletion of the corresponding port:

fuel-snapshot-2016-06-28_13-13-50/node-107/var/log/neutron/neutron-openvswitch-agent.log.1:2016-06-28 01:58:40.581 31035 DEBUG neutron.plugins.ml2.drivers.openvswitch.agent.ovs_neutron_agent [req-0fa0e77c-27e1-4cab-97ef-1482d1ac3284 4fd5db6f920b4499bfd4b757e9c0d468 fe46a113993941c5b0f7a2d6747fe417 - - -] port_delete message processed for port 1802100d-0262-4597-ba41-f0d32e213dbd port_delete /usr/lib/python2.7/dist-packages/neutron/plugins/ml2/drivers/openvswitch/agent/ovs_neutron_agent.py:413
fuel-snapshot-2016-06-28_13-13-50/node-107/var/log/neutron/neutron-openvswitch-agent.log.1:2016-06-28 01:58:42.232 31035 DEBUG neutron.agent.linux.utils [req-d5b2a0bc-3693-45d1-aafb-709d671b292c - - - - -] Running command: ['sudo', 'neutron-rootwrap', '/etc/neutron/rootwrap.conf', 'ovs-vsctl', '--timeout=10', '--oneline', '--format=json', '--', '--columns=external_ids,name,ofport', 'find', 'Interface', 'external_ids:iface-id=1802100d-0262-4597-ba41-f0d32e213dbd', 'external_ids:attached-mac!=""'] create_process /usr/lib/python2.7/dist-packages/neutron/agent/linux/utils.py:84
fuel-snapshot-2016-06-28_13-13-50/node-107/var/log/neutron/neutron-openvswitch-agent.log.1:2016-06-28 01:58:42.310 31035 INFO neutron.agent.common.ovs_lib [req-d5b2a0bc-3693-45d1-aafb-709d671b292c - - - - -] Port 1802100d-0262-4597-ba41-f0d32e213dbd not present in bridge br-int
fuel-snapshot-2016-06-28_13-13-50/node-107/var/log/neutron/neutron-openvswitch-agent.log.1:2016-06-28 01:58:42.468 31035 INFO neutron.agent.securitygroups_rpc [req-d5b2a0bc-3693-45d1-aafb-709d671b292c - - - - -] Remove device filter for [u'18021...

Changed in mos:
status: Confirmed → Incomplete
assignee: MOS Nova (mos-nova) → Andrew Kalach (akndex)
Denis Meltsaykin (dmeltsaykin) wrote :

Moving to Invalid after a month in Incomplete.

Changed in mos:
status: Incomplete → Invalid