"Exception: Port tapcea51630-e1 is not ready, resync needed" spamming neutron agent logs

Bug #1514935 reported by Matt Riedemann
This bug affects 2 people
Affects    Status         Importance   Assigned to    Milestone
neutron    Fix Released   High         Carl Baldwin

Bug Description

Due to this change:

https://review.openstack.org/#/c/164880/

We see a ton of these errors in the neutron agent logs in gate runs:

http://logstash.openstack.org/#dashboard/file/logstash.json?query=message:%5C%22Exception:%20Port%5C%22%20AND%20message:%5C%22is%20not%20ready,%20resync%20needed%5C%22%20AND%20tags:%5C%22screen-q-agt.txt%5C%22

http://logs.openstack.org/85/239885/2/gate/gate-tempest-dsvm-neutron-full/602d864/logs/screen-q-agt.txt.gz?level=TRACE#_2015-11-10_17_09_45_965

2015-11-10 17:09:45.965 ERROR neutron.plugins.ml2.drivers.openvswitch.agent.ovs_neutron_agent [req-15a73753-1512-4689-9404-9658a0cd0c09 None None] Error while processing VIF ports
2015-11-10 17:09:45.965 27715 ERROR neutron.plugins.ml2.drivers.openvswitch.agent.ovs_neutron_agent Traceback (most recent call last):
2015-11-10 17:09:45.965 27715 ERROR neutron.plugins.ml2.drivers.openvswitch.agent.ovs_neutron_agent File "/opt/stack/new/neutron/neutron/plugins/ml2/drivers/openvswitch/agent/ovs_neutron_agent.py", line 1868, in rpc_loop
2015-11-10 17:09:45.965 27715 ERROR neutron.plugins.ml2.drivers.openvswitch.agent.ovs_neutron_agent consecutive_resyncs)
2015-11-10 17:09:45.965 27715 ERROR neutron.plugins.ml2.drivers.openvswitch.agent.ovs_neutron_agent File "/opt/stack/new/neutron/neutron/plugins/ml2/drivers/openvswitch/agent/ovs_neutron_agent.py", line 1791, in process_port_info
2015-11-10 17:09:45.965 27715 ERROR neutron.plugins.ml2.drivers.openvswitch.agent.ovs_neutron_agent ancillary_ports, updated_ports_copy))
2015-11-10 17:09:45.965 27715 ERROR neutron.plugins.ml2.drivers.openvswitch.agent.ovs_neutron_agent File "/opt/stack/new/neutron/neutron/plugins/ml2/drivers/openvswitch/agent/ovs_neutron_agent.py", line 1271, in process_ports_events
2015-11-10 17:09:45.965 27715 ERROR neutron.plugins.ml2.drivers.openvswitch.agent.ovs_neutron_agent ancillary_port_info['added'])
2015-11-10 17:09:45.965 27715 ERROR neutron.plugins.ml2.drivers.openvswitch.agent.ovs_neutron_agent File "/opt/stack/new/neutron/neutron/plugins/ml2/drivers/openvswitch/agent/ovs_neutron_agent.py", line 1262, in _process_device
2015-11-10 17:09:45.965 27715 ERROR neutron.plugins.ml2.drivers.openvswitch.agent.ovs_neutron_agent 'name'])
2015-11-10 17:09:45.965 27715 ERROR neutron.plugins.ml2.drivers.openvswitch.agent.ovs_neutron_agent Exception: Port tapcea51630-e1 is not ready, resync needed
2015-11-10 17:09:45.965 27715 ERROR neutron.plugins.ml2.drivers.openvswitch.agent.ovs_neutron_agent
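
To get a rough sense of how noisy this is in a given agent log, something like the following could be used to count the "not ready" exceptions per port. This is only a sketch: the regular expression is derived from the message text in the traceback above, and the function name `count_not_ready_ports` and the `sample` list are made up for illustration, not part of neutron.

```python
import re
from collections import Counter

# Pattern based on the exception message shown in the traceback above.
NOT_READY = re.compile(r"Exception: Port (\S+) is not ready, resync needed")


def count_not_ready_ports(lines):
    """Return a Counter mapping port name -> number of matching log lines."""
    counts = Counter()
    for line in lines:
        match = NOT_READY.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


# Two lines copied from the traceback above; only the first one matches.
sample = [
    "2015-11-10 17:09:45.965 27715 ERROR "
    "neutron.plugins.ml2.drivers.openvswitch.agent.ovs_neutron_agent "
    "Exception: Port tapcea51630-e1 is not ready, resync needed",
    "2015-11-10 17:09:45.965 27715 ERROR "
    "neutron.plugins.ml2.drivers.openvswitch.agent.ovs_neutron_agent 'name'])",
]

for port, hits in count_not_ready_ports(sample).items():
    print(port, hits)
```

For a real log, the lines of a downloaded screen-q-agt.txt file could be passed in instead of `sample`.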

Matt Riedemann (mriedem) wrote :

Note that the error was introduced with https://review.openstack.org/#/c/164880/.

Matt Riedemann (mriedem)
tags: added: logging
no longer affects: nova
Armando Migliaccio (armando-migliaccio) wrote :
Changed in neutron:
status: New → Confirmed
importance: Undecided → High
tags: added: gate-failure
Armando Migliaccio (armando-migliaccio) wrote :

Gathering more data points with revert https://review.openstack.org/#/c/243808/

Sean Dague (sdague) wrote :

It's pretty clear that 243808 is the fix. If you look back at the Jenkins failures for https://review.openstack.org/#/c/164880/, this was actually triggered in the gate on Patch Set 33, which blocked it from landing: http://logs.openstack.org/80/164880/33/gate/gate-tempest-dsvm-neutron-full/56c541a/console.html.gz#_2015-10-21_19_57_34_141

Then YAMAMOTO Takashi did a recheck to rerun the tests to land it.

It failed again on check - http://logs.openstack.org/80/164880/33/check/gate-tempest-dsvm-neutron-full/4e68240/console.html.gz#_2015-10-22_02_12_35_331

Rosella followed up with a correct observation that the failures might have been related to her patch, and put it into workflow-1 state.

Patch Set 34 was then pushed, got lucky, passed check, and was approved to gate.

Unfortunately, at the 35% race rate, this was literally just winning a coin flip. We got the 42% chance that it would pass both times (0.65^2 ≈ 0.42).
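
The arithmetic behind that number, as a small sketch. It assumes each run fails independently with the 35% rate quoted above; the function name `pass_probability` is just for illustration.

```python
def pass_probability(failure_rate, runs):
    """Chance that `runs` independent runs all pass, given a per-run failure rate."""
    return (1.0 - failure_rate) ** runs


# Two runs (check, then gate) at a 35% failure rate.
p = pass_probability(0.35, 2)
print(f"{p:.4f}")  # 0.4225
```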

In the future it would probably be better to preemptively run some manual rechecks on a patch like this, to try to trigger the failure, when it's been known to trip things in the past.
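
As a rough guide to how many rechecks that would take, here is a sketch that finds the smallest number of runs needed to see at least one failure with a chosen confidence. It assumes runs are independent and that the 35% race rate holds; the function name `runs_to_detect` and the 95% target are illustrative choices, not anything agreed on in this bug.

```python
def runs_to_detect(failure_rate, confidence):
    """Smallest number of independent runs so that the chance of seeing
    at least one failure is >= confidence."""
    if not 0.0 < failure_rate < 1.0:
        raise ValueError("failure_rate must be strictly between 0 and 1")
    if not 0.0 < confidence < 1.0:
        raise ValueError("confidence must be strictly between 0 and 1")
    runs = 0
    all_pass = 1.0  # probability that every run so far has passed
    while 1.0 - all_pass < confidence:
        runs += 1
        all_pass *= 1.0 - failure_rate
    return runs


print(runs_to_detect(0.35, 0.95))  # 7
```

So at a 35% race rate, about seven runs would be needed to be 95% sure of hitting the failure at least once; two runs only catch it about 58% of the time.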

Matt Riedemann (mriedem)
Changed in neutron:
assignee: nobody → Armando Migliaccio (armando-migliaccio)
status: Confirmed → In Progress
Matt Riedemann (mriedem) wrote :
Changed in neutron:
assignee: Armando Migliaccio (armando-migliaccio) → Carl Baldwin (carl-baldwin)
OpenStack Infra (hudson-openstack) wrote : Fix merged to neutron (master)

Reviewed: https://review.openstack.org/243808
Committed: https://git.openstack.org/cgit/openstack/neutron/commit/?id=e7270d9505fe09ec6d687e7796a814bf883b5416
Submitter: Jenkins
Branch: master

commit e7270d9505fe09ec6d687e7796a814bf883b5416
Author: Armando Migliaccio <email address hidden>
Date: Tue Nov 10 20:03:10 2015 +0000

    Revert "OVS agent reacts to events instead of polling"

    This might be associated to manifestation of bug #1514935

    This reverts commit 1992d52d63dc32c63faa5a3f482d5b8ebe925a77.

    Closes-Bug: #1514935
    Change-Id: If01cc87b6735e1bc039f99c4c6121e7c5ce547d0

Changed in neutron:
status: In Progress → Fix Committed
Thierry Carrez (ttx) wrote : Fix included in openstack/neutron 8.0.0.0b1

This issue was fixed in the openstack/neutron 8.0.0.0b1 development milestone.

Changed in neutron:
status: Fix Committed → Fix Released