Ovs agent fails to kill ovsdb monitor properly

Bug #1350903 reported by Eugene Nikanorov
This bug affects 10 people
Affects: neutron
Status: Expired
Importance: Undecided
Assigned to: Unassigned
Milestone: (none listed)

Bug Description

The following log is observed in one of the deployments:

2014-07-31 14:42:58.744 28084 DEBUG neutron.plugins.openvswitch.agent.ovs_neutron_agent [req-0698c817-970d-4c56-bd72-90fb37f1a134 None] Agent caught SIGTERM, quitting daemon loop. _handle_sigterm /usr/lib/python2.7/dist-packages/neutron/plugins/openvswitch/agent/ovs_neutron_agent.py:1366
2014-07-31 14:42:58.748 28084 ERROR neutron.agent.linux.ovsdb_monitor [req-0698c817-970d-4c56-bd72-90fb37f1a134 None] Error received from ovsdb monitor: 2014-07-31T14:42:58Z|00001|fatal_signal|WARN|terminating with signal 15 (Terminated)
2014-07-31 14:42:58.760 28084 DEBUG neutron.agent.linux.async_process [req-0698c817-970d-4c56-bd72-90fb37f1a134 None] Halting async process [['ovsdb-client', 'monitor', 'Interface', 'name,ofport', '--format=json']] in response to an error. _handle_process_error /usr/lib/python2.7/dist-packages/neutron/agent/linux/async_process.py:173
2014-07-31 14:42:58.763 28084 DEBUG neutron.agent.linux.utils [req-0698c817-970d-4c56-bd72-90fb37f1a134 None] Running command: ['ps', '--ppid', '28160', '-o', 'pid='] create_process /usr/lib/python2.7/dist-packages/neutron/agent/linux/utils.py:48
2014-07-31 14:42:58.880 28084 DEBUG neutron.agent.linux.utils [req-0698c817-970d-4c56-bd72-90fb37f1a134 None]
Command: ['ps', '--ppid', '28160', '-o', 'pid=']
Exit code: 1
Stdout: ''
Stderr: '' execute /usr/lib/python2.7/dist-packages/neutron/agent/linux/utils.py:74
2014-07-31 14:43:00.040 28084 DEBUG neutron.agent.linux.async_process [req-0698c817-970d-4c56-bd72-90fb37f1a134 None] Halting async process [['ovsdb-client', 'monitor', 'Interface', 'name,ofport', '--format=json']]. stop /usr/lib/python2.7/dist-packages/neutron/agent/linux/async_process.py:90
2014-07-31 14:43:00.052 28084 CRITICAL neutron [req-0698c817-970d-4c56-bd72-90fb37f1a134 None] AssertionError: Trying to re-send() an already-triggered event.
2014-07-31 14:43:00.052 28084 TRACE neutron Traceback (most recent call last):
2014-07-31 14:43:00.052 28084 TRACE neutron File "/usr/bin/neutron-openvswitch-agent", line 10, in <module>
2014-07-31 14:43:00.052 28084 TRACE neutron sys.exit(main())
2014-07-31 14:43:00.052 28084 TRACE neutron File "/usr/lib/python2.7/dist-packages/neutron/plugins/openvswitch/agent/ovs_neutron_agent.py", line 1435, in main
2014-07-31 14:43:00.052 28084 TRACE neutron agent.daemon_loop()
2014-07-31 14:43:00.052 28084 TRACE neutron File "/usr/lib/python2.7/dist-packages/neutron/plugins/openvswitch/agent/ovs_neutron_agent.py", line 1363, in daemon_loop
2014-07-31 14:43:00.052 28084 TRACE neutron self.rpc_loop(polling_manager=pm)
2014-07-31 14:43:00.052 28084 TRACE neutron File "/usr/lib/python2.7/contextlib.py", line 24, in __exit__
2014-07-31 14:43:00.052 28084 TRACE neutron self.gen.next()
2014-07-31 14:43:00.052 28084 TRACE neutron File "/usr/lib/python2.7/dist-packages/neutron/agent/linux/polling.py", line 41, in get_polling_manager
2014-07-31 14:43:00.052 28084 TRACE neutron pm.stop()
2014-07-31 14:43:00.052 28084 TRACE neutron File "/usr/lib/python2.7/dist-packages/neutron/agent/linux/polling.py", line 108, in stop
2014-07-31 14:43:00.052 28084 TRACE neutron self._monitor.stop()
2014-07-31 14:43:00.052 28084 TRACE neutron File "/usr/lib/python2.7/dist-packages/neutron/agent/linux/async_process.py", line 91, in stop
2014-07-31 14:43:00.052 28084 TRACE neutron self._kill()
2014-07-31 14:43:00.052 28084 TRACE neutron File "/usr/lib/python2.7/dist-packages/neutron/agent/linux/ovsdb_monitor.py", line 108, in _kill
2014-07-31 14:43:00.052 28084 TRACE neutron super(SimpleInterfaceMonitor, self)._kill(*args, **kwargs)
2014-07-31 14:43:00.052 28084 TRACE neutron File "/usr/lib/python2.7/dist-packages/neutron/agent/linux/async_process.py", line 118, in _kill
2014-07-31 14:43:00.052 28084 TRACE neutron self._kill_event.send()
2014-07-31 14:43:00.052 28084 TRACE neutron File "/usr/lib/python2.7/dist-packages/eventlet/event.py", line 150, in send
2014-07-31 14:43:00.052 28084 TRACE neutron assert self._result is NOT_USED, 'Trying to re-send() an already-triggered event.'
2014-07-31 14:43:00.052 28084 TRACE neutron AssertionError: Trying to re-send() an already-triggered event.
2014-07-31 14:43:00.052 28084 TRACE neutron
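
One reading of the log above is that the monitor is halted twice: first by the error handler ("Halting async process ... in response to an error"), then again by stop() during shutdown, and the second _kill() triggers the one-shot kill event a second time. The sketch below reproduces that pattern with a stdlib-only stand-in. OneShotEvent, Monitor, kill_unguarded and kill_guarded are hypothetical names for illustration, not neutron or eventlet code, and the guard is only one possible way to make the kill idempotent, not the fix proposed in the review.

```python
class OneShotEvent(object):
    """Hypothetical stand-in for an event that may be triggered only once."""

    _NOT_USED = object()

    def __init__(self):
        self._result = self._NOT_USED

    def ready(self):
        # True once send() has been called.
        return self._result is not self._NOT_USED

    def send(self, result=None):
        # Mirrors the assertion message seen in the traceback.
        if self._result is not self._NOT_USED:
            raise AssertionError(
                "Trying to re-send() an already-triggered event.")
        self._result = result


class Monitor(object):
    """Hypothetical monitor wrapper that owns a one-shot kill event."""

    def __init__(self):
        self._kill_event = OneShotEvent()

    def kill_unguarded(self):
        # Fails on the second call, as in the reported traceback.
        self._kill_event.send()

    def kill_guarded(self):
        # Idempotent variant: only trigger the event if it is still unused.
        if not self._kill_event.ready():
            self._kill_event.send()


if __name__ == "__main__":
    guarded = Monitor()
    guarded.kill_guarded()   # error handler halts the process
    guarded.kill_guarded()   # stop() halts it again; no exception

    unguarded = Monitor()
    unguarded.kill_unguarded()
    try:
        unguarded.kill_unguarded()
    except AssertionError as exc:
        print("second kill failed: %s" % exc)
```

With the guard, the second halt becomes a no-op instead of crashing the agent on SIGTERM. Whether the real code path needs this guard or a reset of the event is an assumption that would have to be checked against async_process.py.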

Tags: ovs
OpenStack Infra (hudson-openstack) wrote : Fix proposed to neutron (master)

Fix proposed to branch: master
Review: https://review.openstack.org/114154

Changed in neutron:
assignee: Eugene Nikanorov (enikanorov) → Zang MingJie (zealot0630)
status: New → In Progress
Thiago Martins (martinx) wrote :

Hey guys,

I think I'm facing this problem too. I'm running Icehouse on plain Ubuntu 14.04.1 with VLAN provider networks (no GRE / VXLAN).

Error log:

---
==> /var/log/neutron/openvswitch-agent.log <==
2014-09-02 20:18:13.665 5503 ERROR neutron.agent.linux.ovsdb_monitor [req-91a64dfc-0902-49e5-8d55-e9bcd31d69b3 None] Error received from ovsdb monitor: 2014-09-02T23:18:13Z|00001|fatal_signal|WARN|terminating with signal 15 (Terminated)
2014-09-02 20:18:15.492 5503 CRITICAL neutron [req-91a64dfc-0902-49e5-8d55-e9bcd31d69b3 None] Trying to re-send() an already-triggered event.
---

Is there any workaround, or anything else I can do?

Thanks!
Thiago

OpenStack Infra (hudson-openstack) wrote : Change abandoned on neutron (master)

Change abandoned by Kyle Mestery (<email address hidden>) on branch: master
Review: https://review.openstack.org/114154
Reason: This change is old enough and hasn't seen any updates since August 19, 2014. Abandoning it, please revive it if you plan to work on it again.

Antonio Messina (arcimboldo) wrote :

I'm affected by this bug too, running Kilo on Ubuntu 14.04.

Is anybody already working on a patch for this?

Ihar Hrachyshka (ihar-hrachyshka) wrote :

It affects fullstack runs too: for every OVS agent, an ovsdb-client monitor process is left behind. These eventually use up all connections to ovsdb, and subsequent tests fail with a Protocol error until we kill those processes.
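
For anyone who needs to spot the leftover processes, here is a minimal sketch that picks ovsdb-client monitor PIDs out of a process listing. find_monitor_pids is a hypothetical helper, the input is assumed to be the text printed by something like `ps -e -o pid= -o args=`, and the sample listing is made up for illustration (only the command line is taken from the agent log). It only reports PIDs and does not kill anything; processes launched through a wrapper such as sudo or rootwrap would need extra handling.

```python
def find_monitor_pids(ps_output):
    """Return PIDs whose command line is `ovsdb-client monitor ...`.

    ps_output is assumed to hold one process per line as "<pid> <args>".
    """
    pids = []
    for line in ps_output.splitlines():
        parts = line.split(None, 1)
        if len(parts) != 2:
            continue  # skip blank or malformed lines
        pid, args = parts
        argv = args.split()
        if (len(argv) >= 2
                and argv[0].endswith("ovsdb-client")
                and argv[1] == "monitor"):
            pids.append(int(pid))
    return pids


# Made-up sample listing, not real output from a deployment.
sample = """\
 2564 /usr/bin/python /usr/bin/neutron-openvswitch-agent
 3080 ovsdb-client monitor Interface name,ofport --format=json
 3121 ovsdb-client monitor Interface name,ofport --format=json
"""

leftover = find_monitor_pids(sample)
print(leftover)
```

Comparing the reported PIDs against the agents that are still running should show which monitors were orphaned.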

Jian Wen (wenjianhn)
Changed in neutron:
status: In Progress → Confirmed
JohnsonYi (yichengli) wrote :

Fuel 6.0 has the same problem:

dpkg -l | grep neutron
ii neutron-common 1:2014.2-fuel6.0~mira28 Neutron is a virtual network service for Openstack - common
ii neutron-plugin-ml2 1:2014.2-fuel6.0~mira28 Neutron is a virtual network service for Openstack - ML2 plugin
ii neutron-plugin-openvswitch-agent 1:2014.2-fuel6.0~mira28 Neutron is a virtual network service for Openstack - Open vSwitch plugin agent
ii python-neutron 1:2014.2-fuel6.0~mira28 Neutron is a virutal network service for Openstack - Python library
ii python-neutronclient 1:2.3.9-fuel6.0~mira18 client - Neutron is a virtual network service for Openstack

Error logs:
tail -f /var/log/neutron/ovs-agent.log
2016-01-22 06:17:18.548 2564 INFO oslo.messaging._drivers.impl_rabbit [req-7dcc76d5-0f18-4dbb-ac72-43107c4ec341 ] Connecting to AMQP server on 10.14.82.3:5673
2016-01-22 06:17:18.579 2564 INFO oslo.messaging._drivers.impl_rabbit [req-7dcc76d5-0f18-4dbb-ac72-43107c4ec341 ] Connected to AMQP server on 10.14.82.3:5673
2016-01-22 06:17:18.649 2564 INFO oslo.messaging._drivers.impl_rabbit [req-7dcc76d5-0f18-4dbb-ac72-43107c4ec341 ] Connecting to AMQP server on 10.14.82.3:5673
2016-01-22 06:17:18.679 2564 INFO oslo.messaging._drivers.impl_rabbit [req-7dcc76d5-0f18-4dbb-ac72-43107c4ec341 ] Connected to AMQP server on 10.14.82.3:5673
2016-01-22 06:17:21.303 2564 WARNING neutron.openstack.common.loopingcall [-] task run outlasted interval by 0.226028 sec
2016-01-22 06:19:55.351 2564 ERROR neutron.agent.linux.utils [-]
Command: ['ps', '--ppid', '3080', '-o', 'pid=']
Exit code: 1
Stdout: ''
Stderr: ''
2016-01-22 06:19:56.192 2564 CRITICAL neutron [req-7dcc76d5-0f18-4dbb-ac72-43107c4ec341 None] AssertionError: Trying to re-send() an already-triggered event.
2016-01-22 06:19:56.192 2564 TRACE neutron Traceback (most recent call last):
2016-01-22 06:19:56.192 2564 TRACE neutron File "/usr/bin/neutron-openvswitch-agent", line 10, in <module>
2016-01-22 06:19:56.192 2564 TRACE neutron sys.exit(main())
2016-01-22 06:19:56.192 2564 TRACE neutron File "/usr/lib/python2.7/dist-packages/neutron/plugins/openvswitch/agent/ovs_neutron_agent.py", line 1550, in main
2016-01-22 06:19:56.192 2564 TRACE neutron agent.daemon_loop()
2016-01-22 06:19:56.192 2564 TRACE neutron File "/usr/lib/python2.7/dist-packages/neutron/plugins/openvswitch/agent/ovs_neutron_agent.py", line 1477, in daemon_loop
2016-01-22 06:19:56.192 2564 TRACE neutron self.rpc_loop(polling_manager=pm)
2016-01-22 06:19:56.192 2564 TRACE neutron File "/usr/lib/python2.7/contextlib.py", line 24, in __exit__
2016-01-22 06:19:56.192 2564 TRACE neutron self.gen.next()
2016-01-22 06:19:56.192 2564 TRACE neutron File "/usr/lib/python2.7/dist-packages/neutron/agent/linux/polling.py", line 39, in get_polling_manager
2016-01-22 06:19:56.192 2564 TRACE neutron pm.stop()
2016-01-22 06:19:56.192 2564 TRACE neutron File "/usr/lib/python2.7/dist-packages/neutron/agent/linux/polling.py", line 106, in stop
2016-01-22 06:19:56.192 2564 TRACE neutron self._monitor....


Armando Migliaccio (armando-migliaccio) wrote : Cleanup EOL bug report

This is an automated cleanup. This bug report has been closed because it
is older than 18 months and there is no open code change to fix this.
After this time it is unlikely that the circumstances which led to
the observed issue can be reproduced.

If you can reproduce the bug, please:
* reopen the bug report (set to status "New")
* AND add the detailed steps to reproduce the issue (if applicable)
* AND leave a comment "CONFIRMED FOR: <RELEASE_NAME>"
  Only still supported release names are valid (INCUBATOR-JUNO, LIBERTY, MITAKA, NEWTON).
  Valid example: CONFIRMED FOR: INCUBATOR-JUNO

Changed in neutron:
assignee: Zang MingJie (zealot0630) → nobody
importance: Medium → Undecided
status: Confirmed → Expired