compute neutron agent fails to come up

Bug #1290490 reported by Jon-Paul Sullivan
This bug affects 1 person
Affects | Status | Importance | Assigned to | Milestone
tripleo | Expired | High | Unassigned |

Bug Description

Taken from screen output:

++ wait_for 30 10 neutron agent-list -f csv -c alive -c agent_type -c host '|' grep '":-).*Open vSwitch agent.*overcloud-novacompute"'
Timing out after 300 seconds:
COMMAND=neutron agent-list -f csv -c alive -c agent_type -c host | grep ":-).*Open vSwitch agent.*overcloud-novacompute"
OUTPUT=

It appears that neutron has not started correctly on the compute node, but the processes are there:

root@overcloud-novacompute0-febd67yd6yid:~# ps auwwx | grep neutron
neutron 2925 0.0 0.0 86824 28132 ? Ss 15:55 0:04 /opt/stack/venvs/neutron/bin/python /opt/stack/venvs/neutron/bin/neutron-openvswitch-agent --config-file /etc/neutron/neutron.conf --config-file /etc/neutron/plugins/openvswitch/ovs_neutron_plugin.ini --config-dir /etc/neutron
root 3143 0.0 0.0 53532 1952 ? S 15:55 0:00 sudo /usr/bin/neutron-rootwrap /etc/neutron/rootwrap.conf ovsdb-client monitor Interface name,ofport --format=json
root 3144 0.0 0.0 32760 7536 ? S 15:55 0:00 /opt/stack/venvs/neutron/bin/python /usr/bin/neutron-rootwrap /etc/neutron/rootwrap.conf ovsdb-client monitor Interface name,ofport --format=json
root 5937 0.0 0.0 8176 944 pts/5 S+ 18:15 0:00 grep --color=auto neutron
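
For context, the wait_for call in the trace above was given 30 and 10 and reported a 300 second timeout, which is consistent with 30 attempts at 10 second intervals. The following is a minimal sketch of that kind of retry loop, not the actual wait_for implementation; the name poll_until and its argument order are assumptions made for illustration.

```shell
# Hypothetical helper (not the real wait_for): run COMMAND up to LOOPS times,
# sleeping INTERVAL seconds between attempts, and stop at the first success.
poll_until() {
    loops=$1
    interval=$2
    shift 2
    attempt=1
    while [ "$attempt" -le "$loops" ]; do
        # A zero exit status from the command ends the wait immediately.
        if "$@" >/dev/null 2>&1; then
            return 0
        fi
        # No need to sleep after the final attempt.
        if [ "$attempt" -lt "$loops" ]; then
            sleep "$interval"
        fi
        attempt=$((attempt + 1))
    done
    echo "Giving up after $loops attempts" >&2
    return 1
}
```

Because the sketch executes its arguments directly, a pipeline has to be wrapped in a shell, for example: poll_until 30 10 sh -c 'neutron agent-list -f csv -c alive -c agent_type -c host | grep ":-).*Open vSwitch agent.*overcloud-novacompute"'.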

Robert Collins (lifeless) wrote : Re: Baremetal: Cannot ping created VM in nova

The title and body of this bug confuse me :) I'm going to retitle it to match the body.

summary: - Baremetal: Cannot ping craeted VM in nova
+ Baremetal: Cannot ping created VM in nova
summary: - Baremetal: Cannot ping created VM in nova
+ compute neutron agent fails to come up
Robert Collins (lifeless) wrote :

What does neutron agent-list report, and what's in the log files for the agents on the compute node?

Changed in tripleo:
status: New → Triaged
importance: Undecided → Critical
Jon-Paul Sullivan (jonpaul-sullivan) wrote :

When this bug was added there was no open vswitch agent on the compute node.

After leaving it overnight, there is one.

root@overcloud-notcompute0-62wihvoo4ajr:~# neutron agent-list
+--------------------------------------+--------------------+-------------------------------------+-------+----------------+
| id | agent_type | host | alive | admin_state_up |
+--------------------------------------+--------------------+-------------------------------------+-------+----------------+
| 1279aed3-791e-4b68-b467-921b9b02c06f | L3 agent | overcloud-notcompute0-62wihvoo4ajr | :-) | True |
| 8c993e95-3ec7-4774-ba15-eb894826fcbc | Metadata agent | overcloud-notcompute0-62wihvoo4ajr | :-) | True |
| a6edc712-725b-4c43-bd71-7d277df5a31c | Open vSwitch agent | overcloud-novacompute0-febd67yd6yid | :-) | True |
| c08148b4-5f5a-4144-82d7-6b39a99e1529 | DHCP agent | overcloud-notcompute0-62wihvoo4ajr | :-) | True |
| c38b6ce1-c80f-454f-8163-a4396246e1f1 | Open vSwitch agent | overcloud-notcompute0-62wihvoo4ajr | :-) | True |
+--------------------------------------+--------------------+-------------------------------------+-------+----------------+
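
The Open vSwitch agent row for overcloud-novacompute0-febd67yd6yid is the one the timed-out check was waiting for. Below is a small sketch of that filter, reusing the grep pattern from the original command. The CSV row shape (quoted fields in the order alive, agent_type, host) and the "xxx" marker for a dead agent are assumptions about the client output, and the function name compute_ovs_alive is hypothetical.

```shell
# Hypothetical helper: read `neutron agent-list -f csv -c alive -c agent_type -c host`
# output on stdin and succeed only if an alive Open vSwitch agent is listed
# for a host whose name contains "overcloud-novacompute".
compute_ovs_alive() {
    grep -q ':-).*Open vSwitch agent.*overcloud-novacompute'
}

# Example with assumed CSV rows built from the table above:
#   printf '%s\n' '":-)","Open vSwitch agent","overcloud-novacompute0-febd67yd6yid"' | compute_ovs_alive
```

Note that the controller's own Open vSwitch agent (overcloud-notcompute0-62wihvoo4ajr) does not satisfy the pattern, so the check only passes once the compute node's agent has registered.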

So, what could account for such a slow start-up time, and which logs should I check?

I can confirm that, now that the agent has started, the system is operational.

Robert Collins (lifeless) wrote :

Clock skew could, or a very slow initial os-collect-config run. The logs should tell you about the latter, and running date on each node about the former.
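
One rough way to put a number on clock skew is to compare epoch timestamps taken on the two nodes. This is only a sketch: the function name clock_skew is hypothetical, and any remote timestamp also includes the latency of fetching it.

```shell
# Hypothetical helper: print the absolute difference, in seconds,
# between two epoch timestamps (for example from `date +%s` on two nodes).
clock_skew() {
    diff=$(( $1 - $2 ))
    if [ "$diff" -lt 0 ]; then
        diff=$(( 0 - diff ))
    fi
    echo "$diff"
}
```

For example, clock_skew "$(date +%s)" "$(ssh COMPUTE_HOST date +%s)" run from the control node, where COMPUTE_HOST is a placeholder for the compute node's address.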

Changed in tripleo:
status: Triaged → Incomplete
Derek Higgins (derekh)
Changed in tripleo:
importance: Critical → High
James Slagle (james-slagle) wrote :

Please report back if you think clock skew could have been the cause here. If not, and no logs are available, are you able to reproduce the problem? I've bumped the priority down to High on this one.

Launchpad Janitor (janitor) wrote :

[Expired for tripleo because there has been no activity for 60 days.]

Changed in tripleo:
status: Incomplete → Expired