neutron: infinite loop in port create (qpid/ML2 ovs) - Icehouse

Bug #1327127 reported by justlinux2010
This bug affects 1 person
Affects: neutron
Status: Fix Released
Importance: High
Assigned to: Unassigned
Milestone: none

Bug Description

I set api_workers to 4 in /etc/neutron/neutron.conf to enable the multi-process model. tenant_network_types is vxlan and core_plugin is ML2.
When I execute the net-create and subnet-create commands, they work well. But when I execute the port-create command, it never returns.
The OpenStack release is Icehouse.

Here is my configuration:

file: /etc/neutron/neutron.conf
[DEFAULT]
verbose=True

auth_strategy=keystone

rpc_backend=neutron.openstack.common.rpc.impl_qpid
qpid_hostname=localhost

notify_nova_on_port_status_changes=True
notify_nova_on_port_data_changes=True

nova_url=http://localhost:8774/v2
nova_admin_username=nova
nova_admin_tenant_id=09f2bec6ba9e444a856da3f2b90ac25d
nova_admin_password=NOVA_PASS
nova_admin_auth_url=http://localhost:35357/v2.0

core_plugin=neutron.plugins.ml2.plugin.Ml2Plugin
service_plugins=neutron.services.l3_router.l3_router_plugin.L3RouterPlugin,neutron.services.firewall.fwaas_plugin.FirewallPlugin

api_workers = 4
#rpc_workers = 4

allow_overlapping_ips=True

[database]
connection=mysql://root@localhost/neutron

[keystone_authtoken]
auth_uri=http://localhost:5000
auth_host=localhost
auth_protocol=http
auth_port=35357
admin_tenant_name=service
admin_user=neutron
admin_password=NEUTRON_PASS

file: /etc/neutron/plugins/ml2/ml2.conf
[ml2]
type_drivers = vxlan
tenant_network_types = vxlan
mechanism_drivers = openvswitch

[ml2_type_vxlan]
vni_ranges = 2:10000

[securitygroup]
firewall_driver = neutron.agent.linux.iptables_firewall.OVSHybridIptablesFirewallDriver
enable_security_group = True
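
The combination reported here (the qpid RPC backend together with a non-zero api_workers) can be detected with a short script. The following is a minimal sketch, not part of neutron: the function name `uses_qpid_with_api_workers` is hypothetical, and treating a missing api_workers option as 0 is an assumption based on the reporter's statement that 0 is the working value. It is written for Python 3, although the deployment in this report runs Python 2.6.

```python
import configparser


def uses_qpid_with_api_workers(conf_text):
    """Return True if the config pairs the qpid backend with forked API workers.

    Hypothetical helper for illustration; assumes api_workers defaults to 0.
    """
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(conf_text)
    # Both options live in the [DEFAULT] section of neutron.conf.
    backend = parser.get("DEFAULT", "rpc_backend", fallback="")
    workers = parser.getint("DEFAULT", "api_workers", fallback=0)
    return backend.endswith("impl_qpid") and workers > 0


SAMPLE = """
[DEFAULT]
rpc_backend=neutron.openstack.common.rpc.impl_qpid
api_workers = 4
#rpc_workers = 4
"""

if __name__ == "__main__":
    print(uses_qpid_with_api_workers(SAMPLE))
```

A True result only indicates that the configuration matches the one in this report; it does not by itself prove that port-create will hang.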

information type: Private Security → Public
Eugene Nikanorov (enikanorov) wrote :

Please provide more details about your configuration.
What release, core plugin are you using?
Are there neutron-server logs, neutron dhcp agent logs available?

Changed in neutron:
status: New → Incomplete
description: updated
Eugene Nikanorov (enikanorov) wrote :

Sorry, I missed the plugin and release in the bug description. The first port creation also triggers DHCP setup, so logs from the dhcp agent are what can shed light on what is happening.

justlinux2010 (justlinux2010) wrote :

Thanks for your reply, Eugene!
I have checked the dhcp agent log and there is no useful information. The port that I want to create is not the first port; the previous ports were created when api_workers was 0.
I debugged the code and found that it was looping in the python-qpid package. The location is in the file /usr/lib/python2.6/site-packages/qpid/messaging/endpoints.py:
class Connection(Endpoint):
......
    276
    277   @synchronized
    278   def attach(self, timeout=None):
    279     """
    280     Attach to the remote endpoint.
    281     """
    282     if not self._connected:
    283       self._connected = True
    284       self._driver.start()
    285       self._wakeup()
    286     if not self._ewait(lambda: self._transport_connected and not self._unlinked(), timeout=timeout):
    287       self.reconnect = False
    288       raise Timeout("Connection attach timed out")
......

The loop is at line 286.

When I set api_workers to 0, it works well. I don't know what is happening; please give me some advice.
Thanks
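
The symptom described above is a wait on a predicate that never becomes true. The following is a minimal sketch of that pattern using only the Python 3 standard library. It is not the python-qpid implementation: `FakeConnection`, `AttachTimeout`, and `mark_connected` are hypothetical names, and `threading.Condition.wait_for` stands in for the `_ewait` call quoted from endpoints.py.

```python
import threading


class AttachTimeout(Exception):
    """Raised when the predicate does not become true in time (hypothetical)."""


class FakeConnection:
    """Stand-in for a connection whose attach() waits on a predicate."""

    def __init__(self):
        self._cond = threading.Condition()
        self._transport_connected = False

    def mark_connected(self):
        # Normally a background driver thread would do this once the
        # transport is up; if that thread never runs, nothing notifies.
        with self._cond:
            self._transport_connected = True
            self._cond.notify_all()

    def attach(self, timeout=None):
        with self._cond:
            # wait_for returns the predicate's last value: False on timeout.
            # With timeout=None it blocks until somebody notifies.
            connected = self._cond.wait_for(
                lambda: self._transport_connected, timeout=timeout
            )
        if not connected:
            raise AttachTimeout("Connection attach timed out")


if __name__ == "__main__":
    conn = FakeConnection()
    try:
        conn.attach(timeout=0.1)
    except AttachTimeout as exc:
        print("gave up:", exc)
```

With `timeout=None` and nobody calling `mark_connected`, `attach` would block indefinitely, which matches a port-create call that never returns.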

Matt Riedemann (mriedem) wrote :

We've had someone report the same issue, they are using latest Icehouse service pack with ML2 openvswitch in Neutron and qpid 0.26. They set api_workers=5 and boots were timing out on the nova side when trying to create ports.

Changed in neutron:
status: Incomplete → New
status: New → Confirmed
summary: - neutron: infinite loop in port create
+ neutron: infinite loop in port create (qpid/ML2 ovs) - Icehouse
Changed in neutron:
importance: Undecided → High
Matt Riedemann (mriedem) wrote :

Just looking at history, the last time the neutron/openstack/common/service.py module was synced from oslo-incubator in stable/icehouse was December 2013:

https://review.openstack.org/#/c/64419/

There doesn't appear to be anything interesting in service.py in oslo after that sync though.

Looking at the neutron/openstack/common/rpc changes on stable/icehouse and comparing to what's in oslo-incubator stable/icehouse, there is a neutron-only change for multiple rpc_workers:

https://github.com/openstack/neutron/commit/d925b8fb9cc312d1d9a19f651f268dc48cb26d26

That's not in oslo, so could be a problem.

There are two fixes cherry picked from oslo.messaging for neutron in stable/icehouse by Russell Bryant for qpid issues, which wouldn't have been in the Icehouse 2014.1 release, so make sure you have those:

https://github.com/openstack/neutron/commits/stable/icehouse/neutron/openstack/common/rpc

Are you on the latest service pack level?

Looks like this isn't in 2014.1.1:

https://github.com/openstack/neutron/commit/cac3aa8dbd57dded631f54353e54f42b3911e45d

So comparing the rpc code in oslo stable/icehouse to neutron stable/icehouse it looks like everything is in neutron, but not all in a release yet.

Also, what version of python-qpid are you using? Looking at https://issues.apache.org/jira/browse/QPID-5557, if you're using python-qpid 0.26 you could have issues.

Our team is using python-qpid 0.26 with patches for these qpid issues:

https://issues.apache.org/jira/browse/QPID-5773
https://issues.apache.org/jira/browse/QPID-5700

Matt Riedemann (mriedem) wrote :

Apparently this works around the issue:

https://issues.apache.org/jira/browse/QPID-5637
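
Since the hang only appears when api_workers forks extra processes, one plausible class of cause is a background thread that was started before the fork: threads other than the forking one do not exist in the child, so anything waiting on them waits forever. The following is a generic Python 3 sketch of a common defence, a per-process singleton that is rebuilt when the process ID changes. It is an illustration under that assumption only; it is not the qpid code and is not claimed to be what the QPID-5637 patch does. `PerProcessWorker` is a hypothetical name.

```python
import os
import threading


class PerProcessWorker:
    """Hypothetical singleton owning a background thread, rebuilt after fork."""

    _lock = threading.Lock()
    _instance = None

    def __init__(self):
        # Remember which process created this instance and its thread.
        self.pid = os.getpid()
        self._stop = threading.Event()
        self.thread = threading.Thread(target=self._stop.wait, daemon=True)
        self.thread.start()

    @classmethod
    def default(cls):
        with cls._lock:
            inst = cls._instance
            # A forked child inherits the object but not its thread, so an
            # instance created under a different pid must be replaced.
            if inst is None or inst.pid != os.getpid():
                inst = cls._instance = cls()
            return inst

    def stop(self):
        self._stop.set()
        self.thread.join()


if __name__ == "__main__":
    worker = PerProcessWorker.default()
    print(worker.pid == os.getpid(), worker.thread.is_alive())
    worker.stop()
```

The design choice is to check the pid lazily on every access rather than hook into the fork itself, so the caller does not need to know when or how the process was forked.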

justlinux2010 (justlinux2010) wrote :

Thanks for all the replies!
I will update our qpid.

justlinux2010 (justlinux2010) wrote :

Hi all,
I have upgraded qpid to version 0.28, but the problem still exists.
Can anyone help?

justlinux2010 (justlinux2010) wrote :

Some patches were not merged into the 0.28 release of qpid-python!
I upgraded to 0.29 and it works.

Sukhdev Kapur (sukhdev-8) wrote :

It seems that this issue has been resolved. Please confirm.
If that is the case, can you please close this bug?

Thanks
-Sukhdev

Changed in neutron:
status: Confirmed → Fix Released
information type: Public → Public Security
information type: Public Security → Public