Neutron GRE fails in CentOS with timeout

Bug #1382529 reported by Matthew Mosesohn
This bug affects 5 people
Affects: Fuel for OpenStack
Status: Fix Released
Importance: Critical
Assigned to: MOS Neutron

Bug Description

Deployment:
Custom ISO #75
build_number: "75"
build_id: "2014-10-16_20-18-23"
astute_sha: "c3e7c7a18528cf9acca48021488a93dff74f5c97"
fuellib_sha: "aaffaab911edc5720be4bdc7e6369e6ab927d662"
ostf_sha: "de177931b53fbe9655502b73d03910b8118e25f1"
nailgun_sha: "0f6314e60808f84ab7eb86a1da57f91fd569f77a"
fuelmain_sha: "5cf06aac43ccb4a6031fbfa87ff9f9a729314daa"

CentOS HA with Neutron GRE

3 controllers with Cinder, 1 compute

Two errors keep coming up again and again:
First instance fails with: "Unavailable console type rdp-html5." but nova-api says:
api.log:2014-10-16 18:30:23.634 14412 TRACE nova.api.openstack ConnectionFailed: Connection to neutron failed: HTTPConnectionPool(host='192.168.0.1', port=9696): Max retries exceeded with url: /v2.0/floatingips.json?tenant_id=bde996a2f72a4e8e99f7a79eeec3a32d (Caused by <class 'socket.error'>: [Errno 111] ECONNREFUSED)

Neutron says:
server.log:2014-10-17 08:23:32.253 31833 TRACE keystonemiddleware.auth_token InvalidUserToken: Token authorization failed

nova-compute says:
TRACE nova.compute.manager ConnectionFailed: Connection to neutron failed: HTTPConnectionPool(host='192.168.0.1', port=9696): Request timed out. (timeout=30)
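
The two Nova-side failures above differ in kind: nova-api had its connection refused (Errno 111), while nova-compute hit the 30-second timeout. A minimal Python sketch for telling those cases apart from a node, assuming plain TCP reachability of the Neutron endpoint is the thing to check. `probe_tcp` is a hypothetical helper written for illustration, not part of Nova or Neutron:

```python
import socket


def probe_tcp(host, port, timeout=5.0):
    """Classify one TCP connect attempt as 'open', 'refused', or 'timeout'."""
    try:
        # create_connection performs the full TCP handshake within `timeout`.
        with socket.create_connection((host, port), timeout=timeout):
            return "open"
    except ConnectionRefusedError:
        # Nothing is listening on the port (the ECONNREFUSED case).
        return "refused"
    except socket.timeout:
        # No answer at all within `timeout` seconds.
        return "timeout"
    except OSError as exc:
        # Any other socket-level failure, e.g. no route to host.
        return "error: %s" % exc
```

For example, `probe_tcp("192.168.0.1", 9696)` would check the endpoint named in the logs above. An "open" result only shows that something accepts connections on the port; it says nothing about the token failure reported by Neutron.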

Tags: neutron juno
Matthew Mosesohn (raytrac3r) wrote :
description: updated
Alexander Ignatov (aignatov) wrote :

Need to check this behavior once all commits from 5.1 have landed in 6.0/2014.2.

tags: added: neutron
Changed in fuel:
importance: Undecided → High
status: New → Confirmed
OSCI Robot (oscirobot) wrote :

Package neutron has been built from changeset: https://review.fuel-infra.org/27
RPM Repository URL: http://osci-obs.vm.mirantis.net:82/centos-fuel-6.0-stable-27/centos

Matthew Mosesohn (raytrac3r) wrote :

After applying the RPC handler from Oleg and the patch to sudoers, it still seems to fail. The latest error I get is:
2014-10-21 10:08:30.977 3224 ERROR oslo.messaging.rpc.dispatcher [req-5a325982-23cb-487a-9b89-14cd43f10cd6 ] Exception during message handling: UPDATE statement on table 'ports' expected to update 1 row(s); 0 were matched.
2014-10-21 10:08:30.977 3224 TRACE oslo.messaging.rpc.dispatcher StaleDataError: UPDATE statement on table 'ports' expected to update 1 row(s); 0 were matched.
2014-10-21 10:08:31.032 3224 ERROR oslo.messaging._drivers.common [req-5a325982-23cb-487a-9b89-14cd43f10cd6 ] Returning exception UPDATE statement on table 'ports' expected to update 1 row(s); 0 were matched. to caller
2014-10-21 10:08:31.036 3224 ERROR oslo.messaging._drivers.common [req-5a325982-23cb-487a-9b89-14cd43f10cd6 ] ['Traceback (most recent call last):\n', ' File "/usr/lib/python2.6/site-packages/oslo/messaging/rpc/dispatcher.py", line 134, in _dispatch_and_reply\n incoming.message))\n', ' File "/usr/lib/python2.6/site-packages/oslo/messaging/rpc/dispatcher.py", line 177, in _dispatch\n return self._do_dispatch(endpoint, method, ctxt, args)\n', ' File "/usr/lib/python2.6/site-packages/oslo/messaging/rpc/dispatcher.py", line 123, in _do_dispatch\n result = getattr(endpoint, method)(ctxt, **new_args)\n', ' File "/usr/lib/python2.6/site-packages/neutron/plugins/ml2/rpc.py", line 138, in update_device_down\n host))\n', ' File "/usr/lib/python2.6/site-packages/neutron/plugins/ml2/plugin.py", line 1098, in update_port_status\n original_port[\'network_id\'])\n', ' File "/usr/lib/python2.6/site-packages/neutron/plugins/ml2/plugin.py", line 544, in get_network\n result = super(Ml2Plugin, self).get_network(context, id, None)\n', ' File "/usr/lib/python2.6/site-packages/neutron/db/db_base_plugin_v2.py", line 938, in get_network\n network = self._get_network(context, id)\n', ' File "/usr/lib/python2.6/site-packages/neutron/db/db_base_plugin_v2.py", line 92, in _get_network\n network = self._get_by_id(context, models_v2.Network, id)\n', ' File "/usr/lib/python2.6/site-packages/neutron/db/common_db_mixin.py", line 125, in _get_by_id\n return query.filter(model.id == id).one()\n', ' File "/usr/lib64/python2.6/site-packages/sqlalchemy/orm/query.py", line 2369, in one\n ret = list(self)\n', ' File "/usr/lib64/python2.6/site-packages/sqlalchemy/orm/query.py", line 2411, in __iter__\n self.session._autoflush()\n', ' File "/usr/lib64/python2.6/site-packages/sqlalchemy/orm/session.py", line 1198, in _autoflush\n self.flush()\n', ' File "/usr/lib64/python2.6/site-packages/sqlalchemy/orm/session.py", line 1919, in flush\n self._flush(objects)\n', ' File 
"/usr/lib64/python2.6/site-packages/sqlalchemy/orm/session.py", line 2037, in _flush\n transaction.rollback(_capture_exception=True)\n', ' File "/usr/lib64/python2.6/site-packages/sqlalchemy/util/langhelpers.py", line 60, in __exit__\n compat.reraise(exc_type, exc_value, exc_tb)\n', ' File "/usr/lib64/python2.6/site-packages/sqlalchemy/orm/session.py", line 2001, in _flush\n flush_context.execute()\n', ' File "/usr/lib64/python2.6/site-packages/sqlalchem...

Changed in fuel:
importance: High → Critical
Matthew Mosesohn (raytrac3r) wrote :

I'm now seeing Neutron oslo.messaging issues more often than issues specific to port creation. The main ones are visible here:
 http://paste.openstack.org/show/ay9VYFgraDFJKIfttwbu/
2014-10-22 10:48:20.019 21212 ERROR oslo.messaging._drivers.impl_rabbit [-] Failed to publish message to topic 'reply_b7f3955599d04a61822d3bb32b3b85b9': Socket closed
2014-10-22 10:51:13.079 21211 ERROR oslo.messaging._drivers.impl_rabbit [req-a2a27b99-4577-40c3-9fe1-785e7c7e59da ] Failed to publish message to topic 'notifications.info': [Errno 32] Broken pipe

Matthew Mosesohn (raytrac3r) wrote :

I'm waiting on the merge of new packages for these two bugs:
https://bugs.launchpad.net/bugs/1384667 - hivex for nova-compute
https://bugs.launchpad.net/bugs/1384702 - conntrack for neutron

I was able to pass OSTF 7 times in a row once I deployed with these 2 packages installed.
I'll test on the next staging ISO once the packages are ready, and should then be able to mark this as Fix Committed.

Changed in fuel:
assignee: MOS Neutron (mos-neutron) → Fuel OSCI Team (fuel-osci)
status: Confirmed → Triaged
Changed in fuel:
assignee: Fuel OSCI Team (fuel-osci) → MOS Neutron (mos-neutron)
Changed in fuel:
status: Triaged → Fix Committed
OSCI Robot (oscirobot) wrote :

Package neutron-2014.2-fuel6.0.mira7.git.8613bb6.c744f7b has been built from changeset: https://review.fuel-infra.org/27
RPM Repository URL: http://osci-obs.vm.mirantis.net:82/centos-fuel-6.0-stable-27/centos

OSCI Robot (oscirobot) wrote :

Package neutron-2014.2-fuel6.0~mira6+git.8613bb6.c744f7b has been built from changeset: https://review.fuel-infra.org/27
DEB Repository URL: http://osci-obs.vm.mirantis.net:82/ubuntu-fuel-6.0-stable-27/ubuntu

OSCI Robot (oscirobot) wrote :

Package neutron-2014.2-fuel6.0.mira7 has been built from changeset: https://review.fuel-infra.org/27
RPM Repository URL: http://osci-obs.vm.mirantis.net:82/centos-fuel-6.0-stable/centos

OSCI Robot (oscirobot) wrote :

Package neutron-2014.2-fuel6.0~mira6 has been built from changeset: https://review.fuel-infra.org/27
DEB Repository URL: http://osci-obs.vm.mirantis.net:82/ubuntu-fuel-6.0-stable/ubuntu

Artem Panchenko (apanchenko-8) wrote :

api: '1.0'
astute_sha: 3c374c9f7bfbdbcd7ce2f716cd704e3044e6fb41
auth_required: true
build_id: 2014-11-13_03-28-01
build_number: '199'
feature_groups:
- experimental
fuellib_sha: bc3c6298f203af3c911a6c5aa11760331c9bdd60
fuelmain_sha: 144262798c38c9186ad2d9c71c60a795b1474014
nailgun_sha: 563749b01dd26e1e0978636efa3a0dafa5745fab
ostf_sha: 720cc1308c3a7081736edd167e7928ca61914aaa
production: docker
release: '6.0'

This issue was reproduced again on CI (BVT tests for the community ISO). Nova failed to spawn an instance with a floating IP.

Steps to reproduce:

1. Create new environment: CentOS + HA + NeutronVlan
2. Add 3 controllers, 2 computes. Deploy changes.
3. After successful deployment run OSTF test "Check network connectivity from instance via floating IP"

Expected result:

- environment passes the test

Actual:

- test fails (but not always)

Here is an error message from the Nova logs on the controller:

/var/log/nova/conductor.log:2014-11-13 04:04:45.197 3003 ERROR nova.scheduler.utils [req-6e6d35bf-82fe-4bdb-a967-8f5863ade0e4 None] [instance: cd09a6fb-f062-4876-a2cf-80665479016b] Error from last host: node-2.test.domain.local (node node-2.test.domain.local): [u'Traceback (most recent call last):\n', u' File "/usr/lib/python2.6/site-packages/nova/compute/manager.py", line 2014, in do_build_and_run_instance\n filter_properties)\n', u' File "/usr/lib/python2.6/site-packages/nova/compute/manager.py", line 2149, in _build_and_run_instance\n instance_uuid=instance.uuid, reason=six.text_type(e))\n', u"RescheduledException: Build of instance cd09a6fb-f062-4876-a2cf-80665479016b was re-scheduled: Connection to neutron failed: HTTPConnectionPool(host='10.108.27.2', port=9696): Request timed out. (timeout=30)\n"]
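
The traceback in the log line above is stored as the printed form of a Python list, with literal `\n` sequences, which makes it hard to read. A minimal sketch for unpacking such a line, assuming the list is complete and runs to the last `]` on the line. `unpack_traceback` is a hypothetical helper, not a Nova or oslo utility:

```python
import ast
import re


def unpack_traceback(log_line):
    """Return the traceback embedded in a log line as plain text, or None."""
    # The embedded list starts with ['Traceback or [u'Traceback.
    match = re.search(r"\[u?['\"]Traceback", log_line)
    if match is None:
        return None
    end = log_line.rfind("]")
    if end < match.start():
        # No closing bracket after the list start, e.g. a truncated line.
        return None
    try:
        # literal_eval parses the list of strings without executing code.
        frames = ast.literal_eval(log_line[match.start():end + 1])
    except (ValueError, SyntaxError):
        return None
    return "".join(frames)
```

Lines whose list is cut off, such as a traceback truncated with "...", make the function return None rather than a partial result.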

There were no connection errors from HAProxy:

[root@node-3 ~]# echo "show stat" | nc -U /var/lib/haproxy/stats | awk -F',' '{if (NR == 1 || $1 ~/neutron/){printf "%14s %14s %10s %10s %10s %10s %10s %10s %10s %10s %10s\n", $1,$2,$8,$13,$14,$15,$18,$22,$23,$25,$28}}'
      # pxname         svname       stot       ereq       econ      eresp     status    chkfail    chkdown   downtime        iid
       neutron       FRONTEND        162          0                             OPEN                                          11
       neutron         node-3         57                     0          0         UP          0          0          0         11
       neutron         node-4         53                     0          0         UP          1          1        804         11
       neutron         node-5         54                     0          0         UP          1          1        798         11
       neutron        BACKEND        164                     0          0         UP                     0          0         11
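
The awk command above picks columns by position, which is easy to get wrong. A minimal Python sketch that selects the same fields by name instead, assuming the text comes from `show stat` on the HAProxy stats socket and that its first line is the `# pxname,svname,...` header. `select_stats` is a hypothetical helper:

```python
import csv
import io


def select_stats(csv_text, proxy, fields):
    """Return the chosen fields for every row whose pxname contains `proxy`."""
    lines = csv_text.splitlines()
    if not lines:
        return []
    # The header line starts with "# "; strip it to get clean column names.
    header = lines[0].lstrip("# ").split(",")
    reader = csv.DictReader(io.StringIO("\n".join(lines[1:])), fieldnames=header)
    selected = []
    for row in reader:
        # Substring match, mirroring the awk test `$1 ~ /neutron/`.
        if proxy in (row.get("pxname") or ""):
            selected.append({name: row.get(name) or "" for name in fields})
    return selected
```

Called as `select_stats(text, "neutron", ["svname", "stot", "econ", "eresp", "status", "chkfail", "chkdown", "downtime"])`, it would return one dictionary per frontend, server, and backend row, with empty strings where HAProxy reports no value.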

The relevant part of the Nova logs on the compute node (node-2):

http://paste.openstack.org/show/132729/

Errors from the Neutron logs:

http://paste.openstack.org/show/132742/

Installed Neutron package info:

openstack-neutron-openvswitch-2014.2-fuel6.0.mira7.noarch
openstack-neutron-2014.2-fuel6.0.mira7.noarch
openstack-neutron-ml2-2014.2-fuel6.0.mira7.noarch
python-neutronclient-2.3.9-fuel6.0.mira21.noarch
python-neutron-2014.2-fuel6.0....

Changed in fuel:
status: Fix Committed → Confirmed
Artem Panchenko (apanchenko-8) wrote :

Please ignore my last comment; it relates to another bug: https://bugs.launchpad.net/mos/+bug/1338966

Changed in fuel:
status: Confirmed → Fix Committed
tags: added: on-verification
Alexander Kurenyshev (akurenyshev) wrote :

Verified on
{"build_id": "2014-12-18_01-32-01",
"ostf_sha": "a9afb68710d809570460c29d6c3293219d3624d4",
"build_number": "56",
"auth_required": true, "api": "1.0",
"nailgun_sha": "5f91157daa6798ff522ca9f6d34e7e135f150a90",
"production": "docker",
"fuelmain_sha": "45caacadb878abfbd9d60e134d72229698b469c9",
"astute_sha": "16b252d93be6aaa73030b8100cf8c5ca6a970a91",
"feature_groups": ["mirantis"], "release": "6.0",
"release_versions": {"2014.2-6.0": {"VERSION": {"build_id": "2014-12-18_01-32-01",
"ostf_sha": "a9afb68710d809570460c29d6c3293219d3624d4",
"build_number": "56",
"api": "1.0",
"nailgun_sha": "5f91157daa6798ff522ca9f6d34e7e135f150a90",
"production": "docker",
"fuelmain_sha": "45caacadb878abfbd9d60e134d72229698b469c9",
"astute_sha": "16b252d93be6aaa73030b8100cf8c5ca6a970a91",
"feature_groups": ["mirantis"], "release": "6.0",
"fuellib_sha": "73332192a257ea02c40a39885c502ad1ebdf3eda"}}}, "fuellib_sha": "73332192a257ea02c40a39885c502ad1ebdf3eda"}

Deploy was successful.
Grepping the logs found nothing:
[root@node-1 ~]# grep "Unavailable console\|ConnectionFailed\|Errno 111\|Token authorization failed" /var/log/*
[root@node-1 ~]#
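
The same check can be sketched in Python so that it also walks subdirectories such as the `/var/log/nova` directory seen in an earlier comment. This is an illustration only, assuming the four strings from the grep above are the ones of interest; `scan_logs` and `PATTERNS` are hypothetical names:

```python
import os

# The same four strings as in the grep command above.
PATTERNS = (
    "Unavailable console",
    "ConnectionFailed",
    "Errno 111",
    "Token authorization failed",
)


def scan_logs(root, patterns=PATTERNS):
    """Return (path, line number, line) for every line containing a pattern."""
    hits = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in sorted(filenames):
            path = os.path.join(dirpath, name)
            try:
                # errors="replace" keeps the scan going on non-UTF-8 bytes.
                with open(path, "r", errors="replace") as handle:
                    for lineno, line in enumerate(handle, 1):
                        if any(pattern in line for pattern in patterns):
                            hits.append((path, lineno, line.rstrip("\n")))
            except OSError:
                # Unreadable files are skipped rather than aborting the scan.
                continue
    return hits
```

An empty list from `scan_logs("/var/log")` would correspond to the empty grep output shown above.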

Changed in fuel:
status: Fix Committed → Fix Released
tags: removed: on-verification