HA : contrail-api throws an exception during rabbit reconnection and fails to connect back

Bug #1467000 reported by venu kolli
This bug affects 1 person
Affects              Status         Importance  Assigned to    Milestone
Juniper Openstack    (status tracked in Trunk)
  R2.20              Fix Committed  Undecided   Hampapur Ajay
  R2.21.x            Fix Committed  Undecided   Hampapur Ajay
  R2.22.x            Fix Committed  Undecided   Hampapur Ajay
  Trunk              Fix Committed  Undecided   Hampapur Ajay

Bug Description

HA: contrail-api throws an exception during RabbitMQ reconnection and fails to connect back during node failures.

Issue observed on R2.20 build 57.

ERROR:a12c4s2:contrail-api:Config:0:__default__ [SYS_ERR]: VncApiError: RabbitMQ connection down
WARNING:a12c4s2:contrail-api:Config:0:__default__ [SYS_NOTICE]: VncApiError: RabbitMQ connection ESTABLISHED <Connection: amqp://guest@5.5.5.100:5673// at 0x7f52c6c2a650>
Traceback (most recent call last):
  File "/usr/lib/python2.7/dist-packages/gevent/greenlet.py", line 327, in run
    result = self._run(*self.args, **self.kwargs)
  File "/usr/lib/python2.7/dist-packages/cfgm_common/vnc_kombu.py", line 112, in _connection_watch
    self._reconnect()
  File "/usr/lib/python2.7/dist-packages/cfgm_common/vnc_kombu.py", line 99, in _reconnect
    callbacks=[self._subscribe])
  File "/usr/lib/python2.7/dist-packages/kombu/messaging.py", line 357, in __init__
    self.revive(self.channel)
  File "/usr/lib/python2.7/dist-packages/kombu/messaging.py", line 369, in revive
    self.declare()
  File "/usr/lib/python2.7/dist-packages/kombu/messaging.py", line 379, in declare
    queue.declare()
  File "/usr/lib/python2.7/dist-packages/kombu/entity.py", line 505, in declare
    self.queue_declare(nowait, passive=False)
  File "/usr/lib/python2.7/dist-packages/kombu/entity.py", line 531, in queue_declare
    nowait=nowait)
  File "/usr/lib/python2.7/dist-packages/amqp/channel.py", line 1258, in queue_declare
    (50, 11), # Channel.queue_declare_ok
  File "/usr/lib/python2.7/dist-packages/amqp/abstract_channel.py", line 67, in wait
    self.channel_id, allowed_methods)
  File "/usr/lib/python2.7/dist-packages/amqp/connection.py", line 237, in _wait_method
    self.method_reader.read_method()
  File "/usr/lib/python2.7/dist-packages/amqp/method_framing.py", line 189, in read_method
    raise m
IOError: Socket closed
<Greenlet at 0x7f52c6c179b0: <bound method VncServerKombuClient._connection_watch of <vnc_cfg_api_server.vnc_cfg_ifmap.VncServerKombuClient object at 0x7f52c7dcb5d0>>> failed with IOError

Because of this issue, the routing instance reference for the virtual-network could not be created:

root@a12c3s4:~# curl -uadmin:contrail123 http://127.0.0.1:8095/virtual-network/0eb1b97d-ebfb-45ae-9df9-5c81b3ab485a | python -mjson.tool
  % Total % Received % Xferd Average Speed Time Time Time Current
                                 Dload Upload Total Spent Left Speed
100 1522 100 1522 0 0 217k 0 --:--:-- --:--:-- --:--:-- 247k
{
    "virtual-network": {
        "display_name": "vn1000",
        "fq_name": [
            "default-domain",
            "TestHANode-15391225",
            "vn1000"
        ],
        "href": "http://127.0.0.1:8095/virtual-network/0eb1b97d-ebfb-45ae-9df9-5c81b3ab485a",
        "id_perms": {
            "created": "2015-06-20T02:39:52.574968",
            "creator": null,
            "description": null,
            "enable": true,
            "last_modified": "2015-06-20T02:39:53.170435",
            "permissions": {
                "group": "admin",
                "group_access": 7,
                "other_access": 7,
                "owner": "admin",
                "owner_access": 7
            },
            "user_visible": true,
            "uuid": {
                "uuid_lslong": 11383231245290522714,
                "uuid_mslong": 1058831337889940910
            }
        },
        "is_shared": false,
        "name": "vn1000",
        "network_ipam_refs": [
            {
                "attr": {
                    "host_routes": null,
                    "ipam_subnets": [
                        {
                            "addr_from_start": true,
                            "allocation_pools": [],
                            "default_gateway": "20.1.1.1",
                            "dhcp_option_list": null,
                            "dns_nameservers": [],
                            "dns_server_address": "20.1.1.2",
                            "enable_dhcp": true,
                            "host_routes": null,
                            "subnet": {
                                "ip_prefix": "20.1.1.0",
                                "ip_prefix_len": 24
                            },
                            "subnet_name": "",
                            "subnet_uuid": "df014aec-8215-4a9b-8c52-2b584aea6227"
                        }
                    ]
                },
                "href": "http://127.0.0.1:8095/network-ipam/947f9a4d-c0e9-4856-9df5-4834ebe1df3b",
                "to": [
                    "default-domain",
                    "default-project",
                    "default-network-ipam"
                ],
                "uuid": "947f9a4d-c0e9-4856-9df5-4834ebe1df3b"
            }
        ],
        "parent_href": "http://127.0.0.1:8095/project/ebb3c3a2-e9bf-456c-8686-a175c86b6a1c",
        "parent_type": "project",
        "parent_uuid": "ebb3c3a2-e9bf-456c-8686-a175c86b6a1c",
        "route_target_list": {
            "route_target": [
                "target:64512:12345"
            ]
        },
        "router_external": false,
        "uuid": "0eb1b97d-ebfb-45ae-9df9-5c81b3ab485a"
    }
}
root@a12c3s4:~#
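
For anyone checking the same symptom, the response above can be inspected with a small script. This is only a sketch: the helper name routing_instance_refs is made up for illustration, and the "routing_instances" key is an assumption about how contrail-api lists routing instances on a virtual-network object. The sample is a trimmed copy of the output in this report.

```python
import json


def routing_instance_refs(vn_response):
    """Return routing-instance entries from a virtual-network GET response."""
    vn = vn_response.get("virtual-network", {})
    # Assumption: contrail-api lists these under the "routing_instances" key.
    return vn.get("routing_instances") or []


# Trimmed copy of the response in this report (most fields omitted).
response_text = """
{
    "virtual-network": {
        "fq_name": ["default-domain", "TestHANode-15391225", "vn1000"],
        "route_target_list": {"route_target": ["target:64512:12345"]},
        "uuid": "0eb1b97d-ebfb-45ae-9df9-5c81b3ab485a"
    }
}
"""

refs = routing_instance_refs(json.loads(response_text))
print("routing instances found:", len(refs))
```

Under that assumption, an empty result for a network that was created a while ago would match the behaviour described in this bug.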

Tags: config
venu kolli (vkolli)
Changed in juniperopenstack:
assignee: nobody → Hampapur Ajay (hajay)
venu kolli (vkolli) wrote :

The workaround for this issue is to restart the contrail-api service.

OpenContrail Admin (ci-admin-f) wrote : [Review update] master

Review in progress for https://review.opencontrail.org/12644
Submitter: Hampapur Ajay (<email address hidden>)

OpenContrail Admin (ci-admin-f) wrote : A change has been merged

Reviewed: https://review.opencontrail.org/12644
Committed: http://github.org/Juniper/contrail-controller/commit/eac29147c8613f03dde50b812ef852b686d87c93
Submitter: Zuul
Branch: master

commit eac29147c8613f03dde50b812ef852b686d87c93
Author: Hampapur Ajay <email address hidden>
Date: Thu Jul 23 10:05:14 2015 -0700

config-resilience: Handle all rabbitmq producer/consumer reconnects

Improve connection handling with rabbit such that
1. The producer and consumer greenlets never die
2. Use context manager for semaphore and handle fail while wait
3. Log appropriately on these events.

Add unit tests to exercise these paths.

Closes-Bug: #1467000
Change-Id: If609a17b97039932d06ab70b40fee6dbdee624f3

OpenContrail Admin (ci-admin-f) wrote : [Review update] R2.22-dev

Review in progress for https://review.opencontrail.org/13252
Submitter: Hampapur Ajay (<email address hidden>)

OpenContrail Admin (ci-admin-f) wrote : A change has been merged

Reviewed: https://review.opencontrail.org/13252
Committed: http://github.org/Juniper/contrail-controller/commit/86357f8532c158755a326a19919df6d686dd1b38
Submitter: Zuul
Branch: R2.22-dev

commit 86357f8532c158755a326a19919df6d686dd1b38
Author: Hampapur Ajay <email address hidden>
Date: Thu Jul 23 10:05:14 2015 -0700

config-resilience: Handle all rabbitmq producer/consumer reconnects

Improve connection handling with rabbit such that
1. The producer and consumer greenlets never die
2. Use context manager for semaphore and handle fail while wait
3. Log appropriately on these events.

Add unit tests to exercise these paths.

Closes-Bug: #1467000
Change-Id: If609a17b97039932d06ab70b40fee6dbdee624f3
(cherry picked from commit eac29147c8613f03dde50b812ef852b686d87c93)

OpenContrail Admin (ci-admin-f) wrote : [Review update] R2.20

Review in progress for https://review.opencontrail.org/17204
Submitter: Suresh Balineni (<email address hidden>)

OpenContrail Admin (ci-admin-f) wrote : A change has been merged

Reviewed: https://review.opencontrail.org/17204
Committed: http://github.org/Juniper/contrail-controller/commit/014e76f5a062995b98a461a304fd90a9a44b0d9e
Submitter: Zuul
Branch: R2.20

commit 014e76f5a062995b98a461a304fd90a9a44b0d9e
Author: Hampapur Ajay <email address hidden>
Date: Thu Jul 23 10:05:14 2015 -0700

config-resilience: Handle all rabbitmq producer/consumer reconnects

Improve connection handling with rabbit such that
1. The producer and consumer greenlets never die
2. Use context manager for semaphore and handle fail while wait
3. Log appropriately on these events.

Add unit tests to exercise these paths.

Closes-Bug: #1467000
Change-Id: If609a17b97039932d06ab70b40fee6dbdee624f3
(cherry picked from commit eac29147c8613f03dde50b812ef852b686d87c93)

OpenContrail Admin (ci-admin-f) wrote : [Review update] R2.22.x

Review in progress for https://review.opencontrail.org/19167
Submitter: Sachin Bansal (<email address hidden>)

OpenContrail Admin (ci-admin-f) wrote : [Review update] R2.21.x

Review in progress for https://review.opencontrail.org/19169
Submitter: Sachin Bansal (<email address hidden>)

OpenContrail Admin (ci-admin-f) wrote : A change has been merged

Reviewed: https://review.opencontrail.org/19169
Committed: http://github.org/Juniper/contrail-controller/commit/e2815c2ed61e11b61eb39d54b27455a7fc16701b
Submitter: Zuul
Branch: R2.21.x

commit e2815c2ed61e11b61eb39d54b27455a7fc16701b
Author: Hampapur Ajay <email address hidden>
Date: Thu Jul 23 10:05:14 2015 -0700

config-resilience: Handle all rabbitmq producer/consumer reconnects

Improve connection handling with rabbit such that
1. The producer and consumer greenlets never die
2. Use context manager for semaphore and handle fail while wait
3. Log appropriately on these events.

Add unit tests to exercise these paths.

Closes-Bug: #1467000
Change-Id: If609a17b97039932d06ab70b40fee6dbdee624f3
(cherry picked from commit eac29147c8613f03dde50b812ef852b686d87c93)
(cherry picked from commit 014e76f5a062995b98a461a304fd90a9a44b0d9e)

OpenContrail Admin (ci-admin-f) wrote : A change has been merged

Reviewed: https://review.opencontrail.org/19167
Committed: http://github.org/Juniper/contrail-controller/commit/7a1d201e0f9c702b33695712822537f2bad2e3cf
Submitter: Zuul
Branch: R2.22.x

commit 7a1d201e0f9c702b33695712822537f2bad2e3cf
Author: Hampapur Ajay <email address hidden>
Date: Thu Jul 23 10:05:14 2015 -0700

config-resilience: Handle all rabbitmq producer/consumer reconnects

Improve connection handling with rabbit such that
1. The producer and consumer greenlets never die
2. Use context manager for semaphore and handle fail while wait
3. Log appropriately on these events.

Add unit tests to exercise these paths.

Closes-Bug: #1467000
Change-Id: If609a17b97039932d06ab70b40fee6dbdee624f3
(cherry picked from commit eac29147c8613f03dde50b812ef852b686d87c93)
(cherry picked from commit 014e76f5a062995b98a461a304fd90a9a44b0d9e)

information type: Proprietary → Public