neutron-bgp-agent needs to be restarted after amqp relation

Bug #1784083 reported by David Ames on 2018-07-27
This bug affects 1 person
Affects: charm-neutron-dynamic-routing
Importance: High
Assigned to: Frode Nordahl

Bug Description

The xenial-queens bundle is intermittently failing, as seen here: https://review.openstack.org/#/c/584483/

It appears there is a race where the neutron-bgp-dragent is not getting restarted after receiving rabbitmq (amqp) connection data.
The zaza tests (appear to) fail because the agent is not talking to rabbitmq.

tox -e build
cd build/builds/neutron-dynamic-routing
Edit tests/tests.yaml
  Put xenial-queens-functional under the smoke_bundles heading
tox -e func-smoke

The problem is an intermittent race, so it may take a few attempts to catch it.
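Because the failure is intermittent, it can help to rerun the smoke test until it fails. The following is only a rough sketch of such a loop; the function name `run_until_failure` and the `runner` parameter are hypothetical and not part of the charm or zaza.

```python
import subprocess


def run_until_failure(cmd, max_attempts=10, runner=subprocess.run):
    """Run cmd repeatedly; return the attempt number of the first failure.

    Returns None if every attempt exits with status 0.
    `runner` is a hypothetical hook so the loop can be exercised
    without actually invoking tox.
    """
    for attempt in range(1, max_attempts + 1):
        result = runner(cmd)
        if result.returncode != 0:
            return attempt
    return None


# Example call, from build/builds/neutron-dynamic-routing:
#   run_until_failure(["tox", "-e", "func-smoke"])
```

Each run deploys the bundle again, so a small `max_attempts` is probably sensible.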

David Ames (thedac) on 2018-07-27
Changed in charm-neutron-dynamic-routing:
status: New → Confirmed
importance: Undecided → High
assignee: nobody → Nicolas Pochet (npochet)
Nicolas Pochet (npochet) on 2018-07-30
Changed in charm-neutron-dynamic-routing:
assignee: Nicolas Pochet (npochet) → nobody
Frode Nordahl (fnordahl) wrote :
Changed in charm-neutron-dynamic-routing:
status: Confirmed → In Progress
assignee: nobody → Frode Nordahl (fnordahl)
Frode Nordahl (fnordahl) wrote :

When the `neutron-bgp-dragent` service on the `neutron-dynamic-routing` unit becomes ready before the `neutron-server` service on the `neutron-api` unit, it starts sending messages that receive no reply. These show up as MessageTimeout tracebacks in the `neutron-bgp-dragent` log.

Sometimes the `neutron-bgp-dragent` does not get updates even after the `neutron-server` service is available and the MessageTimeout tracebacks subside.

Removing the speaker from the agent and adding it back resolves the issue.

I believe this error condition must be in Neutron itself. While we figure out exactly what is going on, I propose we add a workaround to the test for the `neutron-dynamic-routing` charm.
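For illustration, the remove-and-re-add step described above might look roughly like this. It is a minimal sketch, not the code from the zaza pull request. It assumes a client object exposing `remove_bgp_speaker_from_dragent` and `add_bgp_speaker_to_dragent`, modeled on python-neutronclient's BGP dragent calls; verify the names and signatures against the installed version. `reschedule_bgp_speaker` and `FakeClient` are hypothetical names.

```python
def reschedule_bgp_speaker(client, agent_id, speaker_id):
    """Remove a BGP speaker from a dragent, then add it back.

    Assumption: `client` behaves like a python-neutronclient v2.0 Client
    with the BGP dragent scheduler methods used below.
    """
    client.remove_bgp_speaker_from_dragent(agent_id, speaker_id)
    client.add_bgp_speaker_to_dragent(agent_id, {"bgp_speaker_id": speaker_id})


class FakeClient:
    """Hypothetical stand-in that records calls instead of talking to Neutron."""

    def __init__(self):
        self.calls = []

    def remove_bgp_speaker_from_dragent(self, bgp_dragent, bgpspeaker_id):
        self.calls.append(("remove", bgp_dragent, bgpspeaker_id))

    def add_bgp_speaker_to_dragent(self, bgp_dragent, body):
        self.calls.append(("add", bgp_dragent, body["bgp_speaker_id"]))


client = FakeClient()
reschedule_bgp_speaker(client, "agent-1", "speaker-1")
print(client.calls)
```

Against a real deployment the fake would be replaced by an authenticated Neutron client, with the agent and speaker IDs looked up first.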

Frode Nordahl (fnordahl) wrote :

Proposed workaround in functional test here: https://github.com/openstack-charmers/zaza/pull/89

Reviewed: https://review.openstack.org/588225
Committed: https://git.openstack.org/cgit/openstack/charm-neutron-dynamic-routing/commit/?id=1e2c088e2f89521e1f01971cfd563851e443ab2f
Submitter: Zuul
Branch: master

commit 1e2c088e2f89521e1f01971cfd563851e443ab2f
Author: Frode Nordahl <email address hidden>
Date: Thu Aug 2 13:41:05 2018 +0200

    Do not use default handler for `amqp.connected` flag

    The charm provides its own handler for appearance of `amqp.connected`
    flag.

    Add workaround for intermittent test failures to functional test.

    Depends-On: https://github.com/openstack-charmers/zaza/pull/89
    Closes-Bug: #1784083
    Change-Id: Iefa1c4bf2a447b8c6126a417887512cc10a1b78e

Changed in charm-neutron-dynamic-routing:
status: In Progress → Fix Committed
David Ames (thedac) on 2018-11-20
Changed in charm-neutron-dynamic-routing:
milestone: none → 19.04
David Ames (thedac) on 2019-04-17
Changed in charm-neutron-dynamic-routing:
status: Fix Committed → Fix Released