[10.0 swarm] Some haproxy backends are in down state

Bug #1674669 reported by Sergey Novikov
This bug affects 1 person
Affects             Status     Importance  Assigned to     Milestone
Fuel for OpenStack  Confirmed  High        Sergey Novikov

Bug Description

Detailed bug description:

The issue was found by the following CI test runs:
https://product-ci.infra.mirantis.net/job/10.0.system_test.ubuntu.ha_neutron_destructive_vxlan/214/testReport/(root)/neutron_l3_migration_after_destroy_vxlan/

https://product-ci.infra.mirantis.net/job/10.0.system_test.ubuntu.ha_neutron_destructive_vxlan/214/testReport/(root)/neutron_l3_migration_after_reset_vxlan/

Steps to reproduce:
1. Create cluster. HA, Neutron with VxLAN segmentation
2. Add 3 nodes with controller roles
3. Add 2 nodes with compute roles
4. Add 1 node with cinder role
5. Deploy the cluster
6. Create an instance with a key pair
7. Manually reschedule router from primary controller to another one
8. Destroy/reset controller with l3-agent
9. Check l3-agent was rescheduled
10. Check network connectivity from instance via dhcp namespace
11. Run OSTF

Actual result: OSTF check fails
AssertionError: Failed 1 OSTF tests; should fail 0 tests. Names of failed tests:
  - Check state of haproxy backends on controllers (failure)
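For triage, the failed check can be approximated by hand. The sketch below is not the OSTF implementation; it only assumes that the CSV output of HAProxy's `show stat` command has been captured from a controller, and it lists the servers whose status starts with DOWN. The function name `down_backends` and the trimmed sample data (proxy `keystone-1`, servers `node-1` and `node-5`) are hypothetical and chosen for illustration only.

```python
import csv
import io


def down_backends(show_stat_csv):
    """Return (proxy, server, status) for every server reported as DOWN.

    `show_stat_csv` is the text printed by HAProxy's `show stat` command.
    """
    text = show_stat_csv.lstrip()
    # HAProxy prefixes the CSV header line with "# "; strip it so that
    # csv.DictReader sees plain column names.
    if text.startswith("# "):
        text = text[2:]
    down = []
    for row in csv.DictReader(io.StringIO(text)):
        # FRONTEND/BACKEND rows are per-proxy aggregates, not real servers.
        if row.get("svname") in ("FRONTEND", "BACKEND"):
            continue
        status = row.get("status") or ""
        # Status may carry a suffix such as "DOWN 1/2" during health checks.
        if status.startswith("DOWN"):
            down.append((row["pxname"], row["svname"], status))
    return down


# Hypothetical sample, trimmed to three columns (real output has many more).
SAMPLE = """# pxname,svname,status
keystone-1,FRONTEND,OPEN
keystone-1,node-1,UP
keystone-1,node-5,DOWN
keystone-1,BACKEND,UP
"""

if __name__ == "__main__":
    for proxy, server, status in down_backends(SAMPLE):
        print(proxy, server, status)
```

Filtering out the FRONTEND and BACKEND rows keeps the report focused on individual servers, which is usually what points at the affected node.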

Additional info: http://paste.openstack.org/show/603593/

Diagnostic snapshots:
https://drive.google.com/a/mirantis.com/file/d/0B29AewS4dQJobXVsYkNaRko0WDQ/view?usp=sharing
https://drive.google.com/a/mirantis.com/file/d/0B29AewS4dQJoQUpjVlZCcFhUeEU/view?usp=sharing

Tags: swarm-fail
Oleksiy Molchanov (omolchanov) wrote :

StaleDataError: UPDATE statement on table 'standardattributes' expected to update 1 row(s); 0 were matched.

Changed in fuel:
assignee: nobody → MOS Neutron (mos-neutron)
milestone: 10.1 → 10.x-updates
status: New → Confirmed
Oleg Bondarev (obondarev) wrote :

StaleDataError has nothing to do with the state of haproxy backends on controllers.
In fact, StaleDataError doesn't even affect functionality. It's a minor bug in Neutron.

Changed in fuel:
assignee: MOS Neutron (mos-neutron) → Oleksiy Molchanov (omolchanov)
Oleksiy Molchanov (omolchanov) wrote :

--------------------- >> end captured logging << ---------------------
2017-03-21 01:38:19 FAILURE Check network connectivity from instance via floating IP (fuel_health.tests.smoke.test_neutron_actions.TestNeutron.test_check_neutron_objects_creation) Time limit exceeded while waiting for removing floating IP to finish. Please refer to OpenStack logs for more details.
 File "/usr/lib/python2.7/site-packages/unittest2/case.py", line 67, in testPartExecutor
   yield
 File "/usr/lib/python2.7/site-packages/unittest2/case.py", line 601, in run
   testMethod()
 File "/usr/lib/python2.7/site-packages/fuel_health/tests/smoke/test_neutron_actions.py", line 128, in test_check_neutron_objects_creation
   "removing floating IP", server, floating_ip)
 File "/usr/lib/python2.7/site-packages/fuel_health/common/test_mixins.py", line 180, in verify
   " Please refer to OpenStack logs for more details.")
 File "/usr/lib/python2.7/site-packages/unittest2/case.py", line 666, in fail
   raise self.failureException(msg)
Step 11 failed: Time limit exceeded while waiting for removing floating IP to finish. Please refer to OpenStack logs for more details.

https://drive.google.com/a/mirantis.com/file/d/0B29AewS4dQJoQUpjVlZCcFhUeEU/view?usp=sharing

Changed in fuel:
assignee: Oleksiy Molchanov (omolchanov) → Fuel Sustaining (fuel-sustaining-team)
Changed in fuel:
milestone: 10.x-updates → 10.1
Oleksiy Molchanov (omolchanov) wrote :

It looks like this was an environment-related connectivity issue. All the resources were fine except RabbitMQ on one of the nodes:

   {error,{inconsistent_cluster,"Node 'rabbit@messaging-node-2' thinks it's clustered with node 'rabbit@messaging-node-5', but 'rabbit@messaging-node-5' disagrees"}}
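When scanning diagnostic snapshots for this condition, the two nodes involved can be pulled out of the error text. This is a minimal sketch that assumes the message always has the wording quoted above; the function name `parse_inconsistent_cluster` is hypothetical and not part of any RabbitMQ or Fuel tooling.

```python
import re

# Matches the wording of the inconsistent_cluster error quoted above.
# The backreference (?P=peer) requires the "disagrees" node to be the same
# node that the reporter believes it is clustered with.
_INCONSISTENT = re.compile(
    r"Node '(?P<reporter>[^']+)' thinks it's clustered with node "
    r"'(?P<peer>[^']+)', but '(?P=peer)' disagrees"
)


def parse_inconsistent_cluster(message):
    """Return (reporter, peer) node names, or None if the text does not match."""
    match = _INCONSISTENT.search(message)
    if match is None:
        return None
    return match.group("reporter"), match.group("peer")


if __name__ == "__main__":
    line = """{error,{inconsistent_cluster,"Node 'rabbit@messaging-node-2' thinks it's clustered with node 'rabbit@messaging-node-5', but 'rabbit@messaging-node-5' disagrees"}}"""
    print(parse_inconsistent_cluster(line))
```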

Oleksiy Molchanov (omolchanov) wrote :

In this case - https://product-ci.infra.mirantis.net/job/10.0.system_test.ubuntu.ha_neutron_destructive_vxlan/220/

node-5 went offline, so its backends were marked as DOWN.

Changed in fuel:
assignee: Fuel Sustaining (fuel-sustaining-team) → Fuel CI (fuel-ci)
Dmitry Kaigarodеsev (dkaiharodsev) wrote :

sorry folks, ci-team cannot help here

Changed in fuel:
assignee: Fuel CI (fuel-ci) → Sergey Novikov (snovikov)