Deploying a cluster in HA mode with bonding fails with puppet errors on controllers - (OperationalError) (2003, "Can't connect to MySQL server on '10.108.82.2' (113)") None None (HTTP 500)

Bug #1332534 reported by Andrey Sledzinskiy
This bug affects 1 person
Affects: Fuel for OpenStack
Status: Invalid
Importance: High
Assigned to: Aleksandr Didenko

Bug Description

http://jenkins-product.srt.mirantis.net:8080/view/0_master_swarm/job/master_fuelmain.system_test.centos.bonding/44/testReport/junit/%28root%29/deploy_bonding_ha_active_backup/deploy_bonding_ha_active_backup/

Test steps:
1. Create cluster - ha, Neutron Vlan, other values as default
2. Add 3 nodes with controller role
3. Add 2 nodes with compute role
4. Set up bonding - bond all interfaces except eth0 (the admin network interface) in active-backup mode
5. Deploy the cluster

Actual result: deployment failed with errors on slave-02 and slave-03 (node-4, node-5).
Errors on controllers in the puppet log:
(/Stage[main]/Swift::Keystone::Auth/Keystone_role[SwiftOperator]) Starting to evaluate the resource
2014-06-20T08:04:44.213106+01:00 debug: Executing '/usr/bin/keystone --os-token 4GZEFZgG --os-endpoint http://10.108.82.6:35357/v2.0/ role-list'
2014-06-20T08:04:47.606941+01:00 err: (/Stage[main]/Swift::Keystone::Auth/Keystone_role[SwiftOperator]) Could not evaluate: Execution of '/usr/bin/keystone --os-token 4GZEFZgG --os-endpoint http://10.108.82.6:35357/v2.0/ role-list' returned 1: An unexpected error prevented the server from fulfilling your request. (OperationalError) (2003, "Can't connect to MySQL server on '10.108.82.2' (113)") None None (HTTP 500)
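
The "(113)" at the end of the MySQL error is the OS errno, EHOSTUNREACH ("No route to host"), i.e. the controller could not reach the MySQL server at 10.108.82.2 at all. A quick check one could run on a failing controller to confirm (assuming the perror utility from the MySQL packages and nc are available):

[root@node-4 ~]# perror 113
OS error code 113:  No route to host
[root@node-4 ~]# nc -zv 10.108.82.2 3306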

Logs are attached

Andrey Sledzinskiy (asledzinskiy) wrote:
Changed in fuel:
assignee: Fuel Library Team (fuel-library) → Aleksandr Didenko (adidenko)
Aleksandr Didenko (adidenko) wrote:

I've reproduced this on a virtual environment. Here is how the OVS bond ended up configured on the 3 controllers in my case:
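
Per-slave state like the below can be dumped on each node with the standard OVS bond status command (the bond port name ovs-bond0 is taken from the fix further down):

[root@node-1 ~]# ovs-appctl bond/show ovs-bond0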

node-1 (10.108.4.2/24)
slave eth3: enabled
        active slave
        may_enable: true

node-2 (10.108.4.3/24)
slave eth4: enabled
        active slave
        may_enable: true

node-3 (10.108.4.4/24)
slave eth4: enabled
        active slave
        may_enable: true

node-2 and node-3 have eth4 as the active bond slave, so they can reach each other via the bond interface:

[root@node-2 ~]# ping 10.108.4.4
PING 10.108.4.4 (10.108.4.4) 56(84) bytes of data.
64 bytes from 10.108.4.4: icmp_seq=1 ttl=64 time=0.026 ms
64 bytes from 10.108.4.4: icmp_seq=2 ttl=64 time=0.039 ms

But they can't ping node-1 (10.108.4.2), because node-1 has a different interface (eth3) as its active bond slave:

[root@node-3 ~]# ping 10.108.4.2
PING 10.108.4.2 (10.108.4.2) 56(84) bytes of data.
From 10.108.4.4 icmp_seq=2 Destination Host Unreachable
From 10.108.4.4 icmp_seq=3 Destination Host Unreachable
From 10.108.4.4 icmp_seq=4 Destination Host Unreachable

Interconnection between node-1 and the other controllers starts working as soon as we set the active slave to the same interface on all controllers:

[root@node-1 ~]# ovs-appctl bond/set-active-slave ovs-bond0 eth4
done
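
Note: as far as I know, bond/set-active-slave only changes the running ovs-vswitchd state and is not persisted anywhere, so this is a diagnostic aid rather than a fix. The new active slave can be confirmed with:

[root@node-1 ~]# ovs-appctl bond/show ovs-bond0 | grep -B1 'active slave'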

[root@node-1 ~]# ping 10.108.4.4
PING 10.108.4.4 (10.108.4.4) 56(84) bytes of data.
64 bytes from 10.108.4.4: icmp_seq=1 ttl=64 time=1256 ms
64 bytes from 10.108.4.4: icmp_seq=2 ttl=64 time=256 ms
64 bytes from 10.108.4.4: icmp_seq=3 ttl=64 time=0.206 ms

[root@node-2 ~]# ping 10.108.4.2
PING 10.108.4.2 (10.108.4.2) 56(84) bytes of data.
64 bytes from 10.108.4.2: icmp_seq=1 ttl=64 time=0.472 ms
64 bytes from 10.108.4.2: icmp_seq=2 ttl=64 time=0.168 ms

I'm marking this bug as Invalid because it is an environment configuration issue. In order to test NIC bonding, the virtual or hardware environment must be configured accordingly (the switch side must support bonding on the appropriate interfaces).
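
For a virtual lab, one way to satisfy this is to attach the bond slave NICs of all nodes to one and the same Linux bridge on the virtualization host, so traffic passes no matter which slave each bond elects as active. A rough sketch (bridge and tap interface names here are hypothetical):

# attach every node's bond slave taps to a single bridge
ip link add br-bond-lab type bridge
ip link set br-bond-lab up
ip link set vnet2 master br-bond-lab   # node-1 eth3 tap
ip link set vnet3 master br-bond-lab   # node-1 eth4 tap
ip link set vnet6 master br-bond-lab   # node-2 eth3 tap
ip link set vnet7 master br-bond-lab   # node-2 eth4 tap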

Changed in fuel:
status: New → Invalid