2015-08-18 18:27:08 |
Ryan Beisner |
bug |
|
|
added bug |
2015-08-18 18:27:45 |
Ryan Beisner |
summary |
cluster-relation-changed Error: unable to connect to nodes ['rabbit@juju-beis0-machine-2']: nodedown |
vivid-kilo 3-node native cluster race: cluster-relation-changed Error: unable to connect to nodes ['rabbit@juju-X-machine-N']: nodedown |
|
2015-08-18 19:21:54 |
Nobuto Murata |
bug |
|
|
added subscriber Nobuto Murata |
2015-08-19 13:23:13 |
Ryan Beisner |
summary |
vivid-kilo 3-node native cluster race: cluster-relation-changed Error: unable to connect to nodes ['rabbit@juju-X-machine-N']: nodedown |
3-node native cluster doesn't always cluster race: cluster-relation-changed Error: unable to connect to nodes ['rabbit@juju-X-machine-N']: nodedown |
|
2015-08-19 13:34:11 |
Ryan Beisner |
branch linked |
|
lp:~1chb1n/charms/trusty/rabbitmq-server/next.amulet-fix-20-delay |
|
2015-08-19 13:37:44 |
Ryan Beisner |
description |
With a 3-node native cluster in Vivid-Kilo, in greater than 50% of all attempts, one of the rabbitmq-server units fails to cluster. When this happens, we end up with a 2-node cluster and a 1-node cluster, while juju status indicates happiness.
Test scenario: a basic 3-node rabbitmq-server native cluster, with nrpe as a subordinate to exercise nrpe-external-master functionality, and with cinder to exercise and inspect amqp relation data.
This same test scenario clusters and succeeds consistently with Trusty-Icehouse, Trusty-Juno and Trusty-Kilo. There are similar, possibly-related issues on Precise-Icehouse, though the failure is different, so I'll raise that separately.
DNS does not appear to play a role here, as all machines can resolve all other machines, forward and reverse, when this cluster failure is observed.
FYI, when the cluster does succeed on V-K, a separate, seemingly-unrelated bug is consistently hit (bug 1485722).
# VK amulet results
2015-08-18 17:49:03,637 test_300_rmq_config INFO: OK
2015-08-18 17:49:03,637 test_400_rmq_cluster_running_nodes DEBUG: Checking that all units are in cluster_status running nodes...
2015-08-18 17:49:08,219 get_unit_hostnames DEBUG: Unit host names: {'rabbitmq-server/2': 'juju-beis0-machine-4', 'rabbitmq-server/0': 'juju-beis0-machine-2', 'rabbitmq-server/1': 'juju-beis0-machine-3'}
2015-08-18 17:49:09,932 run_cmd_unit DEBUG: rabbitmq-server/0 `rabbitmqctl cluster_status` command returned 0 (OK)
2015-08-18 17:49:09,932 get_rmq_cluster_status DEBUG: rabbitmq-server/0 cluster_status:
Cluster status of node 'rabbit@juju-beis0-machine-2' ...
[{nodes,[{disc,['rabbit@juju-beis0-machine-2']}]},
{running_nodes,['rabbit@juju-beis0-machine-2']},
{cluster_name,<<"rabbit@juju-beis0-machine-2.openstacklocal">>},
{partitions,[]}]
2015-08-18 17:49:11,578 run_cmd_unit DEBUG: rabbitmq-server/1 `rabbitmqctl cluster_status` command returned 0 (OK)
2015-08-18 17:49:11,578 get_rmq_cluster_status DEBUG: rabbitmq-server/1 cluster_status:
Cluster status of node 'rabbit@juju-beis0-machine-3' ...
[{nodes,[{disc,['rabbit@juju-beis0-machine-3',
'rabbit@juju-beis0-machine-4']}]},
{running_nodes,['rabbit@juju-beis0-machine-4','rabbit@juju-beis0-machine-3']},
{cluster_name,<<"rabbit@juju-beis0-machine-4.openstacklocal">>},
{partitions,[]}]
2015-08-18 17:49:13,224 run_cmd_unit DEBUG: rabbitmq-server/2 `rabbitmqctl cluster_status` command returned 0 (OK)
2015-08-18 17:49:13,226 get_rmq_cluster_status DEBUG: rabbitmq-server/2 cluster_status:
Cluster status of node 'rabbit@juju-beis0-machine-4' ...
[{nodes,[{disc,['rabbit@juju-beis0-machine-3',
'rabbit@juju-beis0-machine-4']}]},
{running_nodes,['rabbit@juju-beis0-machine-3','rabbit@juju-beis0-machine-4']},
{cluster_name,<<"rabbit@juju-beis0-machine-4.openstacklocal">>},
{partitions,[]}]
Cluster member check failed on rabbitmq-server/0: rabbit@juju-beis0-machine-3 not in [u'rabbit@juju-beis0-machine-2']
Cluster member check failed on rabbitmq-server/0: rabbit@juju-beis0-machine-4 not in [u'rabbit@juju-beis0-machine-2']
Cluster member check failed on rabbitmq-server/1: rabbit@juju-beis0-machine-2 not in [u'rabbit@juju-beis0-machine-4', u'rabbit@juju-beis0-machine-3']
Cluster member check failed on rabbitmq-server/2: rabbit@juju-beis0-machine-2 not in [u'rabbit@juju-beis0-machine-3', u'rabbit@juju-beis0-machine-4']
# VK rabbitmq-server/2 unit failed to cluster:
2015-08-18 17:44:27 INFO juju-log cluster:1: Clustering with remote rabbit host (juju-beis0-machine-2).
2015-08-18 17:44:27 INFO cluster-relation-changed Stopping node 'rabbit@juju-beis0-machine-4' ...
2015-08-18 17:44:28 INFO cluster-relation-changed Clustering node 'rabbit@juju-beis0-machine-4' with 'rabbit@juju-beis0-machine-2' ...
2015-08-18 17:44:28 INFO cluster-relation-changed Error: unable to connect to nodes ['rabbit@juju-beis0-machine-2']: nodedown
2015-08-18 17:44:28 INFO cluster-relation-changed
2015-08-18 17:44:28 INFO cluster-relation-changed DIAGNOSTICS
2015-08-18 17:44:28 INFO cluster-relation-changed ===========
2015-08-18 17:44:28 INFO cluster-relation-changed
2015-08-18 17:44:28 INFO cluster-relation-changed attempted to contact: ['rabbit@juju-beis0-machine-2']
2015-08-18 17:44:28 INFO cluster-relation-changed
2015-08-18 17:44:28 INFO cluster-relation-changed rabbit@juju-beis0-machine-2:
2015-08-18 17:44:28 INFO cluster-relation-changed * connected to epmd (port 4369) on juju-beis0-machine-2
2015-08-18 17:44:28 INFO cluster-relation-changed * epmd reports node 'rabbit' running on port 25672
2015-08-18 17:44:28 INFO cluster-relation-changed * TCP connection succeeded but Erlang distribution failed
2015-08-18 17:44:28 INFO cluster-relation-changed * suggestion: hostname mismatch?
2015-08-18 17:44:28 INFO cluster-relation-changed * suggestion: is the cookie set correctly?
2015-08-18 17:44:28 INFO cluster-relation-changed
2015-08-18 17:44:28 INFO cluster-relation-changed current node details:
2015-08-18 17:44:28 INFO cluster-relation-changed - node name: 'rabbitmqctl-17379@juju-beis0-machine-4'
2015-08-18 17:44:28 INFO cluster-relation-changed - home dir: /var/lib/rabbitmq
2015-08-18 17:44:28 INFO cluster-relation-changed - cookie hash: j7UJuJx3ZktAni0tPfaRxw==
2015-08-18 17:44:28 INFO cluster-relation-changed
2015-08-18 17:44:28 INFO juju-log cluster:1: Failed to cluster with juju-beis0-machine-2.
# rabbitmq-server/2 (juju-beis0-machine-4)
Name resolution is fine. Attempted to cluster with rabbitmq-server/0 (juju-beis0-machine-2), failed. Clustered ok with rabbitmq-server/1 (juju-beis0-machine-3).
Full unit log: http://paste.ubuntu.com/12119674/
root@juju-beis0-machine-4:/var/log/juju# cat /etc/hostname
juju-beis0-machine-4
root@juju-beis0-machine-4:/var/log/juju# ip a | grep gl
inet 172.18.99.100/24 brd 172.18.99.255 scope global eth0
root@juju-beis0-machine-4:/var/log/juju# host juju-beis0-machine-2
juju-beis0-machine-2.openstacklocal has address 172.18.99.98
root@juju-beis0-machine-4:/var/log/juju# host juju-beis0-machine-3
juju-beis0-machine-3.openstacklocal has address 172.18.99.99
root@juju-beis0-machine-4:/var/log/juju# host juju-beis0-machine-4
juju-beis0-machine-4.openstacklocal has address 172.18.99.100
root@juju-beis0-machine-4:/var/log/juju# host 172.18.99.98
98.99.18.172.in-addr.arpa domain name pointer juju-beis0-machine-2.openstacklocal.
root@juju-beis0-machine-4:/var/log/juju# host 172.18.99.99
99.99.18.172.in-addr.arpa domain name pointer juju-beis0-machine-3.openstacklocal.
root@juju-beis0-machine-4:/var/log/juju# host 172.18.99.100
100.99.18.172.in-addr.arpa domain name pointer juju-beis0-machine-4.openstacklocal.
# rabbitmq-server/0 (juju-beis0-machine-2)
Name resolution is fine. cluster-relation-* hooks never fired.
Full unit log: http://paste.ubuntu.com/12119672/
root@juju-beis0-machine-2:/var/log/juju# cat /etc/hostname
juju-beis0-machine-2
root@juju-beis0-machine-2:/var/log/juju# ip a | grep gl
inet 172.18.99.98/24 brd 172.18.99.255 scope global eth0
root@juju-beis0-machine-2:/var/log/juju# host juju-beis0-machine-2
juju-beis0-machine-2.openstacklocal has address 172.18.99.98
root@juju-beis0-machine-2:/var/log/juju# host juju-beis0-machine-3
juju-beis0-machine-3.openstacklocal has address 172.18.99.99
root@juju-beis0-machine-2:/var/log/juju# host juju-beis0-machine-4
juju-beis0-machine-4.openstacklocal has address 172.18.99.100
root@juju-beis0-machine-2:/var/log/juju# host 172.18.99.98
98.99.18.172.in-addr.arpa domain name pointer juju-beis0-machine-2.openstacklocal.
root@juju-beis0-machine-2:/var/log/juju# host 172.18.99.99
99.99.18.172.in-addr.arpa domain name pointer juju-beis0-machine-3.openstacklocal.
root@juju-beis0-machine-2:/var/log/juju# host 172.18.99.100
100.99.18.172.in-addr.arpa domain name pointer juju-beis0-machine-4.openstacklocal.
# rabbitmq-server/1 (juju-beis0-machine-3)
Name resolution is fine. Clustered ok with rabbitmq-server/2 (juju-beis0-machine-4).
Full unit log: http://paste.ubuntu.com/12119695/
root@juju-beis0-machine-3:/var/log/juju# cat /etc/hostname
juju-beis0-machine-3
root@juju-beis0-machine-3:/var/log/juju# ip a | grep gl
inet 172.18.99.99/24 brd 172.18.99.255 scope global eth0
root@juju-beis0-machine-3:/var/log/juju# host juju-beis0-machine-2
juju-beis0-machine-2.openstacklocal has address 172.18.99.98
root@juju-beis0-machine-3:/var/log/juju# host juju-beis0-machine-3
juju-beis0-machine-3.openstacklocal has address 172.18.99.99
root@juju-beis0-machine-3:/var/log/juju# host juju-beis0-machine-4
juju-beis0-machine-4.openstacklocal has address 172.18.99.100
root@juju-beis0-machine-3:/var/log/juju# host 172.18.99.98
98.99.18.172.in-addr.arpa domain name pointer juju-beis0-machine-2.openstacklocal.
root@juju-beis0-machine-3:/var/log/juju# host 172.18.99.99
99.99.18.172.in-addr.arpa domain name pointer juju-beis0-machine-3.openstacklocal.
root@juju-beis0-machine-3:/var/log/juju# host 172.18.99.100
100.99.18.172.in-addr.arpa domain name pointer juju-beis0-machine-4.openstacklocal.
# VK juju stat
http://paste.ubuntu.com/12119730/
# rmq versions
ubuntu@beisner-bastion:~/bzr/next/rabbitmq-server/tests$ juju run --service rabbitmq-server "apt-cache policy rabbitmq-server"
- MachineId: "2"
Stdout: |
rabbitmq-server:
Installed: 3.4.3-2
Candidate: 3.4.3-2
Version table:
*** 3.4.3-2 0
500 http://nova.clouds.archive.ubuntu.com/ubuntu/ vivid/main amd64 Packages
100 /var/lib/dpkg/status
UnitId: rabbitmq-server/0
- MachineId: "3"
Stdout: |
rabbitmq-server:
Installed: 3.4.3-2
Candidate: 3.4.3-2
Version table:
*** 3.4.3-2 0
500 http://nova.clouds.archive.ubuntu.com/ubuntu/ vivid/main amd64 Packages
100 /var/lib/dpkg/status
UnitId: rabbitmq-server/1
- MachineId: "4"
Stdout: |
rabbitmq-server:
Installed: 3.4.3-2
Candidate: 3.4.3-2
Version table:
*** 3.4.3-2 0
500 http://nova.clouds.archive.ubuntu.com/ubuntu/ vivid/main amd64 Packages
100 /var/lib/dpkg/status
UnitId: rabbitmq-server/2 |
With a 3-node native cluster in Vivid-Kilo, Trusty-Juno, and Precise-Icehouse, in greater than 50% of all attempts, one of the rabbitmq-server units fails to cluster. When this happens, we end up with a 2-node cluster and a 1-node cluster, while juju status indicates happiness.
Test scenario: a basic 3-node rabbitmq-server native cluster, with nrpe as a subordinate to exercise nrpe-external-master functionality, and with cinder to exercise and inspect amqp relation data.
This same test scenario clusters and succeeds consistently with Trusty-Icehouse, Trusty-Juno and Trusty-Kilo. There are similar, possibly-related issues on Precise-Icehouse, though the failure is different, so I'll raise that separately.
DNS does not appear to play a role here, as all machines can resolve all other machines, forward and reverse, when this cluster failure is observed.
FYI, when the cluster does succeed on V-K, a separate, seemingly-unrelated bug is consistently hit (bug 1485722).
# VK amulet results
2015-08-18 17:49:03,637 test_300_rmq_config INFO: OK
2015-08-18 17:49:03,637 test_400_rmq_cluster_running_nodes DEBUG: Checking that all units are in cluster_status running nodes...
2015-08-18 17:49:08,219 get_unit_hostnames DEBUG: Unit host names: {'rabbitmq-server/2': 'juju-beis0-machine-4', 'rabbitmq-server/0': 'juju-beis0-machine-2', 'rabbitmq-server/1': 'juju-beis0-machine-3'}
2015-08-18 17:49:09,932 run_cmd_unit DEBUG: rabbitmq-server/0 `rabbitmqctl cluster_status` command returned 0 (OK)
2015-08-18 17:49:09,932 get_rmq_cluster_status DEBUG: rabbitmq-server/0 cluster_status:
Cluster status of node 'rabbit@juju-beis0-machine-2' ...
[{nodes,[{disc,['rabbit@juju-beis0-machine-2']}]},
{running_nodes,['rabbit@juju-beis0-machine-2']},
{cluster_name,<<"rabbit@juju-beis0-machine-2.openstacklocal">>},
{partitions,[]}]
2015-08-18 17:49:11,578 run_cmd_unit DEBUG: rabbitmq-server/1 `rabbitmqctl cluster_status` command returned 0 (OK)
2015-08-18 17:49:11,578 get_rmq_cluster_status DEBUG: rabbitmq-server/1 cluster_status:
Cluster status of node 'rabbit@juju-beis0-machine-3' ...
[{nodes,[{disc,['rabbit@juju-beis0-machine-3',
'rabbit@juju-beis0-machine-4']}]},
{running_nodes,['rabbit@juju-beis0-machine-4','rabbit@juju-beis0-machine-3']},
{cluster_name,<<"rabbit@juju-beis0-machine-4.openstacklocal">>},
{partitions,[]}]
2015-08-18 17:49:13,224 run_cmd_unit DEBUG: rabbitmq-server/2 `rabbitmqctl cluster_status` command returned 0 (OK)
2015-08-18 17:49:13,226 get_rmq_cluster_status DEBUG: rabbitmq-server/2 cluster_status:
Cluster status of node 'rabbit@juju-beis0-machine-4' ...
[{nodes,[{disc,['rabbit@juju-beis0-machine-3',
'rabbit@juju-beis0-machine-4']}]},
{running_nodes,['rabbit@juju-beis0-machine-3','rabbit@juju-beis0-machine-4']},
{cluster_name,<<"rabbit@juju-beis0-machine-4.openstacklocal">>},
{partitions,[]}]
Cluster member check failed on rabbitmq-server/0: rabbit@juju-beis0-machine-3 not in [u'rabbit@juju-beis0-machine-2']
Cluster member check failed on rabbitmq-server/0: rabbit@juju-beis0-machine-4 not in [u'rabbit@juju-beis0-machine-2']
Cluster member check failed on rabbitmq-server/1: rabbit@juju-beis0-machine-2 not in [u'rabbit@juju-beis0-machine-4', u'rabbit@juju-beis0-machine-3']
Cluster member check failed on rabbitmq-server/2: rabbit@juju-beis0-machine-2 not in [u'rabbit@juju-beis0-machine-3', u'rabbit@juju-beis0-machine-4']
# VK rabbitmq-server/2 unit failed to cluster:
2015-08-18 17:44:27 INFO juju-log cluster:1: Clustering with remote rabbit host (juju-beis0-machine-2).
2015-08-18 17:44:27 INFO cluster-relation-changed Stopping node 'rabbit@juju-beis0-machine-4' ...
2015-08-18 17:44:28 INFO cluster-relation-changed Clustering node 'rabbit@juju-beis0-machine-4' with 'rabbit@juju-beis0-machine-2' ...
2015-08-18 17:44:28 INFO cluster-relation-changed Error: unable to connect to nodes ['rabbit@juju-beis0-machine-2']: nodedown
2015-08-18 17:44:28 INFO cluster-relation-changed
2015-08-18 17:44:28 INFO cluster-relation-changed DIAGNOSTICS
2015-08-18 17:44:28 INFO cluster-relation-changed ===========
2015-08-18 17:44:28 INFO cluster-relation-changed
2015-08-18 17:44:28 INFO cluster-relation-changed attempted to contact: ['rabbit@juju-beis0-machine-2']
2015-08-18 17:44:28 INFO cluster-relation-changed
2015-08-18 17:44:28 INFO cluster-relation-changed rabbit@juju-beis0-machine-2:
2015-08-18 17:44:28 INFO cluster-relation-changed * connected to epmd (port 4369) on juju-beis0-machine-2
2015-08-18 17:44:28 INFO cluster-relation-changed * epmd reports node 'rabbit' running on port 25672
2015-08-18 17:44:28 INFO cluster-relation-changed * TCP connection succeeded but Erlang distribution failed
2015-08-18 17:44:28 INFO cluster-relation-changed * suggestion: hostname mismatch?
2015-08-18 17:44:28 INFO cluster-relation-changed * suggestion: is the cookie set correctly?
2015-08-18 17:44:28 INFO cluster-relation-changed
2015-08-18 17:44:28 INFO cluster-relation-changed current node details:
2015-08-18 17:44:28 INFO cluster-relation-changed - node name: 'rabbitmqctl-17379@juju-beis0-machine-4'
2015-08-18 17:44:28 INFO cluster-relation-changed - home dir: /var/lib/rabbitmq
2015-08-18 17:44:28 INFO cluster-relation-changed - cookie hash: j7UJuJx3ZktAni0tPfaRxw==
2015-08-18 17:44:28 INFO cluster-relation-changed
2015-08-18 17:44:28 INFO juju-log cluster:1: Failed to cluster with juju-beis0-machine-2.
# rabbitmq-server/2 (juju-beis0-machine-4)
Name resolution is fine. Attempted to cluster with rabbitmq-server/0 (juju-beis0-machine-2), failed. Clustered ok with rabbitmq-server/1 (juju-beis0-machine-3).
Full unit log: http://paste.ubuntu.com/12119674/
root@juju-beis0-machine-4:/var/log/juju# cat /etc/hostname
juju-beis0-machine-4
root@juju-beis0-machine-4:/var/log/juju# ip a | grep gl
inet 172.18.99.100/24 brd 172.18.99.255 scope global eth0
root@juju-beis0-machine-4:/var/log/juju# host juju-beis0-machine-2
juju-beis0-machine-2.openstacklocal has address 172.18.99.98
root@juju-beis0-machine-4:/var/log/juju# host juju-beis0-machine-3
juju-beis0-machine-3.openstacklocal has address 172.18.99.99
root@juju-beis0-machine-4:/var/log/juju# host juju-beis0-machine-4
juju-beis0-machine-4.openstacklocal has address 172.18.99.100
root@juju-beis0-machine-4:/var/log/juju# host 172.18.99.98
98.99.18.172.in-addr.arpa domain name pointer juju-beis0-machine-2.openstacklocal.
root@juju-beis0-machine-4:/var/log/juju# host 172.18.99.99
99.99.18.172.in-addr.arpa domain name pointer juju-beis0-machine-3.openstacklocal.
root@juju-beis0-machine-4:/var/log/juju# host 172.18.99.100
100.99.18.172.in-addr.arpa domain name pointer juju-beis0-machine-4.openstacklocal.
# rabbitmq-server/0 (juju-beis0-machine-2)
Name resolution is fine. cluster-relation-* hooks never fired.
Full unit log: http://paste.ubuntu.com/12119672/
root@juju-beis0-machine-2:/var/log/juju# cat /etc/hostname
juju-beis0-machine-2
root@juju-beis0-machine-2:/var/log/juju# ip a | grep gl
inet 172.18.99.98/24 brd 172.18.99.255 scope global eth0
root@juju-beis0-machine-2:/var/log/juju# host juju-beis0-machine-2
juju-beis0-machine-2.openstacklocal has address 172.18.99.98
root@juju-beis0-machine-2:/var/log/juju# host juju-beis0-machine-3
juju-beis0-machine-3.openstacklocal has address 172.18.99.99
root@juju-beis0-machine-2:/var/log/juju# host juju-beis0-machine-4
juju-beis0-machine-4.openstacklocal has address 172.18.99.100
root@juju-beis0-machine-2:/var/log/juju# host 172.18.99.98
98.99.18.172.in-addr.arpa domain name pointer juju-beis0-machine-2.openstacklocal.
root@juju-beis0-machine-2:/var/log/juju# host 172.18.99.99
99.99.18.172.in-addr.arpa domain name pointer juju-beis0-machine-3.openstacklocal.
root@juju-beis0-machine-2:/var/log/juju# host 172.18.99.100
100.99.18.172.in-addr.arpa domain name pointer juju-beis0-machine-4.openstacklocal.
# rabbitmq-server/1 (juju-beis0-machine-3)
Name resolution is fine. Clustered ok with rabbitmq-server/2 (juju-beis0-machine-4).
Full unit log: http://paste.ubuntu.com/12119695/
root@juju-beis0-machine-3:/var/log/juju# cat /etc/hostname
juju-beis0-machine-3
root@juju-beis0-machine-3:/var/log/juju# ip a | grep gl
inet 172.18.99.99/24 brd 172.18.99.255 scope global eth0
root@juju-beis0-machine-3:/var/log/juju# host juju-beis0-machine-2
juju-beis0-machine-2.openstacklocal has address 172.18.99.98
root@juju-beis0-machine-3:/var/log/juju# host juju-beis0-machine-3
juju-beis0-machine-3.openstacklocal has address 172.18.99.99
root@juju-beis0-machine-3:/var/log/juju# host juju-beis0-machine-4
juju-beis0-machine-4.openstacklocal has address 172.18.99.100
root@juju-beis0-machine-3:/var/log/juju# host 172.18.99.98
98.99.18.172.in-addr.arpa domain name pointer juju-beis0-machine-2.openstacklocal.
root@juju-beis0-machine-3:/var/log/juju# host 172.18.99.99
99.99.18.172.in-addr.arpa domain name pointer juju-beis0-machine-3.openstacklocal.
root@juju-beis0-machine-3:/var/log/juju# host 172.18.99.100
100.99.18.172.in-addr.arpa domain name pointer juju-beis0-machine-4.openstacklocal.
# VK juju stat
http://paste.ubuntu.com/12119730/
# rmq versions
ubuntu@beisner-bastion:~/bzr/next/rabbitmq-server/tests$ juju run --service rabbitmq-server "apt-cache policy rabbitmq-server"
- MachineId: "2"
Stdout: |
rabbitmq-server:
Installed: 3.4.3-2
Candidate: 3.4.3-2
Version table:
*** 3.4.3-2 0
500 http://nova.clouds.archive.ubuntu.com/ubuntu/ vivid/main amd64 Packages
100 /var/lib/dpkg/status
UnitId: rabbitmq-server/0
- MachineId: "3"
Stdout: |
rabbitmq-server:
Installed: 3.4.3-2
Candidate: 3.4.3-2
Version table:
*** 3.4.3-2 0
500 http://nova.clouds.archive.ubuntu.com/ubuntu/ vivid/main amd64 Packages
100 /var/lib/dpkg/status
UnitId: rabbitmq-server/1
- MachineId: "4"
Stdout: |
rabbitmq-server:
Installed: 3.4.3-2
Candidate: 3.4.3-2
Version table:
*** 3.4.3-2 0
500 http://nova.clouds.archive.ubuntu.com/ubuntu/ vivid/main amd64 Packages
100 /var/lib/dpkg/status
UnitId: rabbitmq-server/2 |
|
2015-08-19 13:38:08 |
Ryan Beisner |
description |
With a 3-node native cluster in Vivid-Kilo, Trusty-Juno, and Precise-Icehouse, in greater than 50% of all attempts, one of the rabbitmq-server units fails to cluster. When this happens, we end up with a 2-node cluster and a 1-node cluster, while juju status indicates happiness.
Test scenario: a basic 3-node rabbitmq-server native cluster, with nrpe as a subordinate to exercise nrpe-external-master functionality, and with cinder to exercise and inspect amqp relation data.
This same test scenario clusters and succeeds consistently with Trusty-Icehouse, Trusty-Juno and Trusty-Kilo. There are similar, possibly-related issues on Precise-Icehouse, though the failure is different, so I'll raise that separately.
DNS does not appear to play a role here, as all machines can resolve all other machines, forward and reverse, when this cluster failure is observed.
FYI, when the cluster does succeed on V-K, a separate, seemingly-unrelated bug is consistently hit (bug 1485722).
# VK amulet results
2015-08-18 17:49:03,637 test_300_rmq_config INFO: OK
2015-08-18 17:49:03,637 test_400_rmq_cluster_running_nodes DEBUG: Checking that all units are in cluster_status running nodes...
2015-08-18 17:49:08,219 get_unit_hostnames DEBUG: Unit host names: {'rabbitmq-server/2': 'juju-beis0-machine-4', 'rabbitmq-server/0': 'juju-beis0-machine-2', 'rabbitmq-server/1': 'juju-beis0-machine-3'}
2015-08-18 17:49:09,932 run_cmd_unit DEBUG: rabbitmq-server/0 `rabbitmqctl cluster_status` command returned 0 (OK)
2015-08-18 17:49:09,932 get_rmq_cluster_status DEBUG: rabbitmq-server/0 cluster_status:
Cluster status of node 'rabbit@juju-beis0-machine-2' ...
[{nodes,[{disc,['rabbit@juju-beis0-machine-2']}]},
{running_nodes,['rabbit@juju-beis0-machine-2']},
{cluster_name,<<"rabbit@juju-beis0-machine-2.openstacklocal">>},
{partitions,[]}]
2015-08-18 17:49:11,578 run_cmd_unit DEBUG: rabbitmq-server/1 `rabbitmqctl cluster_status` command returned 0 (OK)
2015-08-18 17:49:11,578 get_rmq_cluster_status DEBUG: rabbitmq-server/1 cluster_status:
Cluster status of node 'rabbit@juju-beis0-machine-3' ...
[{nodes,[{disc,['rabbit@juju-beis0-machine-3',
'rabbit@juju-beis0-machine-4']}]},
{running_nodes,['rabbit@juju-beis0-machine-4','rabbit@juju-beis0-machine-3']},
{cluster_name,<<"rabbit@juju-beis0-machine-4.openstacklocal">>},
{partitions,[]}]
2015-08-18 17:49:13,224 run_cmd_unit DEBUG: rabbitmq-server/2 `rabbitmqctl cluster_status` command returned 0 (OK)
2015-08-18 17:49:13,226 get_rmq_cluster_status DEBUG: rabbitmq-server/2 cluster_status:
Cluster status of node 'rabbit@juju-beis0-machine-4' ...
[{nodes,[{disc,['rabbit@juju-beis0-machine-3',
'rabbit@juju-beis0-machine-4']}]},
{running_nodes,['rabbit@juju-beis0-machine-3','rabbit@juju-beis0-machine-4']},
{cluster_name,<<"rabbit@juju-beis0-machine-4.openstacklocal">>},
{partitions,[]}]
Cluster member check failed on rabbitmq-server/0: rabbit@juju-beis0-machine-3 not in [u'rabbit@juju-beis0-machine-2']
Cluster member check failed on rabbitmq-server/0: rabbit@juju-beis0-machine-4 not in [u'rabbit@juju-beis0-machine-2']
Cluster member check failed on rabbitmq-server/1: rabbit@juju-beis0-machine-2 not in [u'rabbit@juju-beis0-machine-4', u'rabbit@juju-beis0-machine-3']
Cluster member check failed on rabbitmq-server/2: rabbit@juju-beis0-machine-2 not in [u'rabbit@juju-beis0-machine-3', u'rabbit@juju-beis0-machine-4']
# VK rabbitmq-server/2 unit failed to cluster:
2015-08-18 17:44:27 INFO juju-log cluster:1: Clustering with remote rabbit host (juju-beis0-machine-2).
2015-08-18 17:44:27 INFO cluster-relation-changed Stopping node 'rabbit@juju-beis0-machine-4' ...
2015-08-18 17:44:28 INFO cluster-relation-changed Clustering node 'rabbit@juju-beis0-machine-4' with 'rabbit@juju-beis0-machine-2' ...
2015-08-18 17:44:28 INFO cluster-relation-changed Error: unable to connect to nodes ['rabbit@juju-beis0-machine-2']: nodedown
2015-08-18 17:44:28 INFO cluster-relation-changed
2015-08-18 17:44:28 INFO cluster-relation-changed DIAGNOSTICS
2015-08-18 17:44:28 INFO cluster-relation-changed ===========
2015-08-18 17:44:28 INFO cluster-relation-changed
2015-08-18 17:44:28 INFO cluster-relation-changed attempted to contact: ['rabbit@juju-beis0-machine-2']
2015-08-18 17:44:28 INFO cluster-relation-changed
2015-08-18 17:44:28 INFO cluster-relation-changed rabbit@juju-beis0-machine-2:
2015-08-18 17:44:28 INFO cluster-relation-changed * connected to epmd (port 4369) on juju-beis0-machine-2
2015-08-18 17:44:28 INFO cluster-relation-changed * epmd reports node 'rabbit' running on port 25672
2015-08-18 17:44:28 INFO cluster-relation-changed * TCP connection succeeded but Erlang distribution failed
2015-08-18 17:44:28 INFO cluster-relation-changed * suggestion: hostname mismatch?
2015-08-18 17:44:28 INFO cluster-relation-changed * suggestion: is the cookie set correctly?
2015-08-18 17:44:28 INFO cluster-relation-changed
2015-08-18 17:44:28 INFO cluster-relation-changed current node details:
2015-08-18 17:44:28 INFO cluster-relation-changed - node name: 'rabbitmqctl-17379@juju-beis0-machine-4'
2015-08-18 17:44:28 INFO cluster-relation-changed - home dir: /var/lib/rabbitmq
2015-08-18 17:44:28 INFO cluster-relation-changed - cookie hash: j7UJuJx3ZktAni0tPfaRxw==
2015-08-18 17:44:28 INFO cluster-relation-changed
2015-08-18 17:44:28 INFO juju-log cluster:1: Failed to cluster with juju-beis0-machine-2.
# rabbitmq-server/2 (juju-beis0-machine-4)
Name resolution is fine. Attempted to cluster with rabbitmq-server/0 (juju-beis0-machine-2), failed. Clustered ok with rabbitmq-server/1 (juju-beis0-machine-3).
Full unit log: http://paste.ubuntu.com/12119674/
root@juju-beis0-machine-4:/var/log/juju# cat /etc/hostname
juju-beis0-machine-4
root@juju-beis0-machine-4:/var/log/juju# ip a | grep gl
inet 172.18.99.100/24 brd 172.18.99.255 scope global eth0
root@juju-beis0-machine-4:/var/log/juju# host juju-beis0-machine-2
juju-beis0-machine-2.openstacklocal has address 172.18.99.98
root@juju-beis0-machine-4:/var/log/juju# host juju-beis0-machine-3
juju-beis0-machine-3.openstacklocal has address 172.18.99.99
root@juju-beis0-machine-4:/var/log/juju# host juju-beis0-machine-4
juju-beis0-machine-4.openstacklocal has address 172.18.99.100
root@juju-beis0-machine-4:/var/log/juju# host 172.18.99.98
98.99.18.172.in-addr.arpa domain name pointer juju-beis0-machine-2.openstacklocal.
root@juju-beis0-machine-4:/var/log/juju# host 172.18.99.99
99.99.18.172.in-addr.arpa domain name pointer juju-beis0-machine-3.openstacklocal.
root@juju-beis0-machine-4:/var/log/juju# host 172.18.99.100
100.99.18.172.in-addr.arpa domain name pointer juju-beis0-machine-4.openstacklocal.
# rabbitmq-server/0 (juju-beis0-machine-2)
Name resolution is fine. cluster-relation-* hooks never fired.
Full unit log: http://paste.ubuntu.com/12119672/
root@juju-beis0-machine-2:/var/log/juju# cat /etc/hostname
juju-beis0-machine-2
root@juju-beis0-machine-2:/var/log/juju# ip a | grep gl
inet 172.18.99.98/24 brd 172.18.99.255 scope global eth0
root@juju-beis0-machine-2:/var/log/juju# host juju-beis0-machine-2
juju-beis0-machine-2.openstacklocal has address 172.18.99.98
root@juju-beis0-machine-2:/var/log/juju# host juju-beis0-machine-3
juju-beis0-machine-3.openstacklocal has address 172.18.99.99
root@juju-beis0-machine-2:/var/log/juju# host juju-beis0-machine-4
juju-beis0-machine-4.openstacklocal has address 172.18.99.100
root@juju-beis0-machine-2:/var/log/juju# host 172.18.99.98
98.99.18.172.in-addr.arpa domain name pointer juju-beis0-machine-2.openstacklocal.
root@juju-beis0-machine-2:/var/log/juju# host 172.18.99.99
99.99.18.172.in-addr.arpa domain name pointer juju-beis0-machine-3.openstacklocal.
root@juju-beis0-machine-2:/var/log/juju# host 172.18.99.100
100.99.18.172.in-addr.arpa domain name pointer juju-beis0-machine-4.openstacklocal.
# rabbitmq-server/1 (juju-beis0-machine-3)
Name resolution is fine. Clustered ok with rabbitmq-server/2 (juju-beis0-machine-4).
Full unit log: http://paste.ubuntu.com/12119695/
root@juju-beis0-machine-3:/var/log/juju# cat /etc/hostname
juju-beis0-machine-3
root@juju-beis0-machine-3:/var/log/juju# ip a | grep gl
inet 172.18.99.99/24 brd 172.18.99.255 scope global eth0
root@juju-beis0-machine-3:/var/log/juju# host juju-beis0-machine-2
juju-beis0-machine-2.openstacklocal has address 172.18.99.98
root@juju-beis0-machine-3:/var/log/juju# host juju-beis0-machine-3
juju-beis0-machine-3.openstacklocal has address 172.18.99.99
root@juju-beis0-machine-3:/var/log/juju# host juju-beis0-machine-4
juju-beis0-machine-4.openstacklocal has address 172.18.99.100
root@juju-beis0-machine-3:/var/log/juju# host 172.18.99.98
98.99.18.172.in-addr.arpa domain name pointer juju-beis0-machine-2.openstacklocal.
root@juju-beis0-machine-3:/var/log/juju# host 172.18.99.99
99.99.18.172.in-addr.arpa domain name pointer juju-beis0-machine-3.openstacklocal.
root@juju-beis0-machine-3:/var/log/juju# host 172.18.99.100
100.99.18.172.in-addr.arpa domain name pointer juju-beis0-machine-4.openstacklocal.
# VK juju stat
http://paste.ubuntu.com/12119730/
# rmq versions
ubuntu@beisner-bastion:~/bzr/next/rabbitmq-server/tests$ juju run --service rabbitmq-server "apt-cache policy rabbitmq-server"
- MachineId: "2"
Stdout: |
rabbitmq-server:
Installed: 3.4.3-2
Candidate: 3.4.3-2
Version table:
*** 3.4.3-2 0
500 http://nova.clouds.archive.ubuntu.com/ubuntu/ vivid/main amd64 Packages
100 /var/lib/dpkg/status
UnitId: rabbitmq-server/0
- MachineId: "3"
Stdout: |
rabbitmq-server:
Installed: 3.4.3-2
Candidate: 3.4.3-2
Version table:
*** 3.4.3-2 0
500 http://nova.clouds.archive.ubuntu.com/ubuntu/ vivid/main amd64 Packages
100 /var/lib/dpkg/status
UnitId: rabbitmq-server/1
- MachineId: "4"
Stdout: |
rabbitmq-server:
Installed: 3.4.3-2
Candidate: 3.4.3-2
Version table:
*** 3.4.3-2 0
500 http://nova.clouds.archive.ubuntu.com/ubuntu/ vivid/main amd64 Packages
100 /var/lib/dpkg/status
UnitId: rabbitmq-server/2 |
With a 3-node native cluster in Vivid-Kilo, Trusty-Juno, and Precise-Icehouse, in greater than 50% of all attempts, one of the rabbitmq-server units fails to cluster. When this happens, we end up with a 2-node cluster and a 1-node cluster, while juju status indicates happiness.
Test scenario: a basic 3-node rabbitmq-server native cluster, with nrpe as a subordinate to exercise nrpe-external-master functionality, and with cinder to exercise and inspect amqp relation data.
This same test scenario clusters and succeeds consistently with Trusty-Icehouse and Trusty-Kilo. There are similar, possibly-related issues on Precise-Icehouse, though the failure is different, so I'll raise that separately.
DNS does not appear to play a role here, as all machines can resolve all other machines, forward and reverse, when this cluster failure is observed.
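For anyone wanting to repeat the forward and reverse resolution check without running `host` by hand on every machine, a rough Python sketch follows. The function name `check_dns` and its injectable `forward`/`reverse` parameters are made up for illustration and are not part of the charm or its tests; it only assumes that the short hostname should match the first label of the reverse-lookup result.

```python
import socket


def check_dns(hostnames, forward=socket.gethostbyname,
              reverse=socket.gethostbyaddr):
    """Return a list of problem strings; an empty list means every name
    resolved forward and the reverse lookup pointed back at the same host.

    `forward` and `reverse` default to the stdlib resolvers and can be
    swapped out (hypothetical hook, handy for offline testing).
    """
    problems = []
    for name in hostnames:
        try:
            addr = forward(name)
        except OSError as exc:  # socket.gaierror is a subclass of OSError
            problems.append("forward lookup failed for %s: %s" % (name, exc))
            continue
        try:
            reverse_name = reverse(addr)[0]
        except OSError as exc:  # socket.herror is a subclass of OSError
            problems.append("reverse lookup failed for %s: %s" % (addr, exc))
            continue
        # Compare short names only, so juju-X-machine-N matches
        # juju-X-machine-N.openstacklocal.
        if reverse_name.split(".")[0] != name.split(".")[0]:
            problems.append("%s -> %s -> %s (mismatch)"
                            % (name, addr, reverse_name))
    return problems
```

Run on each unit with the three machine hostnames; an empty result on all three would match the observation above that name resolution is fine.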
FYI, when the cluster does succeed on V-K, a separate, seemingly-unrelated bug is consistently hit (bug 1485722).
# VK amulet results
2015-08-18 17:49:03,637 test_300_rmq_config INFO: OK
2015-08-18 17:49:03,637 test_400_rmq_cluster_running_nodes DEBUG: Checking that all units are in cluster_status running nodes...
2015-08-18 17:49:08,219 get_unit_hostnames DEBUG: Unit host names: {'rabbitmq-server/2': 'juju-beis0-machine-4', 'rabbitmq-server/0': 'juju-beis0-machine-2', 'rabbitmq-server/1': 'juju-beis0-machine-3'}
2015-08-18 17:49:09,932 run_cmd_unit DEBUG: rabbitmq-server/0 `rabbitmqctl cluster_status` command returned 0 (OK)
2015-08-18 17:49:09,932 get_rmq_cluster_status DEBUG: rabbitmq-server/0 cluster_status:
Cluster status of node 'rabbit@juju-beis0-machine-2' ...
[{nodes,[{disc,['rabbit@juju-beis0-machine-2']}]},
{running_nodes,['rabbit@juju-beis0-machine-2']},
{cluster_name,<<"rabbit@juju-beis0-machine-2.openstacklocal">>},
{partitions,[]}]
2015-08-18 17:49:11,578 run_cmd_unit DEBUG: rabbitmq-server/1 `rabbitmqctl cluster_status` command returned 0 (OK)
2015-08-18 17:49:11,578 get_rmq_cluster_status DEBUG: rabbitmq-server/1 cluster_status:
Cluster status of node 'rabbit@juju-beis0-machine-3' ...
[{nodes,[{disc,['rabbit@juju-beis0-machine-3',
'rabbit@juju-beis0-machine-4']}]},
{running_nodes,['rabbit@juju-beis0-machine-4','rabbit@juju-beis0-machine-3']},
{cluster_name,<<"rabbit@juju-beis0-machine-4.openstacklocal">>},
{partitions,[]}]
2015-08-18 17:49:13,224 run_cmd_unit DEBUG: rabbitmq-server/2 `rabbitmqctl cluster_status` command returned 0 (OK)
2015-08-18 17:49:13,226 get_rmq_cluster_status DEBUG: rabbitmq-server/2 cluster_status:
Cluster status of node 'rabbit@juju-beis0-machine-4' ...
[{nodes,[{disc,['rabbit@juju-beis0-machine-3',
'rabbit@juju-beis0-machine-4']}]},
{running_nodes,['rabbit@juju-beis0-machine-3','rabbit@juju-beis0-machine-4']},
{cluster_name,<<"rabbit@juju-beis0-machine-4.openstacklocal">>},
{partitions,[]}]
Cluster member check failed on rabbitmq-server/0: rabbit@juju-beis0-machine-3 not in [u'rabbit@juju-beis0-machine-2']
Cluster member check failed on rabbitmq-server/0: rabbit@juju-beis0-machine-4 not in [u'rabbit@juju-beis0-machine-2']
Cluster member check failed on rabbitmq-server/1: rabbit@juju-beis0-machine-2 not in [u'rabbit@juju-beis0-machine-4', u'rabbit@juju-beis0-machine-3']
Cluster member check failed on rabbitmq-server/2: rabbit@juju-beis0-machine-2 not in [u'rabbit@juju-beis0-machine-3', u'rabbit@juju-beis0-machine-4']
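The membership check reported above can be approximated with a short sketch. This is not the charm's amulet test code: the function names `parse_running_nodes` and `check_cluster_members` are hypothetical, and the parsing assumes `rabbitmqctl cluster_status` prints a `{running_nodes,[...]}` term in the form shown in this log.

```python
import re


def parse_running_nodes(cluster_status_output):
    """Extract node names from the {running_nodes,[...]} term (hypothetical helper)."""
    match = re.search(r"\{running_nodes,\[(.*?)\]\}", cluster_status_output, re.DOTALL)
    if match is None:
        return []
    # Node names are single-quoted Erlang atoms, e.g. 'rabbit@host'.
    return re.findall(r"'([^']+)'", match.group(1))


def check_cluster_members(unit_hostnames, unit_outputs):
    """Return one error string per expected node missing from a unit's running_nodes.

    unit_hostnames: {'rabbitmq-server/0': 'juju-beis0-machine-2', ...}
    unit_outputs:   {'rabbitmq-server/0': '<cluster_status text>', ...}
    """
    expected = {"rabbit@" + host for host in unit_hostnames.values()}
    errors = []
    for unit in sorted(unit_outputs):
        running = parse_running_nodes(unit_outputs[unit])
        for node in sorted(expected - set(running)):
            errors.append(
                "Cluster member check failed on {}: {} not in {}".format(unit, node, running)
            )
    return errors
```

Fed the three `cluster_status` outputs above, a check of this shape flags two missing members on rabbitmq-server/0 and one each on rabbitmq-server/1 and rabbitmq-server/2, matching the four failures logged.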
# VK rabbitmq-server/2 unit failed to cluster:
2015-08-18 17:44:27 INFO juju-log cluster:1: Clustering with remote rabbit host (juju-beis0-machine-2).
2015-08-18 17:44:27 INFO cluster-relation-changed Stopping node 'rabbit@juju-beis0-machine-4' ...
2015-08-18 17:44:28 INFO cluster-relation-changed Clustering node 'rabbit@juju-beis0-machine-4' with 'rabbit@juju-beis0-machine-2' ...
2015-08-18 17:44:28 INFO cluster-relation-changed Error: unable to connect to nodes ['rabbit@juju-beis0-machine-2']: nodedown
2015-08-18 17:44:28 INFO cluster-relation-changed
2015-08-18 17:44:28 INFO cluster-relation-changed DIAGNOSTICS
2015-08-18 17:44:28 INFO cluster-relation-changed ===========
2015-08-18 17:44:28 INFO cluster-relation-changed
2015-08-18 17:44:28 INFO cluster-relation-changed attempted to contact: ['rabbit@juju-beis0-machine-2']
2015-08-18 17:44:28 INFO cluster-relation-changed
2015-08-18 17:44:28 INFO cluster-relation-changed rabbit@juju-beis0-machine-2:
2015-08-18 17:44:28 INFO cluster-relation-changed * connected to epmd (port 4369) on juju-beis0-machine-2
2015-08-18 17:44:28 INFO cluster-relation-changed * epmd reports node 'rabbit' running on port 25672
2015-08-18 17:44:28 INFO cluster-relation-changed * TCP connection succeeded but Erlang distribution failed
2015-08-18 17:44:28 INFO cluster-relation-changed * suggestion: hostname mismatch?
2015-08-18 17:44:28 INFO cluster-relation-changed * suggestion: is the cookie set correctly?
2015-08-18 17:44:28 INFO cluster-relation-changed
2015-08-18 17:44:28 INFO cluster-relation-changed current node details:
2015-08-18 17:44:28 INFO cluster-relation-changed - node name: 'rabbitmqctl-17379@juju-beis0-machine-4'
2015-08-18 17:44:28 INFO cluster-relation-changed - home dir: /var/lib/rabbitmq
2015-08-18 17:44:28 INFO cluster-relation-changed - cookie hash: j7UJuJx3ZktAni0tPfaRxw==
2015-08-18 17:44:28 INFO cluster-relation-changed
2015-08-18 17:44:28 INFO juju-log cluster:1: Failed to cluster with juju-beis0-machine-2.
# rabbitmq-server/2 (juju-beis0-machine-4)
Name resolution is fine. Attempted to cluster with rabbitmq-server/0 (juju-beis0-machine-2), failed. Clustered ok with rabbitmq-server/1 (juju-beis0-machine-3).
Full unit log: http://paste.ubuntu.com/12119674/
root@juju-beis0-machine-4:/var/log/juju# cat /etc/hostname
juju-beis0-machine-4
root@juju-beis0-machine-4:/var/log/juju# ip a | grep gl
inet 172.18.99.100/24 brd 172.18.99.255 scope global eth0
root@juju-beis0-machine-4:/var/log/juju# host juju-beis0-machine-2
juju-beis0-machine-2.openstacklocal has address 172.18.99.98
root@juju-beis0-machine-4:/var/log/juju# host juju-beis0-machine-3
juju-beis0-machine-3.openstacklocal has address 172.18.99.99
root@juju-beis0-machine-4:/var/log/juju# host juju-beis0-machine-4
juju-beis0-machine-4.openstacklocal has address 172.18.99.100
root@juju-beis0-machine-4:/var/log/juju# host 172.18.99.98
98.99.18.172.in-addr.arpa domain name pointer juju-beis0-machine-2.openstacklocal.
root@juju-beis0-machine-4:/var/log/juju# host 172.18.99.99
99.99.18.172.in-addr.arpa domain name pointer juju-beis0-machine-3.openstacklocal.
root@juju-beis0-machine-4:/var/log/juju# host 172.18.99.100
100.99.18.172.in-addr.arpa domain name pointer juju-beis0-machine-4.openstacklocal.
# rabbitmq-server/0 (juju-beis0-machine-2)
Name resolution is fine. cluster-relation-* hooks never fired.
Full unit log: http://paste.ubuntu.com/12119672/
root@juju-beis0-machine-2:/var/log/juju# cat /etc/hostname
juju-beis0-machine-2
root@juju-beis0-machine-2:/var/log/juju# ip a | grep gl
inet 172.18.99.98/24 brd 172.18.99.255 scope global eth0
root@juju-beis0-machine-2:/var/log/juju# host juju-beis0-machine-2
juju-beis0-machine-2.openstacklocal has address 172.18.99.98
root@juju-beis0-machine-2:/var/log/juju# host juju-beis0-machine-3
juju-beis0-machine-3.openstacklocal has address 172.18.99.99
root@juju-beis0-machine-2:/var/log/juju# host juju-beis0-machine-4
juju-beis0-machine-4.openstacklocal has address 172.18.99.100
root@juju-beis0-machine-2:/var/log/juju# host 172.18.99.98
98.99.18.172.in-addr.arpa domain name pointer juju-beis0-machine-2.openstacklocal.
root@juju-beis0-machine-2:/var/log/juju# host 172.18.99.99
99.99.18.172.in-addr.arpa domain name pointer juju-beis0-machine-3.openstacklocal.
root@juju-beis0-machine-2:/var/log/juju# host 172.18.99.100
100.99.18.172.in-addr.arpa domain name pointer juju-beis0-machine-4.openstacklocal.
# rabbitmq-server/1 (juju-beis0-machine-3)
Name resolution is fine. Clustered ok with rabbitmq-server/2 (juju-beis0-machine-4).
Full unit log: http://paste.ubuntu.com/12119695/
root@juju-beis0-machine-3:/var/log/juju# cat /etc/hostname
juju-beis0-machine-3
root@juju-beis0-machine-3:/var/log/juju# ip a | grep gl
inet 172.18.99.99/24 brd 172.18.99.255 scope global eth0
root@juju-beis0-machine-3:/var/log/juju# host juju-beis0-machine-2
juju-beis0-machine-2.openstacklocal has address 172.18.99.98
root@juju-beis0-machine-3:/var/log/juju# host juju-beis0-machine-3
juju-beis0-machine-3.openstacklocal has address 172.18.99.99
root@juju-beis0-machine-3:/var/log/juju# host juju-beis0-machine-4
juju-beis0-machine-4.openstacklocal has address 172.18.99.100
root@juju-beis0-machine-3:/var/log/juju# host 172.18.99.98
98.99.18.172.in-addr.arpa domain name pointer juju-beis0-machine-2.openstacklocal.
root@juju-beis0-machine-3:/var/log/juju# host 172.18.99.99
99.99.18.172.in-addr.arpa domain name pointer juju-beis0-machine-3.openstacklocal.
root@juju-beis0-machine-3:/var/log/juju# host 172.18.99.100
100.99.18.172.in-addr.arpa domain name pointer juju-beis0-machine-4.openstacklocal.
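The forward and reverse lookups run by hand on each unit above could be scripted along these lines. This is only a sketch: `check_forward_reverse` is a hypothetical name, and the resolver functions are passed as parameters (defaulting to the standard `socket` calls) so the logic can be exercised without a live DNS server.

```python
import socket


def check_forward_reverse(hostname, resolve=socket.gethostbyname,
                          reverse=socket.gethostbyaddr):
    """Resolve hostname to an address, then resolve that address back to a name.

    Returns (address, reverse_name, consistent), where consistent is True when
    the short name from the reverse lookup matches the short name queried.
    """
    address = resolve(hostname)
    # gethostbyaddr returns (hostname, aliaslist, ipaddrlist).
    reverse_name = reverse(address)[0]
    short_reverse = reverse_name.rstrip(".").split(".")[0]
    return address, reverse_name, short_reverse == hostname.split(".")[0]
```

On one of the units this would be called as, for example, `check_forward_reverse("juju-beis0-machine-2")`, once per peer hostname.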
# VK juju stat
http://paste.ubuntu.com/12119730/
# rmq versions
ubuntu@beisner-bastion:~/bzr/next/rabbitmq-server/tests$ juju run --service rabbitmq-server "apt-cache policy rabbitmq-server"
- MachineId: "2"
Stdout: |
rabbitmq-server:
Installed: 3.4.3-2
Candidate: 3.4.3-2
Version table:
*** 3.4.3-2 0
500 http://nova.clouds.archive.ubuntu.com/ubuntu/ vivid/main amd64 Packages
100 /var/lib/dpkg/status
UnitId: rabbitmq-server/0
- MachineId: "3"
Stdout: |
rabbitmq-server:
Installed: 3.4.3-2
Candidate: 3.4.3-2
Version table:
*** 3.4.3-2 0
500 http://nova.clouds.archive.ubuntu.com/ubuntu/ vivid/main amd64 Packages
100 /var/lib/dpkg/status
UnitId: rabbitmq-server/1
- MachineId: "4"
Stdout: |
rabbitmq-server:
Installed: 3.4.3-2
Candidate: 3.4.3-2
Version table:
*** 3.4.3-2 0
500 http://nova.clouds.archive.ubuntu.com/ubuntu/ vivid/main amd64 Packages
100 /var/lib/dpkg/status
UnitId: rabbitmq-server/2 |
|
2015-08-19 13:38:22 |
Ryan Beisner |
description |
With a 3-node native cluster in Vivid-Kilo, Trusty-Juno, and Precise-Icehouse, in more than 50% of all attempts, one of the rabbitmq-server units fails to cluster. When this happens, we end up with a 2-node cluster and a 1-node cluster, while juju status indicates happiness.
Test scenario: a basic 3-node rabbitmq-server native cluster, with nrpe as a subordinate to exercise nrpe-external-master functionality, and with cinder to exercise and inspect amqp relation data.
This same test scenario clusters and succeeds consistently with Trusty-Icehouse and Trusty-Kilo. There are similar, possibly-related issues on Precise-Icehouse, though the failure is different, so I'll raise that separately.
DNS does not appear to play a role here, as all machines can resolve all other machines, forward and reverse, when this cluster failure is observed.
FYI, when the cluster does succeed on V-K, a separate, seemingly-unrelated bug is consistently hit (bug 1485722).
# VK amulet results
2015-08-18 17:49:03,637 test_300_rmq_config INFO: OK
2015-08-18 17:49:03,637 test_400_rmq_cluster_running_nodes DEBUG: Checking that all units are in cluster_status running nodes...
2015-08-18 17:49:08,219 get_unit_hostnames DEBUG: Unit host names: {'rabbitmq-server/2': 'juju-beis0-machine-4', 'rabbitmq-server/0': 'juju-beis0-machine-2', 'rabbitmq-server/1': 'juju-beis0-machine-3'}
2015-08-18 17:49:09,932 run_cmd_unit DEBUG: rabbitmq-server/0 `rabbitmqctl cluster_status` command returned 0 (OK)
2015-08-18 17:49:09,932 get_rmq_cluster_status DEBUG: rabbitmq-server/0 cluster_status:
Cluster status of node 'rabbit@juju-beis0-machine-2' ...
[{nodes,[{disc,['rabbit@juju-beis0-machine-2']}]},
{running_nodes,['rabbit@juju-beis0-machine-2']},
{cluster_name,<<"rabbit@juju-beis0-machine-2.openstacklocal">>},
{partitions,[]}]
2015-08-18 17:49:11,578 run_cmd_unit DEBUG: rabbitmq-server/1 `rabbitmqctl cluster_status` command returned 0 (OK)
2015-08-18 17:49:11,578 get_rmq_cluster_status DEBUG: rabbitmq-server/1 cluster_status:
Cluster status of node 'rabbit@juju-beis0-machine-3' ...
[{nodes,[{disc,['rabbit@juju-beis0-machine-3',
'rabbit@juju-beis0-machine-4']}]},
{running_nodes,['rabbit@juju-beis0-machine-4','rabbit@juju-beis0-machine-3']},
{cluster_name,<<"rabbit@juju-beis0-machine-4.openstacklocal">>},
{partitions,[]}]
2015-08-18 17:49:13,224 run_cmd_unit DEBUG: rabbitmq-server/2 `rabbitmqctl cluster_status` command returned 0 (OK)
2015-08-18 17:49:13,226 get_rmq_cluster_status DEBUG: rabbitmq-server/2 cluster_status:
Cluster status of node 'rabbit@juju-beis0-machine-4' ...
[{nodes,[{disc,['rabbit@juju-beis0-machine-3',
'rabbit@juju-beis0-machine-4']}]},
{running_nodes,['rabbit@juju-beis0-machine-3','rabbit@juju-beis0-machine-4']},
{cluster_name,<<"rabbit@juju-beis0-machine-4.openstacklocal">>},
{partitions,[]}]
Cluster member check failed on rabbitmq-server/0: rabbit@juju-beis0-machine-3 not in [u'rabbit@juju-beis0-machine-2']
Cluster member check failed on rabbitmq-server/0: rabbit@juju-beis0-machine-4 not in [u'rabbit@juju-beis0-machine-2']
Cluster member check failed on rabbitmq-server/1: rabbit@juju-beis0-machine-2 not in [u'rabbit@juju-beis0-machine-4', u'rabbit@juju-beis0-machine-3']
Cluster member check failed on rabbitmq-server/2: rabbit@juju-beis0-machine-2 not in [u'rabbit@juju-beis0-machine-3', u'rabbit@juju-beis0-machine-4']
# VK rabbitmq-server/2 unit failed to cluster:
2015-08-18 17:44:27 INFO juju-log cluster:1: Clustering with remote rabbit host (juju-beis0-machine-2).
2015-08-18 17:44:27 INFO cluster-relation-changed Stopping node 'rabbit@juju-beis0-machine-4' ...
2015-08-18 17:44:28 INFO cluster-relation-changed Clustering node 'rabbit@juju-beis0-machine-4' with 'rabbit@juju-beis0-machine-2' ...
2015-08-18 17:44:28 INFO cluster-relation-changed Error: unable to connect to nodes ['rabbit@juju-beis0-machine-2']: nodedown
2015-08-18 17:44:28 INFO cluster-relation-changed
2015-08-18 17:44:28 INFO cluster-relation-changed DIAGNOSTICS
2015-08-18 17:44:28 INFO cluster-relation-changed ===========
2015-08-18 17:44:28 INFO cluster-relation-changed
2015-08-18 17:44:28 INFO cluster-relation-changed attempted to contact: ['rabbit@juju-beis0-machine-2']
2015-08-18 17:44:28 INFO cluster-relation-changed
2015-08-18 17:44:28 INFO cluster-relation-changed rabbit@juju-beis0-machine-2:
2015-08-18 17:44:28 INFO cluster-relation-changed * connected to epmd (port 4369) on juju-beis0-machine-2
2015-08-18 17:44:28 INFO cluster-relation-changed * epmd reports node 'rabbit' running on port 25672
2015-08-18 17:44:28 INFO cluster-relation-changed * TCP connection succeeded but Erlang distribution failed
2015-08-18 17:44:28 INFO cluster-relation-changed * suggestion: hostname mismatch?
2015-08-18 17:44:28 INFO cluster-relation-changed * suggestion: is the cookie set correctly?
2015-08-18 17:44:28 INFO cluster-relation-changed
2015-08-18 17:44:28 INFO cluster-relation-changed current node details:
2015-08-18 17:44:28 INFO cluster-relation-changed - node name: 'rabbitmqctl-17379@juju-beis0-machine-4'
2015-08-18 17:44:28 INFO cluster-relation-changed - home dir: /var/lib/rabbitmq
2015-08-18 17:44:28 INFO cluster-relation-changed - cookie hash: j7UJuJx3ZktAni0tPfaRxw==
2015-08-18 17:44:28 INFO cluster-relation-changed
2015-08-18 17:44:28 INFO juju-log cluster:1: Failed to cluster with juju-beis0-machine-2.
# rabbitmq-server/2 (juju-beis0-machine-4)
Name resolution is fine. Attempted to cluster with rabbitmq-server/0 (juju-beis0-machine-2), failed. Clustered ok with rabbitmq-server/1 (juju-beis0-machine-3).
Full unit log: http://paste.ubuntu.com/12119674/
root@juju-beis0-machine-4:/var/log/juju# cat /etc/hostname
juju-beis0-machine-4
root@juju-beis0-machine-4:/var/log/juju# ip a | grep gl
inet 172.18.99.100/24 brd 172.18.99.255 scope global eth0
root@juju-beis0-machine-4:/var/log/juju# host juju-beis0-machine-2
juju-beis0-machine-2.openstacklocal has address 172.18.99.98
root@juju-beis0-machine-4:/var/log/juju# host juju-beis0-machine-3
juju-beis0-machine-3.openstacklocal has address 172.18.99.99
root@juju-beis0-machine-4:/var/log/juju# host juju-beis0-machine-4
juju-beis0-machine-4.openstacklocal has address 172.18.99.100
root@juju-beis0-machine-4:/var/log/juju# host 172.18.99.98
98.99.18.172.in-addr.arpa domain name pointer juju-beis0-machine-2.openstacklocal.
root@juju-beis0-machine-4:/var/log/juju# host 172.18.99.99
99.99.18.172.in-addr.arpa domain name pointer juju-beis0-machine-3.openstacklocal.
root@juju-beis0-machine-4:/var/log/juju# host 172.18.99.100
100.99.18.172.in-addr.arpa domain name pointer juju-beis0-machine-4.openstacklocal.
# rabbitmq-server/0 (juju-beis0-machine-2)
Name resolution is fine. cluster-relation-* hooks never fired.
Full unit log: http://paste.ubuntu.com/12119672/
root@juju-beis0-machine-2:/var/log/juju# cat /etc/hostname
juju-beis0-machine-2
root@juju-beis0-machine-2:/var/log/juju# ip a | grep gl
inet 172.18.99.98/24 brd 172.18.99.255 scope global eth0
root@juju-beis0-machine-2:/var/log/juju# host juju-beis0-machine-2
juju-beis0-machine-2.openstacklocal has address 172.18.99.98
root@juju-beis0-machine-2:/var/log/juju# host juju-beis0-machine-3
juju-beis0-machine-3.openstacklocal has address 172.18.99.99
root@juju-beis0-machine-2:/var/log/juju# host juju-beis0-machine-4
juju-beis0-machine-4.openstacklocal has address 172.18.99.100
root@juju-beis0-machine-2:/var/log/juju# host 172.18.99.98
98.99.18.172.in-addr.arpa domain name pointer juju-beis0-machine-2.openstacklocal.
root@juju-beis0-machine-2:/var/log/juju# host 172.18.99.99
99.99.18.172.in-addr.arpa domain name pointer juju-beis0-machine-3.openstacklocal.
root@juju-beis0-machine-2:/var/log/juju# host 172.18.99.100
100.99.18.172.in-addr.arpa domain name pointer juju-beis0-machine-4.openstacklocal.
# rabbitmq-server/1 (juju-beis0-machine-3)
Name resolution is fine. Clustered ok with rabbitmq-server/2 (juju-beis0-machine-4).
Full unit log: http://paste.ubuntu.com/12119695/
root@juju-beis0-machine-3:/var/log/juju# cat /etc/hostname
juju-beis0-machine-3
root@juju-beis0-machine-3:/var/log/juju# ip a | grep gl
inet 172.18.99.99/24 brd 172.18.99.255 scope global eth0
root@juju-beis0-machine-3:/var/log/juju# host juju-beis0-machine-2
juju-beis0-machine-2.openstacklocal has address 172.18.99.98
root@juju-beis0-machine-3:/var/log/juju# host juju-beis0-machine-3
juju-beis0-machine-3.openstacklocal has address 172.18.99.99
root@juju-beis0-machine-3:/var/log/juju# host juju-beis0-machine-4
juju-beis0-machine-4.openstacklocal has address 172.18.99.100
root@juju-beis0-machine-3:/var/log/juju# host 172.18.99.98
98.99.18.172.in-addr.arpa domain name pointer juju-beis0-machine-2.openstacklocal.
root@juju-beis0-machine-3:/var/log/juju# host 172.18.99.99
99.99.18.172.in-addr.arpa domain name pointer juju-beis0-machine-3.openstacklocal.
root@juju-beis0-machine-3:/var/log/juju# host 172.18.99.100
100.99.18.172.in-addr.arpa domain name pointer juju-beis0-machine-4.openstacklocal.
# VK juju stat
http://paste.ubuntu.com/12119730/
# rmq versions
ubuntu@beisner-bastion:~/bzr/next/rabbitmq-server/tests$ juju run --service rabbitmq-server "apt-cache policy rabbitmq-server"
- MachineId: "2"
Stdout: |
rabbitmq-server:
Installed: 3.4.3-2
Candidate: 3.4.3-2
Version table:
*** 3.4.3-2 0
500 http://nova.clouds.archive.ubuntu.com/ubuntu/ vivid/main amd64 Packages
100 /var/lib/dpkg/status
UnitId: rabbitmq-server/0
- MachineId: "3"
Stdout: |
rabbitmq-server:
Installed: 3.4.3-2
Candidate: 3.4.3-2
Version table:
*** 3.4.3-2 0
500 http://nova.clouds.archive.ubuntu.com/ubuntu/ vivid/main amd64 Packages
100 /var/lib/dpkg/status
UnitId: rabbitmq-server/1
- MachineId: "4"
Stdout: |
rabbitmq-server:
Installed: 3.4.3-2
Candidate: 3.4.3-2
Version table:
*** 3.4.3-2 0
500 http://nova.clouds.archive.ubuntu.com/ubuntu/ vivid/main amd64 Packages
100 /var/lib/dpkg/status
UnitId: rabbitmq-server/2 |
With a 3-node native cluster in Vivid-Kilo, Trusty-Juno, and Precise-Icehouse, in more than 50% of all attempts, one of the rabbitmq-server units fails to cluster. When this happens, we end up with a 2-node cluster and a 1-node cluster, while juju status indicates happiness.
Test scenario: a basic 3-node rabbitmq-server native cluster, with nrpe as a subordinate to exercise nrpe-external-master functionality, and with cinder to exercise and inspect amqp relation data.
This same test scenario clusters and succeeds consistently with Trusty-Icehouse and Trusty-Kilo.
DNS does not appear to play a role here, as all machines can resolve all other machines, forward and reverse, when this cluster failure is observed.
FYI, when the cluster does succeed on V-K, a separate, seemingly-unrelated bug is consistently hit (bug 1485722).
# VK amulet results
2015-08-18 17:49:03,637 test_300_rmq_config INFO: OK
2015-08-18 17:49:03,637 test_400_rmq_cluster_running_nodes DEBUG: Checking that all units are in cluster_status running nodes...
2015-08-18 17:49:08,219 get_unit_hostnames DEBUG: Unit host names: {'rabbitmq-server/2': 'juju-beis0-machine-4', 'rabbitmq-server/0': 'juju-beis0-machine-2', 'rabbitmq-server/1': 'juju-beis0-machine-3'}
2015-08-18 17:49:09,932 run_cmd_unit DEBUG: rabbitmq-server/0 `rabbitmqctl cluster_status` command returned 0 (OK)
2015-08-18 17:49:09,932 get_rmq_cluster_status DEBUG: rabbitmq-server/0 cluster_status:
Cluster status of node 'rabbit@juju-beis0-machine-2' ...
[{nodes,[{disc,['rabbit@juju-beis0-machine-2']}]},
{running_nodes,['rabbit@juju-beis0-machine-2']},
{cluster_name,<<"rabbit@juju-beis0-machine-2.openstacklocal">>},
{partitions,[]}]
2015-08-18 17:49:11,578 run_cmd_unit DEBUG: rabbitmq-server/1 `rabbitmqctl cluster_status` command returned 0 (OK)
2015-08-18 17:49:11,578 get_rmq_cluster_status DEBUG: rabbitmq-server/1 cluster_status:
Cluster status of node 'rabbit@juju-beis0-machine-3' ...
[{nodes,[{disc,['rabbit@juju-beis0-machine-3',
'rabbit@juju-beis0-machine-4']}]},
{running_nodes,['rabbit@juju-beis0-machine-4','rabbit@juju-beis0-machine-3']},
{cluster_name,<<"rabbit@juju-beis0-machine-4.openstacklocal">>},
{partitions,[]}]
2015-08-18 17:49:13,224 run_cmd_unit DEBUG: rabbitmq-server/2 `rabbitmqctl cluster_status` command returned 0 (OK)
2015-08-18 17:49:13,226 get_rmq_cluster_status DEBUG: rabbitmq-server/2 cluster_status:
Cluster status of node 'rabbit@juju-beis0-machine-4' ...
[{nodes,[{disc,['rabbit@juju-beis0-machine-3',
'rabbit@juju-beis0-machine-4']}]},
{running_nodes,['rabbit@juju-beis0-machine-3','rabbit@juju-beis0-machine-4']},
{cluster_name,<<"rabbit@juju-beis0-machine-4.openstacklocal">>},
{partitions,[]}]
Cluster member check failed on rabbitmq-server/0: rabbit@juju-beis0-machine-3 not in [u'rabbit@juju-beis0-machine-2']
Cluster member check failed on rabbitmq-server/0: rabbit@juju-beis0-machine-4 not in [u'rabbit@juju-beis0-machine-2']
Cluster member check failed on rabbitmq-server/1: rabbit@juju-beis0-machine-2 not in [u'rabbit@juju-beis0-machine-4', u'rabbit@juju-beis0-machine-3']
Cluster member check failed on rabbitmq-server/2: rabbit@juju-beis0-machine-2 not in [u'rabbit@juju-beis0-machine-3', u'rabbit@juju-beis0-machine-4']
# VK rabbitmq-server/2 unit failed to cluster:
2015-08-18 17:44:27 INFO juju-log cluster:1: Clustering with remote rabbit host (juju-beis0-machine-2).
2015-08-18 17:44:27 INFO cluster-relation-changed Stopping node 'rabbit@juju-beis0-machine-4' ...
2015-08-18 17:44:28 INFO cluster-relation-changed Clustering node 'rabbit@juju-beis0-machine-4' with 'rabbit@juju-beis0-machine-2' ...
2015-08-18 17:44:28 INFO cluster-relation-changed Error: unable to connect to nodes ['rabbit@juju-beis0-machine-2']: nodedown
2015-08-18 17:44:28 INFO cluster-relation-changed
2015-08-18 17:44:28 INFO cluster-relation-changed DIAGNOSTICS
2015-08-18 17:44:28 INFO cluster-relation-changed ===========
2015-08-18 17:44:28 INFO cluster-relation-changed
2015-08-18 17:44:28 INFO cluster-relation-changed attempted to contact: ['rabbit@juju-beis0-machine-2']
2015-08-18 17:44:28 INFO cluster-relation-changed
2015-08-18 17:44:28 INFO cluster-relation-changed rabbit@juju-beis0-machine-2:
2015-08-18 17:44:28 INFO cluster-relation-changed * connected to epmd (port 4369) on juju-beis0-machine-2
2015-08-18 17:44:28 INFO cluster-relation-changed * epmd reports node 'rabbit' running on port 25672
2015-08-18 17:44:28 INFO cluster-relation-changed * TCP connection succeeded but Erlang distribution failed
2015-08-18 17:44:28 INFO cluster-relation-changed * suggestion: hostname mismatch?
2015-08-18 17:44:28 INFO cluster-relation-changed * suggestion: is the cookie set correctly?
2015-08-18 17:44:28 INFO cluster-relation-changed
2015-08-18 17:44:28 INFO cluster-relation-changed current node details:
2015-08-18 17:44:28 INFO cluster-relation-changed - node name: 'rabbitmqctl-17379@juju-beis0-machine-4'
2015-08-18 17:44:28 INFO cluster-relation-changed - home dir: /var/lib/rabbitmq
2015-08-18 17:44:28 INFO cluster-relation-changed - cookie hash: j7UJuJx3ZktAni0tPfaRxw==
2015-08-18 17:44:28 INFO cluster-relation-changed
2015-08-18 17:44:28 INFO juju-log cluster:1: Failed to cluster with juju-beis0-machine-2.
# rabbitmq-server/2 (juju-beis0-machine-4)
Name resolution is fine. Attempted to cluster with rabbitmq-server/0 (juju-beis0-machine-2), failed. Clustered ok with rabbitmq-server/1 (juju-beis0-machine-3).
Full unit log: http://paste.ubuntu.com/12119674/
root@juju-beis0-machine-4:/var/log/juju# cat /etc/hostname
juju-beis0-machine-4
root@juju-beis0-machine-4:/var/log/juju# ip a | grep gl
inet 172.18.99.100/24 brd 172.18.99.255 scope global eth0
root@juju-beis0-machine-4:/var/log/juju# host juju-beis0-machine-2
juju-beis0-machine-2.openstacklocal has address 172.18.99.98
root@juju-beis0-machine-4:/var/log/juju# host juju-beis0-machine-3
juju-beis0-machine-3.openstacklocal has address 172.18.99.99
root@juju-beis0-machine-4:/var/log/juju# host juju-beis0-machine-4
juju-beis0-machine-4.openstacklocal has address 172.18.99.100
root@juju-beis0-machine-4:/var/log/juju# host 172.18.99.98
98.99.18.172.in-addr.arpa domain name pointer juju-beis0-machine-2.openstacklocal.
root@juju-beis0-machine-4:/var/log/juju# host 172.18.99.99
99.99.18.172.in-addr.arpa domain name pointer juju-beis0-machine-3.openstacklocal.
root@juju-beis0-machine-4:/var/log/juju# host 172.18.99.100
100.99.18.172.in-addr.arpa domain name pointer juju-beis0-machine-4.openstacklocal.
# rabbitmq-server/0 (juju-beis0-machine-2)
Name resolution is fine. cluster-relation-* hooks never fired.
Full unit log: http://paste.ubuntu.com/12119672/
root@juju-beis0-machine-2:/var/log/juju# cat /etc/hostname
juju-beis0-machine-2
root@juju-beis0-machine-2:/var/log/juju# ip a | grep gl
inet 172.18.99.98/24 brd 172.18.99.255 scope global eth0
root@juju-beis0-machine-2:/var/log/juju# host juju-beis0-machine-2
juju-beis0-machine-2.openstacklocal has address 172.18.99.98
root@juju-beis0-machine-2:/var/log/juju# host juju-beis0-machine-3
juju-beis0-machine-3.openstacklocal has address 172.18.99.99
root@juju-beis0-machine-2:/var/log/juju# host juju-beis0-machine-4
juju-beis0-machine-4.openstacklocal has address 172.18.99.100
root@juju-beis0-machine-2:/var/log/juju# host 172.18.99.98
98.99.18.172.in-addr.arpa domain name pointer juju-beis0-machine-2.openstacklocal.
root@juju-beis0-machine-2:/var/log/juju# host 172.18.99.99
99.99.18.172.in-addr.arpa domain name pointer juju-beis0-machine-3.openstacklocal.
root@juju-beis0-machine-2:/var/log/juju# host 172.18.99.100
100.99.18.172.in-addr.arpa domain name pointer juju-beis0-machine-4.openstacklocal.
# rabbitmq-server/1 (juju-beis0-machine-3)
Name resolution is fine. Clustered ok with rabbitmq-server/2 (juju-beis0-machine-4).
Full unit log: http://paste.ubuntu.com/12119695/
root@juju-beis0-machine-3:/var/log/juju# cat /etc/hostname
juju-beis0-machine-3
root@juju-beis0-machine-3:/var/log/juju# ip a | grep gl
inet 172.18.99.99/24 brd 172.18.99.255 scope global eth0
root@juju-beis0-machine-3:/var/log/juju# host juju-beis0-machine-2
juju-beis0-machine-2.openstacklocal has address 172.18.99.98
root@juju-beis0-machine-3:/var/log/juju# host juju-beis0-machine-3
juju-beis0-machine-3.openstacklocal has address 172.18.99.99
root@juju-beis0-machine-3:/var/log/juju# host juju-beis0-machine-4
juju-beis0-machine-4.openstacklocal has address 172.18.99.100
root@juju-beis0-machine-3:/var/log/juju# host 172.18.99.98
98.99.18.172.in-addr.arpa domain name pointer juju-beis0-machine-2.openstacklocal.
root@juju-beis0-machine-3:/var/log/juju# host 172.18.99.99
99.99.18.172.in-addr.arpa domain name pointer juju-beis0-machine-3.openstacklocal.
root@juju-beis0-machine-3:/var/log/juju# host 172.18.99.100
100.99.18.172.in-addr.arpa domain name pointer juju-beis0-machine-4.openstacklocal.
# VK juju stat
http://paste.ubuntu.com/12119730/
# rmq versions
ubuntu@beisner-bastion:~/bzr/next/rabbitmq-server/tests$ juju run --service rabbitmq-server "apt-cache policy rabbitmq-server"
- MachineId: "2"
Stdout: |
rabbitmq-server:
Installed: 3.4.3-2
Candidate: 3.4.3-2
Version table:
*** 3.4.3-2 0
500 http://nova.clouds.archive.ubuntu.com/ubuntu/ vivid/main amd64 Packages
100 /var/lib/dpkg/status
UnitId: rabbitmq-server/0
- MachineId: "3"
Stdout: |
rabbitmq-server:
Installed: 3.4.3-2
Candidate: 3.4.3-2
Version table:
*** 3.4.3-2 0
500 http://nova.clouds.archive.ubuntu.com/ubuntu/ vivid/main amd64 Packages
100 /var/lib/dpkg/status
UnitId: rabbitmq-server/1
- MachineId: "4"
Stdout: |
rabbitmq-server:
Installed: 3.4.3-2
Candidate: 3.4.3-2
Version table:
*** 3.4.3-2 0
500 http://nova.clouds.archive.ubuntu.com/ubuntu/ vivid/main amd64 Packages
100 /var/lib/dpkg/status
UnitId: rabbitmq-server/2 |
|
2015-08-20 15:18:10 |
Liam Young |
rabbitmq-server (Juju Charms Collection): status |
New |
Confirmed |
|
2015-08-20 15:18:12 |
Liam Young |
rabbitmq-server (Juju Charms Collection): importance |
Undecided |
High |
|
2015-08-20 15:24:01 |
Liam Young |
rabbitmq-server (Juju Charms Collection): importance |
High |
Critical |
|
2015-08-20 15:32:43 |
Ryan Beisner |
description |
With a 3-node native cluster in Vivid-Kilo, Trusty-Juno, and Precise-Icehouse, in more than 50% of all attempts, one of the rabbitmq-server units fails to cluster. When this happens, we end up with a 2-node cluster and a 1-node cluster, while juju status indicates happiness.
Test scenario: a basic 3-node rabbitmq-server native cluster, with nrpe as a subordinate to exercise nrpe-external-master functionality, and with cinder to exercise and inspect amqp relation data.
This same test scenario clusters and succeeds consistently with Trusty-Icehouse and Trusty-Kilo.
DNS does not appear to play a role here, as all machines can resolve all other machines, forward and reverse, when this cluster failure is observed.
FYI, when the cluster does succeed on V-K, a separate, seemingly-unrelated bug is consistently hit (bug 1485722).
# VK amulet results
2015-08-18 17:49:03,637 test_300_rmq_config INFO: OK
2015-08-18 17:49:03,637 test_400_rmq_cluster_running_nodes DEBUG: Checking that all units are in cluster_status running nodes...
2015-08-18 17:49:08,219 get_unit_hostnames DEBUG: Unit host names: {'rabbitmq-server/2': 'juju-beis0-machine-4', 'rabbitmq-server/0': 'juju-beis0-machine-2', 'rabbitmq-server/1': 'juju-beis0-machine-3'}
2015-08-18 17:49:09,932 run_cmd_unit DEBUG: rabbitmq-server/0 `rabbitmqctl cluster_status` command returned 0 (OK)
2015-08-18 17:49:09,932 get_rmq_cluster_status DEBUG: rabbitmq-server/0 cluster_status:
Cluster status of node 'rabbit@juju-beis0-machine-2' ...
[{nodes,[{disc,['rabbit@juju-beis0-machine-2']}]},
{running_nodes,['rabbit@juju-beis0-machine-2']},
{cluster_name,<<"rabbit@juju-beis0-machine-2.openstacklocal">>},
{partitions,[]}]
2015-08-18 17:49:11,578 run_cmd_unit DEBUG: rabbitmq-server/1 `rabbitmqctl cluster_status` command returned 0 (OK)
2015-08-18 17:49:11,578 get_rmq_cluster_status DEBUG: rabbitmq-server/1 cluster_status:
Cluster status of node 'rabbit@juju-beis0-machine-3' ...
[{nodes,[{disc,['rabbit@juju-beis0-machine-3',
'rabbit@juju-beis0-machine-4']}]},
{running_nodes,['rabbit@juju-beis0-machine-4','rabbit@juju-beis0-machine-3']},
{cluster_name,<<"rabbit@juju-beis0-machine-4.openstacklocal">>},
{partitions,[]}]
2015-08-18 17:49:13,224 run_cmd_unit DEBUG: rabbitmq-server/2 `rabbitmqctl cluster_status` command returned 0 (OK)
2015-08-18 17:49:13,226 get_rmq_cluster_status DEBUG: rabbitmq-server/2 cluster_status:
Cluster status of node 'rabbit@juju-beis0-machine-4' ...
[{nodes,[{disc,['rabbit@juju-beis0-machine-3',
'rabbit@juju-beis0-machine-4']}]},
{running_nodes,['rabbit@juju-beis0-machine-3','rabbit@juju-beis0-machine-4']},
{cluster_name,<<"rabbit@juju-beis0-machine-4.openstacklocal">>},
{partitions,[]}]
Cluster member check failed on rabbitmq-server/0: rabbit@juju-beis0-machine-3 not in [u'rabbit@juju-beis0-machine-2']
Cluster member check failed on rabbitmq-server/0: rabbit@juju-beis0-machine-4 not in [u'rabbit@juju-beis0-machine-2']
Cluster member check failed on rabbitmq-server/1: rabbit@juju-beis0-machine-2 not in [u'rabbit@juju-beis0-machine-4', u'rabbit@juju-beis0-machine-3']
Cluster member check failed on rabbitmq-server/2: rabbit@juju-beis0-machine-2 not in [u'rabbit@juju-beis0-machine-3', u'rabbit@juju-beis0-machine-4']
# VK rabbitmq-server/2 unit failed to cluster:
2015-08-18 17:44:27 INFO juju-log cluster:1: Clustering with remote rabbit host (juju-beis0-machine-2).
2015-08-18 17:44:27 INFO cluster-relation-changed Stopping node 'rabbit@juju-beis0-machine-4' ...
2015-08-18 17:44:28 INFO cluster-relation-changed Clustering node 'rabbit@juju-beis0-machine-4' with 'rabbit@juju-beis0-machine-2' ...
2015-08-18 17:44:28 INFO cluster-relation-changed Error: unable to connect to nodes ['rabbit@juju-beis0-machine-2']: nodedown
2015-08-18 17:44:28 INFO cluster-relation-changed
2015-08-18 17:44:28 INFO cluster-relation-changed DIAGNOSTICS
2015-08-18 17:44:28 INFO cluster-relation-changed ===========
2015-08-18 17:44:28 INFO cluster-relation-changed
2015-08-18 17:44:28 INFO cluster-relation-changed attempted to contact: ['rabbit@juju-beis0-machine-2']
2015-08-18 17:44:28 INFO cluster-relation-changed
2015-08-18 17:44:28 INFO cluster-relation-changed rabbit@juju-beis0-machine-2:
2015-08-18 17:44:28 INFO cluster-relation-changed * connected to epmd (port 4369) on juju-beis0-machine-2
2015-08-18 17:44:28 INFO cluster-relation-changed * epmd reports node 'rabbit' running on port 25672
2015-08-18 17:44:28 INFO cluster-relation-changed * TCP connection succeeded but Erlang distribution failed
2015-08-18 17:44:28 INFO cluster-relation-changed * suggestion: hostname mismatch?
2015-08-18 17:44:28 INFO cluster-relation-changed * suggestion: is the cookie set correctly?
2015-08-18 17:44:28 INFO cluster-relation-changed
2015-08-18 17:44:28 INFO cluster-relation-changed current node details:
2015-08-18 17:44:28 INFO cluster-relation-changed - node name: 'rabbitmqctl-17379@juju-beis0-machine-4'
2015-08-18 17:44:28 INFO cluster-relation-changed - home dir: /var/lib/rabbitmq
2015-08-18 17:44:28 INFO cluster-relation-changed - cookie hash: j7UJuJx3ZktAni0tPfaRxw==
2015-08-18 17:44:28 INFO cluster-relation-changed
2015-08-18 17:44:28 INFO juju-log cluster:1: Failed to cluster with juju-beis0-machine-2.
# rabbitmq-server/2 (juju-beis0-machine-4)
Name resolution is fine. Attempted to cluster with rabbitmq-server/0 (juju-beis0-machine-2), failed. Clustered ok with rabbitmq-server/1 (juju-beis0-machine-3).
Full unit log: http://paste.ubuntu.com/12119674/
root@juju-beis0-machine-4:/var/log/juju# cat /etc/hostname
juju-beis0-machine-4
root@juju-beis0-machine-4:/var/log/juju# ip a | grep gl
inet 172.18.99.100/24 brd 172.18.99.255 scope global eth0
root@juju-beis0-machine-4:/var/log/juju# host juju-beis0-machine-2
juju-beis0-machine-2.openstacklocal has address 172.18.99.98
root@juju-beis0-machine-4:/var/log/juju# host juju-beis0-machine-3
juju-beis0-machine-3.openstacklocal has address 172.18.99.99
root@juju-beis0-machine-4:/var/log/juju# host juju-beis0-machine-4
juju-beis0-machine-4.openstacklocal has address 172.18.99.100
root@juju-beis0-machine-4:/var/log/juju# host 172.18.99.98
98.99.18.172.in-addr.arpa domain name pointer juju-beis0-machine-2.openstacklocal.
root@juju-beis0-machine-4:/var/log/juju# host 172.18.99.99
99.99.18.172.in-addr.arpa domain name pointer juju-beis0-machine-3.openstacklocal.
root@juju-beis0-machine-4:/var/log/juju# host 172.18.99.100
100.99.18.172.in-addr.arpa domain name pointer juju-beis0-machine-4.openstacklocal.
# rabbitmq-server/0 (juju-beis0-machine-2)
Name resolution is fine. cluster-relation-* hooks never fired.
Full unit log: http://paste.ubuntu.com/12119672/
root@juju-beis0-machine-2:/var/log/juju# cat /etc/hostname
juju-beis0-machine-2
root@juju-beis0-machine-2:/var/log/juju# ip a | grep gl
inet 172.18.99.98/24 brd 172.18.99.255 scope global eth0
root@juju-beis0-machine-2:/var/log/juju# host juju-beis0-machine-2
juju-beis0-machine-2.openstacklocal has address 172.18.99.98
root@juju-beis0-machine-2:/var/log/juju# host juju-beis0-machine-3
juju-beis0-machine-3.openstacklocal has address 172.18.99.99
root@juju-beis0-machine-2:/var/log/juju# host juju-beis0-machine-4
juju-beis0-machine-4.openstacklocal has address 172.18.99.100
root@juju-beis0-machine-2:/var/log/juju# host 172.18.99.98
98.99.18.172.in-addr.arpa domain name pointer juju-beis0-machine-2.openstacklocal.
root@juju-beis0-machine-2:/var/log/juju# host 172.18.99.99
99.99.18.172.in-addr.arpa domain name pointer juju-beis0-machine-3.openstacklocal.
root@juju-beis0-machine-2:/var/log/juju# host 172.18.99.100
100.99.18.172.in-addr.arpa domain name pointer juju-beis0-machine-4.openstacklocal.
# rabbitmq-server/1 (juju-beis0-machine-3)
Name resolution is fine. Clustered ok with rabbitmq-server/2 (juju-beis0-machine-4).
Full unit log: http://paste.ubuntu.com/12119695/
root@juju-beis0-machine-3:/var/log/juju# cat /etc/hostname
juju-beis0-machine-3
root@juju-beis0-machine-3:/var/log/juju# ip a | grep gl
inet 172.18.99.99/24 brd 172.18.99.255 scope global eth0
root@juju-beis0-machine-3:/var/log/juju# host juju-beis0-machine-2
juju-beis0-machine-2.openstacklocal has address 172.18.99.98
root@juju-beis0-machine-3:/var/log/juju# host juju-beis0-machine-3
juju-beis0-machine-3.openstacklocal has address 172.18.99.99
root@juju-beis0-machine-3:/var/log/juju# host juju-beis0-machine-4
juju-beis0-machine-4.openstacklocal has address 172.18.99.100
root@juju-beis0-machine-3:/var/log/juju# host 172.18.99.98
98.99.18.172.in-addr.arpa domain name pointer juju-beis0-machine-2.openstacklocal.
root@juju-beis0-machine-3:/var/log/juju# host 172.18.99.99
99.99.18.172.in-addr.arpa domain name pointer juju-beis0-machine-3.openstacklocal.
root@juju-beis0-machine-3:/var/log/juju# host 172.18.99.100
100.99.18.172.in-addr.arpa domain name pointer juju-beis0-machine-4.openstacklocal.
# VK juju stat
http://paste.ubuntu.com/12119730/
# rmq versions
ubuntu@beisner-bastion:~/bzr/next/rabbitmq-server/tests$ juju run --service rabbitmq-server "apt-cache policy rabbitmq-server"
- MachineId: "2"
Stdout: |
rabbitmq-server:
Installed: 3.4.3-2
Candidate: 3.4.3-2
Version table:
*** 3.4.3-2 0
500 http://nova.clouds.archive.ubuntu.com/ubuntu/ vivid/main amd64 Packages
100 /var/lib/dpkg/status
UnitId: rabbitmq-server/0
- MachineId: "3"
Stdout: |
rabbitmq-server:
Installed: 3.4.3-2
Candidate: 3.4.3-2
Version table:
*** 3.4.3-2 0
500 http://nova.clouds.archive.ubuntu.com/ubuntu/ vivid/main amd64 Packages
100 /var/lib/dpkg/status
UnitId: rabbitmq-server/1
- MachineId: "4"
Stdout: |
rabbitmq-server:
Installed: 3.4.3-2
Candidate: 3.4.3-2
Version table:
*** 3.4.3-2 0
500 http://nova.clouds.archive.ubuntu.com/ubuntu/ vivid/main amd64 Packages
100 /var/lib/dpkg/status
UnitId: rabbitmq-server/2 |
With a 3-node native cluster in Vivid-Kilo, Trusty-Juno, and Precise-Icehouse, in greater than 50% of all attempts, one of the rabbitmq-server units fails to cluster. When this happens, we end up with a 2-node cluster and a separate 1-node cluster, while juju status still indicates happiness. In Trusty-Icehouse, the race is much less frequent.
The min-cluster-size and max-cluster-tries code does not appear to be hit. The above is observed with juju 1.24.5 with LE (leader election).
When I try with juju 1.22.1 (fallback cluster approach), I get no clustered units (i.e. 3 separate single-node clusters).
Test scenario: a basic 3-node rabbitmq-server native cluster, with nrpe as a subordinate to exercise nrpe-external-master functionality, and with cinder to exercise and inspect amqp relation data.
DNS does not appear to play a role here, as all machines can resolve all other machines, forward and reverse, when this cluster failure is observed.
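For anyone wanting to repeat the forward and reverse resolution check without running `host` by hand on each machine, here is a minimal sketch. The function name `check_forward_reverse` and its injectable `forward`/`reverse` parameters are illustrative assumptions, not part of the charm or the amulet tests; the defaults use the standard library resolver calls.

```python
import socket


def check_forward_reverse(hostnames,
                          forward=socket.gethostbyname,
                          reverse=lambda ip: socket.gethostbyaddr(ip)[0]):
    """Return a list of (hostname, reason) for names that fail the check.

    A name passes when it resolves to an address and the PTR record for
    that address starts with the same short hostname.
    """
    failures = []
    for name in hostnames:
        try:
            ip = forward(name)
            ptr = reverse(ip)
        except OSError as exc:  # gaierror and herror are OSError subclasses
            failures.append((name, str(exc)))
            continue
        # Compare short names so 'host' matches 'host.openstacklocal.'
        if ptr.rstrip(".").split(".")[0] != name.split(".")[0]:
            failures.append((name, "PTR for %s is %s" % (ip, ptr)))
    return failures
```

An empty list from `check_forward_reverse(["juju-beis0-machine-2", "juju-beis0-machine-3", "juju-beis0-machine-4"])` on each unit would correspond to the "name resolution is fine" observation in the per-unit sections.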
FYI, when the cluster does succeed on V-K, a separate, seemingly-unrelated bug is consistently hit (bug 1485722).
# VK amulet results
2015-08-18 17:49:03,637 test_300_rmq_config INFO: OK
2015-08-18 17:49:03,637 test_400_rmq_cluster_running_nodes DEBUG: Checking that all units are in cluster_status running nodes...
2015-08-18 17:49:08,219 get_unit_hostnames DEBUG: Unit host names: {'rabbitmq-server/2': 'juju-beis0-machine-4', 'rabbitmq-server/0': 'juju-beis0-machine-2', 'rabbitmq-server/1': 'juju-beis0-machine-3'}
2015-08-18 17:49:09,932 run_cmd_unit DEBUG: rabbitmq-server/0 `rabbitmqctl cluster_status` command returned 0 (OK)
2015-08-18 17:49:09,932 get_rmq_cluster_status DEBUG: rabbitmq-server/0 cluster_status:
Cluster status of node 'rabbit@juju-beis0-machine-2' ...
[{nodes,[{disc,['rabbit@juju-beis0-machine-2']}]},
{running_nodes,['rabbit@juju-beis0-machine-2']},
{cluster_name,<<"rabbit@juju-beis0-machine-2.openstacklocal">>},
{partitions,[]}]
2015-08-18 17:49:11,578 run_cmd_unit DEBUG: rabbitmq-server/1 `rabbitmqctl cluster_status` command returned 0 (OK)
2015-08-18 17:49:11,578 get_rmq_cluster_status DEBUG: rabbitmq-server/1 cluster_status:
Cluster status of node 'rabbit@juju-beis0-machine-3' ...
[{nodes,[{disc,['rabbit@juju-beis0-machine-3',
'rabbit@juju-beis0-machine-4']}]},
{running_nodes,['rabbit@juju-beis0-machine-4','rabbit@juju-beis0-machine-3']},
{cluster_name,<<"rabbit@juju-beis0-machine-4.openstacklocal">>},
{partitions,[]}]
2015-08-18 17:49:13,224 run_cmd_unit DEBUG: rabbitmq-server/2 `rabbitmqctl cluster_status` command returned 0 (OK)
2015-08-18 17:49:13,226 get_rmq_cluster_status DEBUG: rabbitmq-server/2 cluster_status:
Cluster status of node 'rabbit@juju-beis0-machine-4' ...
[{nodes,[{disc,['rabbit@juju-beis0-machine-3',
'rabbit@juju-beis0-machine-4']}]},
{running_nodes,['rabbit@juju-beis0-machine-3','rabbit@juju-beis0-machine-4']},
{cluster_name,<<"rabbit@juju-beis0-machine-4.openstacklocal">>},
{partitions,[]}]
Cluster member check failed on rabbitmq-server/0: rabbit@juju-beis0-machine-3 not in [u'rabbit@juju-beis0-machine-2']
Cluster member check failed on rabbitmq-server/0: rabbit@juju-beis0-machine-4 not in [u'rabbit@juju-beis0-machine-2']
Cluster member check failed on rabbitmq-server/1: rabbit@juju-beis0-machine-2 not in [u'rabbit@juju-beis0-machine-4', u'rabbit@juju-beis0-machine-3']
Cluster member check failed on rabbitmq-server/2: rabbit@juju-beis0-machine-2 not in [u'rabbit@juju-beis0-machine-3', u'rabbit@juju-beis0-machine-4']
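The membership check that produces the failures above can be approximated as follows. This is a rough sketch, not the amulet test's actual code: the helper names `running_nodes` and `missing_members` are hypothetical, and it assumes the `rabbitmqctl cluster_status` text format shown in this report (RabbitMQ 3.4.x).

```python
import re


def running_nodes(cluster_status_output):
    """Extract node names from the {running_nodes,[...]} term."""
    match = re.search(r"\{running_nodes,\[(.*?)\]\}",
                      cluster_status_output, re.S)
    if not match:
        return []
    return re.findall(r"'([^']+)'", match.group(1))


def missing_members(unit_hostnames, status_by_unit):
    """Return (unit, node) pairs where an expected node is not running.

    unit_hostnames maps unit name to hostname; status_by_unit maps unit
    name to that unit's raw cluster_status output.
    """
    expected = sorted("rabbit@" + host for host in unit_hostnames.values())
    problems = []
    for unit, output in sorted(status_by_unit.items()):
        nodes = running_nodes(output)
        for node in expected:
            if node not in nodes:
                problems.append((unit, node))
    return problems
```

Fed the three status outputs above, a check like this would report two missing nodes for rabbitmq-server/0 and one each for rabbitmq-server/1 and rabbitmq-server/2, matching the four failure lines.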
# VK rabbitmq-server/2 unit failed to cluster:
2015-08-18 17:44:27 INFO juju-log cluster:1: Clustering with remote rabbit host (juju-beis0-machine-2).
2015-08-18 17:44:27 INFO cluster-relation-changed Stopping node 'rabbit@juju-beis0-machine-4' ...
2015-08-18 17:44:28 INFO cluster-relation-changed Clustering node 'rabbit@juju-beis0-machine-4' with 'rabbit@juju-beis0-machine-2' ...
2015-08-18 17:44:28 INFO cluster-relation-changed Error: unable to connect to nodes ['rabbit@juju-beis0-machine-2']: nodedown
2015-08-18 17:44:28 INFO cluster-relation-changed
2015-08-18 17:44:28 INFO cluster-relation-changed DIAGNOSTICS
2015-08-18 17:44:28 INFO cluster-relation-changed ===========
2015-08-18 17:44:28 INFO cluster-relation-changed
2015-08-18 17:44:28 INFO cluster-relation-changed attempted to contact: ['rabbit@juju-beis0-machine-2']
2015-08-18 17:44:28 INFO cluster-relation-changed
2015-08-18 17:44:28 INFO cluster-relation-changed rabbit@juju-beis0-machine-2:
2015-08-18 17:44:28 INFO cluster-relation-changed * connected to epmd (port 4369) on juju-beis0-machine-2
2015-08-18 17:44:28 INFO cluster-relation-changed * epmd reports node 'rabbit' running on port 25672
2015-08-18 17:44:28 INFO cluster-relation-changed * TCP connection succeeded but Erlang distribution failed
2015-08-18 17:44:28 INFO cluster-relation-changed * suggestion: hostname mismatch?
2015-08-18 17:44:28 INFO cluster-relation-changed * suggestion: is the cookie set correctly?
2015-08-18 17:44:28 INFO cluster-relation-changed
2015-08-18 17:44:28 INFO cluster-relation-changed current node details:
2015-08-18 17:44:28 INFO cluster-relation-changed - node name: 'rabbitmqctl-17379@juju-beis0-machine-4'
2015-08-18 17:44:28 INFO cluster-relation-changed - home dir: /var/lib/rabbitmq
2015-08-18 17:44:28 INFO cluster-relation-changed - cookie hash: j7UJuJx3ZktAni0tPfaRxw==
2015-08-18 17:44:28 INFO cluster-relation-changed
2015-08-18 17:44:28 INFO juju-log cluster:1: Failed to cluster with juju-beis0-machine-2.
# rabbitmq-server/2 (juju-beis0-machine-4)
Name resolution is fine. Attempted to cluster with rabbitmq-server/0 (juju-beis0-machine-2), failed. Clustered ok with rabbitmq-server/1 (juju-beis0-machine-3).
Full unit log: http://paste.ubuntu.com/12119674/
root@juju-beis0-machine-4:/var/log/juju# cat /etc/hostname
juju-beis0-machine-4
root@juju-beis0-machine-4:/var/log/juju# ip a | grep gl
inet 172.18.99.100/24 brd 172.18.99.255 scope global eth0
root@juju-beis0-machine-4:/var/log/juju# host juju-beis0-machine-2
juju-beis0-machine-2.openstacklocal has address 172.18.99.98
root@juju-beis0-machine-4:/var/log/juju# host juju-beis0-machine-3
juju-beis0-machine-3.openstacklocal has address 172.18.99.99
root@juju-beis0-machine-4:/var/log/juju# host juju-beis0-machine-4
juju-beis0-machine-4.openstacklocal has address 172.18.99.100
root@juju-beis0-machine-4:/var/log/juju# host 172.18.99.98
98.99.18.172.in-addr.arpa domain name pointer juju-beis0-machine-2.openstacklocal.
root@juju-beis0-machine-4:/var/log/juju# host 172.18.99.99
99.99.18.172.in-addr.arpa domain name pointer juju-beis0-machine-3.openstacklocal.
root@juju-beis0-machine-4:/var/log/juju# host 172.18.99.100
100.99.18.172.in-addr.arpa domain name pointer juju-beis0-machine-4.openstacklocal.
# rabbitmq-server/0 (juju-beis0-machine-2)
Name resolution is fine. cluster-relation-* hooks never fired.
Full unit log: http://paste.ubuntu.com/12119672/
root@juju-beis0-machine-2:/var/log/juju# cat /etc/hostname
juju-beis0-machine-2
root@juju-beis0-machine-2:/var/log/juju# ip a | grep gl
inet 172.18.99.98/24 brd 172.18.99.255 scope global eth0
root@juju-beis0-machine-2:/var/log/juju# host juju-beis0-machine-2
juju-beis0-machine-2.openstacklocal has address 172.18.99.98
root@juju-beis0-machine-2:/var/log/juju# host juju-beis0-machine-3
juju-beis0-machine-3.openstacklocal has address 172.18.99.99
root@juju-beis0-machine-2:/var/log/juju# host juju-beis0-machine-4
juju-beis0-machine-4.openstacklocal has address 172.18.99.100
root@juju-beis0-machine-2:/var/log/juju# host 172.18.99.98
98.99.18.172.in-addr.arpa domain name pointer juju-beis0-machine-2.openstacklocal.
root@juju-beis0-machine-2:/var/log/juju# host 172.18.99.99
99.99.18.172.in-addr.arpa domain name pointer juju-beis0-machine-3.openstacklocal.
root@juju-beis0-machine-2:/var/log/juju# host 172.18.99.100
100.99.18.172.in-addr.arpa domain name pointer juju-beis0-machine-4.openstacklocal.
# rabbitmq-server/1 (juju-beis0-machine-3)
Name resolution is fine. Clustered ok with rabbitmq-server/2 (juju-beis0-machine-4).
Full unit log: http://paste.ubuntu.com/12119695/
root@juju-beis0-machine-3:/var/log/juju# cat /etc/hostname
juju-beis0-machine-3
root@juju-beis0-machine-3:/var/log/juju# ip a | grep gl
inet 172.18.99.99/24 brd 172.18.99.255 scope global eth0
root@juju-beis0-machine-3:/var/log/juju# host juju-beis0-machine-2
juju-beis0-machine-2.openstacklocal has address 172.18.99.98
root@juju-beis0-machine-3:/var/log/juju# host juju-beis0-machine-3
juju-beis0-machine-3.openstacklocal has address 172.18.99.99
root@juju-beis0-machine-3:/var/log/juju# host juju-beis0-machine-4
juju-beis0-machine-4.openstacklocal has address 172.18.99.100
root@juju-beis0-machine-3:/var/log/juju# host 172.18.99.98
98.99.18.172.in-addr.arpa domain name pointer juju-beis0-machine-2.openstacklocal.
root@juju-beis0-machine-3:/var/log/juju# host 172.18.99.99
99.99.18.172.in-addr.arpa domain name pointer juju-beis0-machine-3.openstacklocal.
root@juju-beis0-machine-3:/var/log/juju# host 172.18.99.100
100.99.18.172.in-addr.arpa domain name pointer juju-beis0-machine-4.openstacklocal.
# VK juju stat
http://paste.ubuntu.com/12119730/
# rmq versions
ubuntu@beisner-bastion:~/bzr/next/rabbitmq-server/tests$ juju run --service rabbitmq-server "apt-cache policy rabbitmq-server"
- MachineId: "2"
Stdout: |
rabbitmq-server:
Installed: 3.4.3-2
Candidate: 3.4.3-2
Version table:
*** 3.4.3-2 0
500 http://nova.clouds.archive.ubuntu.com/ubuntu/ vivid/main amd64 Packages
100 /var/lib/dpkg/status
UnitId: rabbitmq-server/0
- MachineId: "3"
Stdout: |
rabbitmq-server:
Installed: 3.4.3-2
Candidate: 3.4.3-2
Version table:
*** 3.4.3-2 0
500 http://nova.clouds.archive.ubuntu.com/ubuntu/ vivid/main amd64 Packages
100 /var/lib/dpkg/status
UnitId: rabbitmq-server/1
- MachineId: "4"
Stdout: |
rabbitmq-server:
Installed: 3.4.3-2
Candidate: 3.4.3-2
Version table:
*** 3.4.3-2 0
500 http://nova.clouds.archive.ubuntu.com/ubuntu/ vivid/main amd64 Packages
100 /var/lib/dpkg/status
UnitId: rabbitmq-server/2 |
|
2015-08-20 15:33:10 |
Ryan Beisner |
summary |
3-node native cluster doesn't always cluster race: cluster-relation-changed Error: unable to connect to nodes ['rabbit@juju-X-machine-N']: nodedown |
3-node native rabbitmq cluster race |
|
2015-08-21 20:18:34 |
David Ames |
branch linked |
|
lp:~thedac/charms/trusty/rabbitmq-server/native-cluster-race-fixes |
|
2015-08-21 20:18:50 |
David Ames |
rabbitmq-server (Juju Charms Collection): assignee |
|
David Ames (thedac) |
|
2015-08-21 21:53:25 |
Ryan Beisner |
branch unlinked |
lp:~1chb1n/charms/trusty/rabbitmq-server/next.amulet-fix-20-delay |
|
|
2015-08-21 21:53:49 |
Ryan Beisner |
branch linked |
|
lp:~1chb1n/charms/trusty/rabbitmq-server/amulet-refactor-1508 |
|
2015-08-25 22:22:47 |
Billy Olsen |
tags |
amulet openstack uosci |
amulet backport-potential openstack uosci |
|
2015-08-25 23:52:57 |
Lei Wang |
bug |
|
|
added subscriber Ray Wang |
2015-08-31 14:21:23 |
Mario Splivalo |
bug |
|
|
added subscriber Mario Splivalo |
2015-09-01 01:20:02 |
Ryan Beisner |
branch linked |
|
lp:~1chb1n/charms/trusty/rabbitmq-server/amulet-refactor-1509 |
|
2015-09-01 10:25:58 |
Edward Hope-Morley |
rabbitmq-server (Juju Charms Collection): status |
Confirmed |
Fix Committed |
|
2015-09-01 10:26:01 |
Edward Hope-Morley |
rabbitmq-server (Juju Charms Collection): milestone |
|
15.10 |
|
2015-09-01 10:26:12 |
Edward Hope-Morley |
tags |
amulet backport-potential openstack uosci |
amulet backport-potential openstack sts uosci |
|
2015-09-01 14:24:24 |
Ryan Beisner |
rabbitmq-server (Juju Charms Collection): status |
Fix Committed |
Confirmed |
|
2015-09-02 20:01:46 |
Ryan Beisner |
branch unlinked |
lp:~1chb1n/charms/trusty/rabbitmq-server/amulet-refactor-1509 |
|
|
2015-09-09 13:00:06 |
Ryan Beisner |
branch linked |
|
lp:~1chb1n/ubuntu-openstack-ci/osi-1509 |
|
2015-09-09 13:03:32 |
Launchpad Janitor |
branch linked |
|
lp:~openstack-charmers/charms/trusty/rabbitmq-server/next |
|
2015-09-09 13:05:16 |
Ryan Beisner |
branch unlinked |
lp:~1chb1n/ubuntu-openstack-ci/osi-1509 |
|
|
2015-09-09 13:05:22 |
Ryan Beisner |
branch linked |
|
lp:~1chb1n/charms/trusty/rabbitmq-server/amulet-refactor-1509b |
|
2015-09-09 13:08:38 |
Ryan Beisner |
branch unlinked |
lp:~1chb1n/charms/trusty/rabbitmq-server/amulet-refactor-1508 |
|
|
2015-09-10 12:37:58 |
Adam Collard |
bug |
|
|
added subscriber Landscape |
2015-09-10 12:38:21 |
Jonathan Davies |
bug |
|
|
added subscriber Jonathan Davies |
2015-09-10 12:45:11 |
Andreas Hasenack |
tags |
amulet backport-potential openstack sts uosci |
amulet backport-potential cisco landscape openstack sts uosci |
|
2015-09-10 13:29:44 |
Adam Collard |
tags |
amulet backport-potential cisco landscape openstack sts uosci |
amulet backport-potential cisco landscape landscape-release-29 openstack sts uosci |
|
2015-09-10 13:40:25 |
Liam Young |
rabbitmq-server (Juju Charms Collection): status |
Confirmed |
Fix Committed |
|
2015-09-10 16:43:46 |
David Ames |
branch linked |
|
lp:~thedac/charms/trusty/rabbitmq-server/backport-cluster-race-fixes |
|
2015-09-11 14:03:53 |
Lorenzo Cavassa |
bug |
|
|
added subscriber Lorenzo Cavassa |
2015-09-16 08:13:22 |
Andreas Hasenack |
bug task added |
|
landscape |
|
2015-09-16 08:13:35 |
Andreas Hasenack |
nominated for series |
|
landscape/cisco-odl |
|
2015-09-16 08:13:35 |
Andreas Hasenack |
bug task added |
|
landscape/cisco-odl |
|
2015-09-16 08:13:49 |
Andreas Hasenack |
landscape/cisco-odl: importance |
Undecided |
High |
|
2015-09-16 08:13:51 |
Andreas Hasenack |
landscape: importance |
Undecided |
High |
|
2015-09-16 13:16:27 |
Andreas Hasenack |
landscape/cisco-odl: status |
New |
In Progress |
|
2015-09-16 13:16:29 |
Andreas Hasenack |
landscape/cisco-odl: assignee |
|
Andreas Hasenack (ahasenack) |
|
2015-09-16 13:22:06 |
Andreas Hasenack |
landscape/cisco-odl: milestone |
|
falkor-0.9 |
|
2015-09-16 22:51:24 |
Matt Rae |
bug |
|
|
added subscriber Matt Rae |
2015-09-17 14:07:40 |
Andreas Hasenack |
nominated for series |
|
landscape/release-29 |
|
2015-09-17 14:07:40 |
Andreas Hasenack |
bug task added |
|
landscape/release-29 |
|
2015-09-17 14:35:03 |
Andreas Hasenack |
landscape/release-29: status |
New |
In Progress |
|
2015-09-17 14:35:06 |
Andreas Hasenack |
landscape/release-29: importance |
Undecided |
High |
|
2015-09-17 14:35:09 |
Andreas Hasenack |
landscape/release-29: assignee |
|
Andreas Hasenack (ahasenack) |
|
2015-09-17 15:13:53 |
Andreas Hasenack |
landscape: status |
New |
In Progress |
|
2015-09-17 15:13:57 |
Andreas Hasenack |
landscape: assignee |
|
Andreas Hasenack (ahasenack) |
|
2015-09-17 17:35:40 |
🤖 Landscape Builder |
landscape/release-29: status |
In Progress |
Fix Committed |
|
2015-09-17 17:35:40 |
🤖 Landscape Builder |
landscape/release-29: milestone |
|
15.08 |
|
2015-09-18 07:24:29 |
Andreas Hasenack |
landscape: status |
In Progress |
Fix Committed |
|
2015-09-18 07:24:35 |
Andreas Hasenack |
landscape: milestone |
|
15.08 |
|
2015-09-18 07:24:38 |
Andreas Hasenack |
landscape/release-29: milestone |
15.08 |
15.07 |
|
2015-09-23 14:51:43 |
Andreas Hasenack |
landscape/cisco-odl: status |
In Progress |
Fix Committed |
|
2015-09-30 07:36:03 |
Björn Tillenius |
landscape/cisco-odl: status |
Fix Committed |
Fix Released |
|
2015-10-07 14:42:17 |
David Britton |
tags |
amulet backport-potential cisco landscape landscape-release-29 openstack sts uosci |
amulet backport-potential cisco landscape openstack sts uosci |
|
2015-10-13 15:04:36 |
Andreas Hasenack |
landscape/release-29: status |
Fix Committed |
Fix Released |
|
2015-10-13 15:04:39 |
Andreas Hasenack |
landscape: status |
Fix Committed |
Fix Released |
|
2015-10-13 15:04:42 |
Andreas Hasenack |
landscape: milestone |
15.08 |
15.07 |
|
2015-10-22 13:42:54 |
James Page |
rabbitmq-server (Juju Charms Collection): status |
Fix Committed |
Fix Released |
|