Activity log for bug #1459173

Date Who What changed Old value New value Message
2015-05-27 09:41:39 Bogdan Dobrelya bug added bug
2015-05-27 09:41:46 Bogdan Dobrelya fuel: status New Confirmed
2015-05-27 09:41:55 Bogdan Dobrelya fuel: importance Undecided High
2015-05-27 09:42:00 Bogdan Dobrelya fuel: assignee Bogdan Dobrelya (bogdando)
2015-05-27 09:42:03 Bogdan Dobrelya fuel: milestone 6.1
2015-05-27 09:42:11 Bogdan Dobrelya nominated for series fuel/5.1.x
2015-05-27 09:42:11 Bogdan Dobrelya bug task added fuel/5.1.x
2015-05-27 09:42:11 Bogdan Dobrelya nominated for series fuel/6.0.x
2015-05-27 09:42:11 Bogdan Dobrelya bug task added fuel/6.0.x
2015-05-27 09:42:16 Bogdan Dobrelya fuel/5.1.x: status New Confirmed
2015-05-27 09:42:19 Bogdan Dobrelya fuel/6.0.x: status New Confirmed
2015-05-27 09:42:21 Bogdan Dobrelya fuel/5.1.x: importance Undecided High
2015-05-27 09:42:22 Bogdan Dobrelya fuel/6.0.x: importance Undecided High
2015-05-27 09:42:30 Bogdan Dobrelya fuel/5.1.x: assignee Fuel Library Team (fuel-library)
2015-05-27 09:42:39 Bogdan Dobrelya bug added subscriber Vladimir Kuklin
2015-05-27 09:42:45 Bogdan Dobrelya fuel/5.1.x: milestone 6.0.2
2015-05-27 09:42:51 Bogdan Dobrelya fuel/5.1.x: milestone 6.0.2 5.1.2
2015-05-27 09:42:59 Bogdan Dobrelya fuel/6.0.x: milestone 6.0.2
2015-05-27 09:43:06 Bogdan Dobrelya fuel/6.0.x: assignee Fuel Library Team (fuel-library)
2015-05-27 09:44:35 Bogdan Dobrelya summary RabbitMQ may hang on the cluster node removal RabbitMQ cluster node removal operation may hang for ever
2015-05-27 09:45:21 Bogdan Dobrelya description
Old value: This bug is not easy to reproduce. I managed to reproduce it only after ~300 consecutive node failovers. The repro steps can be found here: https://bugs.launchpad.net/fuel/+bug/1458830 The issue is that these commands may not work as expected: # rabbitmqctl eval "disconnect_node(list_to_atom(\"rabbit@node-1\"))."; time rabbitmqctl forget_cluster_node rabbit@node-1 and hang forever, ending up in a situation where none of the RabbitMQ nodes can't re-join the cluster after failover because they can't be forgotten and join_cluster reports they are already clustered. ISO info: build_id: 2015-05-20_08-41-33 build_number: '441' but the manifests were synced with current master.
New value: This bug is not easy to reproduce. I managed to reproduce it only after ~300 consecutive node failovers. The repro steps can be found here: https://bugs.launchpad.net/fuel/+bug/1458830 The issue is that the following commands may not work as expected (we expected that disconnecting node should help to kick it from the cluster): # rabbitmqctl eval "disconnect_node(list_to_atom(\"rabbit@node-1\"))."; time rabbitmqctl forget_cluster_node rabbit@node-1 and hang forever, ending up in a situation where none of the RabbitMQ nodes can't re-join the cluster after failover because they can't be forgotten and join_cluster reports they are already clustered. ISO info: build_id: 2015-05-20_08-41-33 build_number: '441' but the manifests were synced with current master.
2015-05-27 09:45:49 Bogdan Dobrelya description
Old value: This bug is not easy to reproduce. I managed to reproduce it only after ~300 consecutive node failovers. The repro steps can be found here: https://bugs.launchpad.net/fuel/+bug/1458830 The issue is that the following commands may not work as expected (we expected that disconnecting node should help to kick it from the cluster): # rabbitmqctl eval "disconnect_node(list_to_atom(\"rabbit@node-1\"))."; time rabbitmqctl forget_cluster_node rabbit@node-1 and hang forever, ending up in a situation where none of the RabbitMQ nodes can't re-join the cluster after failover because they can't be forgotten and join_cluster reports they are already clustered. ISO info: build_id: 2015-05-20_08-41-33 build_number: '441' but the manifests were synced with current master.
New value: This bug is not easy to reproduce. I managed to reproduce it only after ~300 consecutive node failovers. The repro steps can be found here: https://bugs.launchpad.net/fuel/+bug/1458830 The issue is that the following commands may not work as expected (we're expecting that disconnecting a node should help to kick it from the cluster): # rabbitmqctl eval "disconnect_node(list_to_atom(\"rabbit@node-1\"))."; time rabbitmqctl forget_cluster_node rabbit@node-1 and hang forever, ending up in a situation where none of the RabbitMQ nodes can't re-join the cluster after failover because they can't be forgotten and join_cluster reports they are already clustered. ISO info: build_id: 2015-05-20_08-41-33 build_number: '441' but the manifests were synced with current master.
2015-05-27 09:46:34 Bogdan Dobrelya description
Old value: This bug is not easy to reproduce. I managed to reproduce it only after ~300 consecutive node failovers. The repro steps can be found here: https://bugs.launchpad.net/fuel/+bug/1458830 The issue is that the following commands may not work as expected (we're expecting that disconnecting a node should help to kick it from the cluster): # rabbitmqctl eval "disconnect_node(list_to_atom(\"rabbit@node-1\"))."; time rabbitmqctl forget_cluster_node rabbit@node-1 and hang forever, ending up in a situation where none of the RabbitMQ nodes can't re-join the cluster after failover because they can't be forgotten and join_cluster reports they are already clustered. ISO info: build_id: 2015-05-20_08-41-33 build_number: '441' but the manifests were synced with current master.
New value: This bug is not easy to reproduce. I managed to reproduce it only after ~300 consecutive node failovers. The repro steps can be found here: https://bugs.launchpad.net/fuel/+bug/1458830 The issue is that the following commands may not work as expected (we're expecting that disconnecting a node should help to kick it from the cluster): # rabbitmqctl eval "disconnect_node(list_to_atom(\"rabbit@node-1\"))."; time rabbitmqctl forget_cluster_node rabbit@node-1 and hang forever, ending up in a situation where none of the RabbitMQ nodes can re-join the cluster on failover because they can't be forgotten and join_cluster reports they are already clustered. ISO info: build_id: 2015-05-20_08-41-33 build_number: '441' but the manifests were synced with current master.
2015-05-27 10:21:26 Bogdan Dobrelya description
Old value: This bug is not easy to reproduce. I managed to reproduce it only after ~300 consecutive node failovers. The repro steps can be found here: https://bugs.launchpad.net/fuel/+bug/1458830 The issue is that the following commands may not work as expected (we're expecting that disconnecting a node should help to kick it from the cluster): # rabbitmqctl eval "disconnect_node(list_to_atom(\"rabbit@node-1\"))."; time rabbitmqctl forget_cluster_node rabbit@node-1 and hang forever, ending up in a situation where none of the RabbitMQ nodes can re-join the cluster on failover because they can't be forgotten and join_cluster reports they are already clustered. ISO info: build_id: 2015-05-20_08-41-33 build_number: '441' but the manifests were synced with current master.
New value: This bug is not easy to reproduce. I managed to reproduce it only after ~300 consecutive node failovers. The repro steps can be found here: https://bugs.launchpad.net/fuel/+bug/1458830 The issue is that the following commands may not work as expected (we're expecting that disconnecting a node should help to kick it from the cluster, but the disconnect sometimes may fail and return false): # rabbitmqctl eval "disconnect_node(list_to_atom(\"rabbit@node-1\"))."; time rabbitmqctl forget_cluster_node rabbit@node-1 and hang forever, ending up in a situation where none of the RabbitMQ nodes can re-join the cluster on failover because they can't be forgotten and join_cluster reports they are already clustered. ISO info: build_id: 2015-05-20_08-41-33 build_number: '441' but the manifests were synced with current master.
2015-05-27 10:56:26 Bogdan Dobrelya description
Old value: This bug is not easy to reproduce. I managed to reproduce it only after ~300 consecutive node failovers. The repro steps can be found here: https://bugs.launchpad.net/fuel/+bug/1458830 The issue is that the following commands may not work as expected (we're expecting that disconnecting a node should help to kick it from the cluster, but the disconnect sometimes may fail and return false): # rabbitmqctl eval "disconnect_node(list_to_atom(\"rabbit@node-1\"))."; time rabbitmqctl forget_cluster_node rabbit@node-1 and hang forever, ending up in a situation where none of the RabbitMQ nodes can re-join the cluster on failover because they can't be forgotten and join_cluster reports they are already clustered. ISO info: build_id: 2015-05-20_08-41-33 build_number: '441' but the manifests were synced with current master.
New value: This bug is not easy to reproduce. I managed to reproduce it only after ~300 consecutive node failovers. The repro steps can be found here: https://bugs.launchpad.net/fuel/+bug/1458830 The issue is that the following commands may not work as expected (we're expecting that disconnecting a node should help to kick it from the cluster, but the disconnect sometimes may fail and return false): # rabbitmqctl eval "disconnect_node(list_to_atom(\"rabbit@node-1\"))."; time rabbitmqctl forget_cluster_node rabbit@node-1 and hang forever, ending up in a situation where none of the RabbitMQ nodes can re-join the cluster on failover because they can't be forgotten and join_cluster reports they are already clustered. Note that in the given scenario, the AMQP cluster remains completely down because nodes cannot join the Mnesia master, and the latter is running in a broken state: rabbitmqctl list_channels hangs as well. Perhaps the only solution is to detect in the monitor whether list_channels hangs and restart the affected nodes. This will introduce full cluster downtime until a new Mnesia master is elected, but at least it will ensure the cluster is reassembled. ISO info: build_id: 2015-05-20_08-41-33 build_number: '441' but the manifests were synced with current master.
2015-05-27 13:36:51 Bogdan Dobrelya summary RabbitMQ cluster node removal operation may hang for ever RabbitMQ cluster node removal operation may hang for ever as rabbitmqctl may hang
2015-05-27 14:15:28 OpenStack Infra fuel: status Confirmed In Progress
2015-05-27 17:37:37 OpenStack Infra fuel: status In Progress Fix Committed
2015-07-13 10:13:57 Bogdan Dobrelya fuel/5.1.x: assignee Fuel Library Team (fuel-library) MOS Sustaining (mos-sustaining)
2015-07-13 10:14:11 Bogdan Dobrelya fuel/6.0.x: assignee Fuel Library Team (fuel-library) MOS Sustaining (mos-sustaining)
2015-07-13 10:14:26 Bogdan Dobrelya fuel/5.1.x: status Confirmed Triaged
2015-07-13 10:14:28 Bogdan Dobrelya fuel/6.0.x: status Confirmed Triaged
2015-09-22 14:54:07 Bogdan Dobrelya tags ha rabbitmq
2015-09-26 11:07:27 Vitaly Sedelnik fuel/6.0.x: milestone 6.0.2 6.0.1
2015-10-26 12:55:51 Vitaly Sedelnik fuel/5.1.x: assignee MOS Maintenance (mos-maintenance) Denis Meltsaykin (dmeltsaykin)
2015-10-26 12:55:57 Vitaly Sedelnik fuel/6.0.x: assignee MOS Maintenance (mos-maintenance) Denis Meltsaykin (dmeltsaykin)
2015-10-26 12:55:59 Vitaly Sedelnik fuel/5.1.x: milestone 5.1.1-updates 5.1.1-mu-2
2015-10-26 12:56:02 Vitaly Sedelnik fuel/6.0.x: milestone 6.0-updates 6.0-mu-7
2015-10-26 13:44:23 Denis Meltsaykin fuel/5.1.x: status Triaged Won't Fix
2015-10-26 13:44:26 Denis Meltsaykin fuel/6.0.x: status Triaged Won't Fix
2015-10-26 15:07:45 Vitaly Sedelnik fuel/5.1.x: milestone 5.1.1-mu-2 5.1.1-updates
2015-10-26 15:07:48 Vitaly Sedelnik fuel/6.0.x: milestone 6.0-mu-7 6.0-updates
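
The 2015-05-27 10:56:26 description revision suggests detecting in the monitor whether rabbitmqctl list_channels hangs and restarting the affected nodes. A minimal Python sketch of only the hang check follows. It is an illustration, not the fix that was committed for this bug: the helper names command_hangs and rabbit_list_channels_hangs and the 30-second default timeout are assumptions, and the restart step is left out because the log does not say how it would be done.

```python
import subprocess


def command_hangs(cmd, timeout_s=30.0):
    """Return True if cmd does not finish within timeout_s seconds.

    Hypothetical helper: subprocess.run kills the child process when the
    timeout expires, so a hung command is not left running.
    """
    try:
        subprocess.run(
            cmd,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=timeout_s,
            check=False,  # a non-zero exit is not a hang
        )
    except subprocess.TimeoutExpired:
        return True
    return False


def rabbit_list_channels_hangs(timeout_s=30.0):
    """Check whether `rabbitmqctl list_channels` hangs on the local node.

    The timeout value is an assumption; the bug report does not give one.
    """
    return command_hangs(["rabbitmqctl", "list_channels"], timeout_s)
```

A monitor could call rabbit_list_channels_hangs() periodically and treat a True result as the signal to restart the node, accepting the full cluster downtime that the description mentions.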