Activity log for bug #1688581

Date Who What changed Old value New value Message
2017-05-05 14:17:46 Dmitry Mescheryakov bug added bug
2017-05-05 14:17:56 Dmitry Mescheryakov tags area-oslo
2017-05-05 14:18:03 Dmitry Mescheryakov tags area-oslo area-oslo customer-found
2017-05-05 14:18:07 Dmitry Mescheryakov mos: status New Confirmed
2017-05-05 14:18:09 Dmitry Mescheryakov mos: importance Undecided Medium
2017-05-05 14:18:11 Dmitry Mescheryakov mos: assignee Dmitry Mescheryakov (dmitrymex)
2017-05-05 14:18:14 Dmitry Mescheryakov mos: milestone 9.x-updates
2017-05-05 14:19:47 Dmitry Mescheryakov description Version: 9.x Steps to reproduce: 1. Deploy a MOS env with 3 controllers and 1 compute node 2. Download that file and save it as simulator.py: http://paste.openstack.org/show/608983/ That is a modified copy of upstream simulator, if you are curious, make a diff against https://github.com/openstack/oslo.messaging/blob/master/tools/simulator.py 3. Go to compute node and apply that patch http://paste.openstack.org/show/608984/ to /usr/lib/python2.7/dist-packages/oslo_messaging/_drivers/impl_rabbit.py 4. In console set the following variable: RABBIT_URL=rabbit://<user>:<pass>@<node_1_ip>:5673,<user>:<pass>@<node_2_ip>:5673,<user>:<pass>@<node_3_ip>:5673/ Populate user, pass and node_x_ip using the following parameters from /etc/nova/nova.conf: rabbit_hosts, rabbit_userid and rabbit_password 5. Run python simulator.py --url $RABBIT_URL rpc-server -w 1 6. Open another console to controller, which IP goes first in RABBIT_URL list. 7. Open yet another console to the compute node and populate RABBIT_URL variable here as well. 8. Here run python simulator.py --url $RABBIT_URL rpc-client --timeout 10 -m 2 -w 10 With that command simulator will send 2 messages (-m) with timeout set to 10 seconds (--timeout) and interval between messages 10 seconds (-w) 9. Wait for simulator to send the first message and receives response from rpc-server. It is done ones the following lines appear in console: 2017-05-05 14:05:22,129 DEBUG oslo_messaging._drivers.amqpdriver CALL msg_id: ... 2017-05-05 14:05:23,153 DEBUG oslo_messaging._drivers.amqpdriver received reply msg_id: ... 10. Once you see these lines, quickly (you have 10 seconds to do that) switch to controller console opened in step #6 and here execute iptables -I OUTPUT 1 -p tcp --sport 5673 -j DROP That will block Rabbit traffic to that node. 11. Observe the following lines next: 2017-05-05 14:11:10,076 DEBUG oslo_messaging._drivers.amqpdriver CALL msg_id: ... 
2017-05-05 14:12:10,087 DEBUG oslo.messaging._drivers.impl_rabbit Received recoverable error from kombu What goes next is of no importance. Important here is that it takes oslo.messaging 60 seconds to surrender first attempt and try to reconnect while the timeout is 10 seconds. oslo.messaging should react much faster to the problem. Version: 9.x Steps to reproduce: 1. Deploy a MOS env with 3 controllers and 1 compute node 2. Download that file and save it as simulator.py: http://paste.openstack.org/show/608983/    That is a modified copy of upstream simulator, if you are curious, make a diff against https://github.com/openstack/oslo.messaging/blob/master/tools/simulator.py 3. Go to compute node and apply that patch http://paste.openstack.org/show/608984/ to /usr/lib/python2.7/dist-packages/oslo_messaging/_drivers/impl_rabbit.py 4. In console set the following variable:    RABBIT_URL=rabbit://<user>:<pass>@<node_1_ip>:5673,<user>:<pass>@<node_2_ip>:5673,<user>:<pass>@<node_3_ip>:5673/    Populate user, pass and node_x_ip using the following parameters from /etc/nova/nova.conf: rabbit_hosts, rabbit_userid and rabbit_password 5. Run    python simulator.py --url $RABBIT_URL rpc-server -w 1 6. Open another console to controller, which IP goes first in RABBIT_URL list. 7. Open yet another console to the compute node and populate RABBIT_URL variable here as well. 8. Here run    python simulator.py --url $RABBIT_URL rpc-client --timeout 10 -m 2 -w 10    With that command simulator will send 2 messages (-m) with timeout set to 10 seconds (--timeout) and interval between messages 10 seconds (-w) 9. Wait for simulator to send the first message and receives response from rpc-server. It is done ones the following lines appear in console: 2017-05-05 14:05:22,129 DEBUG oslo_messaging._drivers.amqpdriver CALL msg_id: ... 2017-05-05 14:05:23,153 DEBUG oslo_messaging._drivers.amqpdriver received reply msg_id: ... 10. 
Once you see these lines, quickly (you have 10 seconds to do that) switch to controller console opened in step #6 and here execute     iptables -I OUTPUT 1 -p tcp --sport 5673 -j DROP     That will block Rabbit traffic to that node. 11. Observe the following lines next: 2017-05-05 14:11:10,076 DEBUG oslo_messaging._drivers.amqpdriver CALL msg_id: ... 2017-05-05 14:12:10,087 DEBUG oslo.messaging._drivers.impl_rabbit Received recoverable error from kombu What goes next is of no importance. Important here is that it takes oslo.messaging 60 seconds to surrender first attempt and try to reconnect while the timeout is 10 seconds. oslo.messaging should react much faster to the problem. To remove iptables rule set in step #10 on controller execute iptables -D OUTPUT -p tcp --sport 5673 -j DROP
2017-05-10 11:42:34 Dmitry Mescheryakov description Version: 9.x Steps to reproduce: 1. Deploy a MOS env with 3 controllers and 1 compute node 2. Download that file and save it as simulator.py: http://paste.openstack.org/show/608983/    That is a modified copy of upstream simulator, if you are curious, make a diff against https://github.com/openstack/oslo.messaging/blob/master/tools/simulator.py 3. Go to compute node and apply that patch http://paste.openstack.org/show/608984/ to /usr/lib/python2.7/dist-packages/oslo_messaging/_drivers/impl_rabbit.py 4. In console set the following variable:    RABBIT_URL=rabbit://<user>:<pass>@<node_1_ip>:5673,<user>:<pass>@<node_2_ip>:5673,<user>:<pass>@<node_3_ip>:5673/    Populate user, pass and node_x_ip using the following parameters from /etc/nova/nova.conf: rabbit_hosts, rabbit_userid and rabbit_password 5. Run    python simulator.py --url $RABBIT_URL rpc-server -w 1 6. Open another console to controller, which IP goes first in RABBIT_URL list. 7. Open yet another console to the compute node and populate RABBIT_URL variable here as well. 8. Here run    python simulator.py --url $RABBIT_URL rpc-client --timeout 10 -m 2 -w 10    With that command simulator will send 2 messages (-m) with timeout set to 10 seconds (--timeout) and interval between messages 10 seconds (-w) 9. Wait for simulator to send the first message and receives response from rpc-server. It is done ones the following lines appear in console: 2017-05-05 14:05:22,129 DEBUG oslo_messaging._drivers.amqpdriver CALL msg_id: ... 2017-05-05 14:05:23,153 DEBUG oslo_messaging._drivers.amqpdriver received reply msg_id: ... 10. Once you see these lines, quickly (you have 10 seconds to do that) switch to controller console opened in step #6 and here execute     iptables -I OUTPUT 1 -p tcp --sport 5673 -j DROP     That will block Rabbit traffic to that node. 11. 
Observe the following lines next: 2017-05-05 14:11:10,076 DEBUG oslo_messaging._drivers.amqpdriver CALL msg_id: ... 2017-05-05 14:12:10,087 DEBUG oslo.messaging._drivers.impl_rabbit Received recoverable error from kombu What goes next is of no importance. Important here is that it takes oslo.messaging 60 seconds to surrender first attempt and try to reconnect while the timeout is 10 seconds. oslo.messaging should react much faster to the problem. To remove iptables rule set in step #10 on controller execute iptables -D OUTPUT -p tcp --sport 5673 -j DROP Version: 9.x Steps to reproduce: 1. Deploy a MOS env with 3 controllers and 1 compute node 2. Download that file and save it as simulator.py: http://paste.openstack.org/show/608983/    That is a modified copy of upstream simulator, if you are curious, make a diff against https://github.com/openstack/oslo.messaging/blob/master/tools/simulator.py 3. Go to compute node and apply that patch http://paste.openstack.org/show/608984/ to /usr/lib/python2.7/dist-packages/oslo_messaging/_drivers/impl_rabbit.py 4. In that console set the following variable:    RABBIT_URL=rabbit://<user>:<pass>@<node_1_ip>:5673,<user>:<pass>@<node_2_ip>:5673,<user>:<pass>@<node_3_ip>:5673/    Populate user, pass and node_x_ip using the following parameters from /etc/nova/nova.conf: rabbit_hosts, rabbit_userid and rabbit_password 5. Run    python simulator.py --url $RABBIT_URL rpc-server -w 1 6. Open another console to controller, which IP goes first in RABBIT_URL list. 7. Open yet another console to the compute node and populate RABBIT_URL variable here as well. 8. Here run    python simulator.py --url $RABBIT_URL rpc-client --timeout 10 -m 2 -w 10    With that command simulator will send 2 messages (-m) with timeout set to 10 seconds (--timeout) and interval between messages 10 seconds (-w) 9. Wait for simulator to send the first message and receives response from rpc-server. 
It is done ones the following lines appear in console: 2017-05-05 14:05:22,129 DEBUG oslo_messaging._drivers.amqpdriver CALL msg_id: ... 2017-05-05 14:05:23,153 DEBUG oslo_messaging._drivers.amqpdriver received reply msg_id: ... 10. Once you see these lines, quickly (you have 10 seconds to do that) switch to controller console opened in step #6 and here execute     iptables -I OUTPUT 1 -p tcp --sport 5673 -j DROP     That will block AMQP traffic to that node. 11. Observe the following lines next: 2017-05-05 14:11:10,076 DEBUG oslo_messaging._drivers.amqpdriver CALL msg_id: ... 2017-05-05 14:12:10,087 DEBUG oslo.messaging._drivers.impl_rabbit Received recoverable error from kombu What goes next is of no importance. Important here is that it takes oslo.messaging 60 seconds to surrender first attempt and try to reconnect while the timeout is 10 seconds. oslo.messaging should react much faster to the problem. To remove iptables rule set in step #10 on controller execute iptables -D OUTPUT -p tcp --sport 5673 -j DROP That is bug is very similar to https://bugs.launchpad.net/mos/+bug/1689801 and it would be easier to verify them together.
2017-05-10 11:46:54 Dmitry Mescheryakov description Version: 9.x Steps to reproduce: 1. Deploy a MOS env with 3 controllers and 1 compute node 2. Download that file and save it as simulator.py: http://paste.openstack.org/show/608983/    That is a modified copy of upstream simulator, if you are curious, make a diff against https://github.com/openstack/oslo.messaging/blob/master/tools/simulator.py 3. Go to compute node and apply that patch http://paste.openstack.org/show/608984/ to /usr/lib/python2.7/dist-packages/oslo_messaging/_drivers/impl_rabbit.py 4. In that console set the following variable:    RABBIT_URL=rabbit://<user>:<pass>@<node_1_ip>:5673,<user>:<pass>@<node_2_ip>:5673,<user>:<pass>@<node_3_ip>:5673/    Populate user, pass and node_x_ip using the following parameters from /etc/nova/nova.conf: rabbit_hosts, rabbit_userid and rabbit_password 5. Run    python simulator.py --url $RABBIT_URL rpc-server -w 1 6. Open another console to controller, which IP goes first in RABBIT_URL list. 7. Open yet another console to the compute node and populate RABBIT_URL variable here as well. 8. Here run    python simulator.py --url $RABBIT_URL rpc-client --timeout 10 -m 2 -w 10    With that command simulator will send 2 messages (-m) with timeout set to 10 seconds (--timeout) and interval between messages 10 seconds (-w) 9. Wait for simulator to send the first message and receives response from rpc-server. It is done ones the following lines appear in console: 2017-05-05 14:05:22,129 DEBUG oslo_messaging._drivers.amqpdriver CALL msg_id: ... 2017-05-05 14:05:23,153 DEBUG oslo_messaging._drivers.amqpdriver received reply msg_id: ... 10. Once you see these lines, quickly (you have 10 seconds to do that) switch to controller console opened in step #6 and here execute     iptables -I OUTPUT 1 -p tcp --sport 5673 -j DROP     That will block AMQP traffic to that node. 11. 
Observe the following lines next: 2017-05-05 14:11:10,076 DEBUG oslo_messaging._drivers.amqpdriver CALL msg_id: ... 2017-05-05 14:12:10,087 DEBUG oslo.messaging._drivers.impl_rabbit Received recoverable error from kombu What goes next is of no importance. Important here is that it takes oslo.messaging 60 seconds to surrender first attempt and try to reconnect while the timeout is 10 seconds. oslo.messaging should react much faster to the problem. To remove iptables rule set in step #10 on controller execute iptables -D OUTPUT -p tcp --sport 5673 -j DROP That is bug is very similar to https://bugs.launchpad.net/mos/+bug/1689801 and it would be easier to verify them together. Version: 9.x Steps to reproduce: 1. Deploy a MOS env with 3 controllers and 1 compute node 2. Download that file and save it as simulator.py: http://paste.openstack.org/show/608983/    That is a modified copy of upstream simulator, if you are curious, make a diff against https://github.com/openstack/oslo.messaging/blob/master/tools/simulator.py 3. Go to compute node and apply that patch http://paste.openstack.org/show/608984/ to /usr/lib/python2.7/dist-packages/oslo_messaging/_drivers/impl_rabbit.py 4. In that console set the following variable:    RABBIT_URL=rabbit://<user>:<pass>@<node_1_ip>:5673,<user>:<pass>@<node_2_ip>:5673,<user>:<pass>@<node_3_ip>:5673/    Populate user, pass and node_x_ip using the following parameters from /etc/nova/nova.conf: rabbit_hosts, rabbit_userid and rabbit_password 5. Run    python simulator.py --url $RABBIT_URL rpc-server -w 1 6. Open another console to controller, which IP goes first in RABBIT_URL list. 7. Open yet another console to the compute node and populate RABBIT_URL variable here as well. 8. Here run    python simulator.py --url $RABBIT_URL rpc-client --timeout 10 -m 2 -w 10    With that command simulator will send 2 messages (-m) with timeout set to 10 seconds (--timeout) and interval between messages 10 seconds (-w) 9. 
Wait for simulator to send the first message and receives response from rpc-server. It is done ones the following lines appear in console: 2017-05-05 14:05:22,129 DEBUG oslo_messaging._drivers.amqpdriver CALL msg_id: ... 2017-05-05 14:05:23,153 DEBUG oslo_messaging._drivers.amqpdriver received reply msg_id: ... 10. Once you see these lines, quickly (you have 10 seconds to do that) switch to controller console opened in step #6 and here execute     iptables -I OUTPUT 1 -p tcp --sport 5673 -j DROP     That will block AMQP traffic to that node. 11. Observe the following lines next: 2017-05-05 14:11:10,076 DEBUG oslo_messaging._drivers.amqpdriver CALL msg_id: ... 2017-05-05 14:12:10,087 DEBUG oslo.messaging._drivers.impl_rabbit Received recoverable error from kombu What goes next is of no importance. Important here is that it takes oslo.messaging 60 seconds to surrender first attempt and try to reconnect while the timeout is 10 seconds. oslo.messaging should react much faster to the problem. To remove iptables rule set in step #10 on controller execute iptables -D OUTPUT -p tcp --sport 5673 -j DROP That bug is very similar to https://bugs.launchpad.net/mos/+bug/1689801 and it would be easier to verify them together.
2017-05-19 09:33:20 Dmitry Mescheryakov mos: status Confirmed In Progress
2017-06-19 12:41:31 Denis Meltsaykin mos: milestone 9.x-updates 9.2-mu-3
2017-06-29 10:41:16 Dmitry Mescheryakov description Version: 9.x Steps to reproduce: 1. Deploy a MOS env with 3 controllers and 1 compute node 2. Download that file and save it as simulator.py: http://paste.openstack.org/show/608983/    That is a modified copy of upstream simulator, if you are curious, make a diff against https://github.com/openstack/oslo.messaging/blob/master/tools/simulator.py 3. Go to compute node and apply that patch http://paste.openstack.org/show/608984/ to /usr/lib/python2.7/dist-packages/oslo_messaging/_drivers/impl_rabbit.py 4. In that console set the following variable:    RABBIT_URL=rabbit://<user>:<pass>@<node_1_ip>:5673,<user>:<pass>@<node_2_ip>:5673,<user>:<pass>@<node_3_ip>:5673/    Populate user, pass and node_x_ip using the following parameters from /etc/nova/nova.conf: rabbit_hosts, rabbit_userid and rabbit_password 5. Run    python simulator.py --url $RABBIT_URL rpc-server -w 1 6. Open another console to controller, which IP goes first in RABBIT_URL list. 7. Open yet another console to the compute node and populate RABBIT_URL variable here as well. 8. Here run    python simulator.py --url $RABBIT_URL rpc-client --timeout 10 -m 2 -w 10    With that command simulator will send 2 messages (-m) with timeout set to 10 seconds (--timeout) and interval between messages 10 seconds (-w) 9. Wait for simulator to send the first message and receives response from rpc-server. It is done ones the following lines appear in console: 2017-05-05 14:05:22,129 DEBUG oslo_messaging._drivers.amqpdriver CALL msg_id: ... 2017-05-05 14:05:23,153 DEBUG oslo_messaging._drivers.amqpdriver received reply msg_id: ... 10. Once you see these lines, quickly (you have 10 seconds to do that) switch to controller console opened in step #6 and here execute     iptables -I OUTPUT 1 -p tcp --sport 5673 -j DROP     That will block AMQP traffic to that node. 11. 
Observe the following lines next: 2017-05-05 14:11:10,076 DEBUG oslo_messaging._drivers.amqpdriver CALL msg_id: ... 2017-05-05 14:12:10,087 DEBUG oslo.messaging._drivers.impl_rabbit Received recoverable error from kombu What goes next is of no importance. Important here is that it takes oslo.messaging 60 seconds to surrender first attempt and try to reconnect while the timeout is 10 seconds. oslo.messaging should react much faster to the problem. To remove iptables rule set in step #10 on controller execute iptables -D OUTPUT -p tcp --sport 5673 -j DROP That bug is very similar to https://bugs.launchpad.net/mos/+bug/1689801 and it would be easier to verify them together. Version: 9.x Steps to reproduce: 1. Deploy a MOS env with 3 controllers and 1 compute node 2. Download that file and save it as simulator.py: http://paste.openstack.org/show/608983/    That is a modified copy of upstream simulator, if you are curious, make a diff against https://github.com/openstack/oslo.messaging/blob/master/tools/simulator.py 3. Go to compute node and apply that patch http://paste.openstack.org/show/608984/ to /usr/lib/python2.7/dist-packages/oslo_messaging/_drivers/impl_rabbit.py 4. In that console set the following variable:    RABBIT_URL=rabbit://<user>:<pass>@<node_1_ip>:5673,<user>:<pass>@<node_2_ip>:5673,<user>:<pass>@<node_3_ip>:5673/    Populate user, pass and node_x_ip using the following parameters from /etc/nova/nova.conf: rabbit_hosts, rabbit_userid and rabbit_password 5. Run    python simulator.py --url $RABBIT_URL rpc-server -w 1 6. Open another console to controller, which IP goes first in RABBIT_URL list. 7. Open yet another console to the compute node and populate RABBIT_URL variable here as well. 8. Here run    python simulator.py --url $RABBIT_URL rpc-client --timeout 10 -m 2 -w 10    With that command simulator will send 2 messages (-m) with timeout set to 10 seconds (--timeout) and interval between messages 10 seconds (-w) 9. 
Wait for simulator to send the first message and receives response from rpc-server. It is done once the following lines appear in console: 2017-05-05 14:05:22,129 DEBUG oslo_messaging._drivers.amqpdriver CALL msg_id: ... 2017-05-05 14:05:23,153 DEBUG oslo_messaging._drivers.amqpdriver received reply msg_id: ... 10. Once you see these lines, quickly (you have 10 seconds to do that) switch to controller console opened in step #6 and here execute     iptables -I OUTPUT 1 -p tcp --sport 5673 -j DROP     That will block AMQP traffic to that node. 11. Observe the following lines next: 2017-05-05 14:11:10,076 DEBUG oslo_messaging._drivers.amqpdriver CALL msg_id: ... 2017-05-05 14:12:10,087 DEBUG oslo.messaging._drivers.impl_rabbit Received recoverable error from kombu What goes next is of no importance. Important here is that it takes oslo.messaging 60 seconds to surrender first attempt and try to reconnect while the timeout is 10 seconds. oslo.messaging should react much faster to the problem. To remove iptables rule set in step #10 on controller execute iptables -D OUTPUT -p tcp --sport 5673 -j DROP That bug is very similar to https://bugs.launchpad.net/mos/+bug/1689801 and it would be easier to verify them together.
2017-07-03 13:24:00 Fuel Devops McRobotson mos: status In Progress Fix Committed
2017-10-13 09:11:57 Ilya Bumarskov mos: status Fix Committed Fix Released
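Note for verification: the reproduction steps in the description rely on two small calculations. One is assembling RABBIT_URL from the nova.conf values. The other is measuring how long oslo.messaging took between the CALL line and the "Received recoverable error from kombu" line. The Python sketch below shows both. It is only an illustration, not part of simulator.py or oslo.messaging; the helper names (build_rabbit_url, log_time, detection_delay) are hypothetical. It assumes node IPs are passed as a plain list, the credentials need no URL escaping, and log lines start with a timestamp in the format shown in the description.

```python
from datetime import datetime


def build_rabbit_url(node_ips, user, password, port=5673):
    """Assemble a RABBIT_URL in the shape used in step 4.

    Hypothetical helper: assumes user and password contain no characters
    that would need URL escaping. Port 5673 is the one from the description.
    """
    hosts = ",".join(
        "{}:{}@{}:{}".format(user, password, ip, port) for ip in node_ips
    )
    return "rabbit://" + hosts + "/"


def log_time(line):
    """Parse the leading timestamp, e.g. '2017-05-05 14:11:10,076'."""
    # The timestamp occupies the first 23 characters of the log line.
    return datetime.strptime(line[:23], "%Y-%m-%d %H:%M:%S,%f")


def detection_delay(call_line, error_line):
    """Seconds between the CALL line and the recoverable-error line."""
    return (log_time(error_line) - log_time(call_line)).total_seconds()


if __name__ == "__main__":
    # Placeholder credentials and addresses, for illustration only.
    print(build_rabbit_url(["10.0.0.1", "10.0.0.2", "10.0.0.3"], "nova", "secret"))

    # The two log lines quoted in step 11 of the description.
    call = ("2017-05-05 14:11:10,076 DEBUG "
            "oslo_messaging._drivers.amqpdriver CALL msg_id: ...")
    error = ("2017-05-05 14:12:10,087 DEBUG oslo.messaging._drivers.impl_rabbit "
             "Received recoverable error from kombu")
    delay = detection_delay(call, error)
    timeout = 10  # value passed via --timeout in step 8
    print("delay: {:.3f}s, timeout: {}s".format(delay, timeout))
```

Applied to the two lines from step 11, the delay comes out at roughly 60 seconds against the 10-second timeout, which is the gap this bug reports. When verifying a fix, the same comparison can be run on fresh simulator output.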