Activity log for bug #1925035

Date Who What changed Old value New value Message
2021-04-19 15:03:05 Peter Matulis bug added bug
2021-04-19 15:13:28 Peter Matulis description

  Old value:

    The failover to a secondary site for Cinder volume replication (volumes backed by two Ceph clusters) takes too long (and can even fail) due to default Cinder configuration affecting timeouts and retries. I don't see why a deliberate choice should be subject to long timeouts/retries.

    $ juju run-action --wait cinder-ceph-a pre-failover
    $ cinder failover-host cinder@cinder-ceph-a
    $ juju run-action --wait cinder-ceph-a post-failover

    The workaround is to manually change the defaults pre-failover and then revert them post-failover:

    $ juju ssh cinder-ceph-a/0
    > sudo apt install -y crudini
    > sudo crudini --set /etc/cinder/cinder.conf cinder-ceph-a rados_connect_timeout 1
    > sudo crudini --set /etc/cinder/cinder.conf cinder-ceph-a rados_connection_retries 1
    > sudo crudini --set /etc/cinder/cinder.conf cinder-ceph-a rados_connection_interval 0
    > sudo crudini --set /etc/cinder/cinder.conf cinder-ceph-a replication_connect_timeout 1
    > sudo systemctl restart cinder-volume
    > exit

    $ cinder failover-host cinder@cinder-ceph-a

    $ juju ssh cinder-ceph-a/0
    > sudo crudini --del /etc/cinder/cinder.conf cinder-ceph-a rados_connect_timeout
    > sudo crudini --del /etc/cinder/cinder.conf cinder-ceph-a rados_connection_retries
    > sudo crudini --del /etc/cinder/cinder.conf cinder-ceph-a rados_connection_interval
    > sudo crudini --del /etc/cinder/cinder.conf cinder-ceph-a replication_connect_timeout
    > sudo systemctl restart cinder-volume
    > exit

    An alternative would be to implement a --force option in the cinder API client.

  New value:

    The failover to a secondary site for Cinder volume replication (volumes backed by two Ceph clusters) takes too long (and can even fail) due to default Cinder configuration affecting timeouts and retries. I don't see why a deliberate choice should be subject to long timeouts/retries.

    The workaround is to manually change the defaults pre-failover and then revert them post-failover:

    $ juju ssh cinder-ceph-a/0
    > sudo apt install -y crudini
    > sudo crudini --set /etc/cinder/cinder.conf cinder-ceph-a rados_connect_timeout 1
    > sudo crudini --set /etc/cinder/cinder.conf cinder-ceph-a rados_connection_retries 1
    > sudo crudini --set /etc/cinder/cinder.conf cinder-ceph-a rados_connection_interval 0
    > sudo crudini --set /etc/cinder/cinder.conf cinder-ceph-a replication_connect_timeout 1
    > sudo systemctl restart cinder-volume
    > exit

    $ cinder failover-host cinder@cinder-ceph-a

    $ juju ssh cinder-ceph-a/0
    > sudo crudini --del /etc/cinder/cinder.conf cinder-ceph-a rados_connect_timeout
    > sudo crudini --del /etc/cinder/cinder.conf cinder-ceph-a rados_connection_retries
    > sudo crudini --del /etc/cinder/cinder.conf cinder-ceph-a rados_connection_interval
    > sudo crudini --del /etc/cinder/cinder.conf cinder-ceph-a replication_connect_timeout
    > sudo systemctl restart cinder-volume
    > exit

    Is it possible to have actions for this?:

    $ juju run-action --wait cinder-ceph-a pre-failover
    $ cinder failover-host cinder@cinder-ceph-a
    $ juju run-action --wait cinder-ceph-a post-failover

    An alternative would be to implement a --force option in the cinder API client.
2021-04-19 15:16:11 Peter Matulis description

  Old value: (identical to the new value of the previous entry)

  New value:

    The failover to a secondary site for Cinder volume replication (volumes backed by two Ceph clusters) takes too long (and can even fail) due to default Cinder configuration affecting timeouts and retries. In the situation where one site (cluster) is known to be down, I don't see why a deliberate choice should be subject to long timeouts/retries.

    The workaround is to manually change the defaults pre-failover and then revert them post-failover:

    $ juju ssh cinder-ceph-a/0
    > sudo apt install -y crudini
    > sudo crudini --set /etc/cinder/cinder.conf cinder-ceph-a rados_connect_timeout 1
    > sudo crudini --set /etc/cinder/cinder.conf cinder-ceph-a rados_connection_retries 1
    > sudo crudini --set /etc/cinder/cinder.conf cinder-ceph-a rados_connection_interval 0
    > sudo crudini --set /etc/cinder/cinder.conf cinder-ceph-a replication_connect_timeout 1
    > sudo systemctl restart cinder-volume
    > exit

    $ cinder failover-host cinder@cinder-ceph-a

    $ juju ssh cinder-ceph-a/0
    > sudo crudini --del /etc/cinder/cinder.conf cinder-ceph-a rados_connect_timeout
    > sudo crudini --del /etc/cinder/cinder.conf cinder-ceph-a rados_connection_retries
    > sudo crudini --del /etc/cinder/cinder.conf cinder-ceph-a rados_connection_interval
    > sudo crudini --del /etc/cinder/cinder.conf cinder-ceph-a replication_connect_timeout
    > sudo systemctl restart cinder-volume
    > exit

    Is it possible to have actions for this?:

    $ juju run-action --wait cinder-ceph-a pre-failover
    $ cinder failover-host cinder@cinder-ceph-a
    $ juju run-action --wait cinder-ceph-a post-failover

    An alternative would be to implement a --force option in the cinder API client.
2021-05-05 22:06:03 Peter Matulis tags openstack-advocacy
2021-05-06 14:33:24 Alex Kavanagh charm-cinder-ceph: importance Undecided Wishlist
2021-05-06 14:33:24 Alex Kavanagh charm-cinder-ceph: status New Triaged
2021-05-06 14:33:31 Alex Kavanagh tags openstack-advocacy onboarding openstack-advocacy
2021-05-12 15:30:32 Corey Bryant tags onboarding openstack-advocacy good-first-bug onboarding openstack-advocacy
2021-05-12 15:31:45 Corey Bryant tags good-first-bug onboarding openstack-advocacy good-first-bug openstack-advocacy
2022-08-01 08:50:52 Muhammad Ahmad charm-cinder-ceph: assignee Muhammad Ahmad (ahmadfsbd)
2022-08-01 09:13:33 Muhammad Ahmad charm-cinder-ceph: assignee Muhammad Ahmad (ahmadfsbd)
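The description above asks for pre-failover and post-failover actions in the cinder-ceph charm that wrap the manual crudini workaround. No such actions exist in the charm today; the following is only a minimal sketch of what a pre-failover action could run on the unit, with the backend section name (cinder-ceph-a), the config keys, and the values taken directly from the workaround quoted in the description. It assumes crudini is installed and the script runs as root.

    #!/bin/bash
    # Hypothetical sketch of a "pre-failover" action body (not part of the
    # cinder-ceph charm). Applies the crudini workaround from the bug report.
    set -euo pipefail

    CONF=/etc/cinder/cinder.conf
    SECTION=cinder-ceph-a   # Cinder backend section named after the charm application

    # Shorten RADOS connection timeouts/retries so a failover away from a
    # known-dead cluster is not held up by the long defaults.
    crudini --set "$CONF" "$SECTION" rados_connect_timeout 1
    crudini --set "$CONF" "$SECTION" rados_connection_retries 1
    crudini --set "$CONF" "$SECTION" rados_connection_interval 0
    crudini --set "$CONF" "$SECTION" replication_connect_timeout 1

    # Pick up the new settings before the operator runs "cinder failover-host".
    systemctl restart cinder-volume

A matching post-failover action would run crudini --del on the same four keys and restart cinder-volume again, restoring the default timeouts once the failover has completed.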