[cinder] Timeout while backing up large amount of volumes at same time

Bug #1793509 reported by Vladimir Khlyunev
This bug affects 1 person
Affects: Mirantis OpenStack
Status: Won't Fix
Importance: High
Assigned to: Vladimir Khlyunev
Milestone: 9.x-updates

Bug Description

https://review.openstack.org/#/c/547128

Cinder consumes a lot of RAM during volume backup, which can lead to various errors (e.g. MySQL "lost connection to the server"). We need to backport the fix linked above.

Revision history for this message
Fuel Devops McRobotson (fuel-devops-robot) wrote : Fix proposed to openstack/cinder (9.0/mitaka)

Fix proposed to branch: 9.0/mitaka
Change author: Chaynika Saikia <email address hidden>
Review: https://review.fuel-infra.org/39302

Changed in mos:
status: New → In Progress
Changed in mos:
milestone: 9.x-updates → 9.2-mu-8
Revision history for this message
Fuel Devops McRobotson (fuel-devops-robot) wrote : Fix merged to openstack/cinder (9.0/mitaka)

Reviewed: https://review.fuel-infra.org/39302
Submitter: Pkgs Jenkins <email address hidden>
Branch: 9.0/mitaka

Commit: 6d5e6dd9d6cf92136ba63ec6971349435df92b96
Author: Chaynika Saikia <email address hidden>
Date: Fri Sep 21 09:01:08 2018

Fix backup/restore error for ceph rbd backend

If a large volume is backed up or a lot of concurrent backups happen,
the cinder-backup service goes offline since a lot of these operations
have calls to the C code which are not run on native threads.

When many concurrent backup create/restore operations run as
greenthreads, a call into the C code is not monkeypatched by eventlet,
so there is no context switch to other greenthreads until the call to
the C library completes. As a result, some of the backup
create/restore operations might go to error state.

The objects on which read/write operations or C function
calls are done are wrapped in Proxy objects so that they run as native
threads.

Change-Id: I75058c36085eb1a8adb26a95297e3a2039745a2c
Closes-Bug: #1793509
(cherry picked from commit f1d681875cd3b28860d7daab53695bad618900a3)
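
The blocking behaviour described in the commit message can be illustrated with a small, stdlib-only sketch. It does not reproduce the actual Cinder change: asyncio coroutines stand in for eventlet greenthreads, `FakeRBDImage` is a hypothetical stand-in for an object whose methods block inside a C library, and `ThreadProxy` is an illustrative helper that forwards each method call to a native worker thread. eventlet provides `eventlet.tpool.Proxy` for a similar purpose; the commit message does not name the exact class it uses, so treat that mapping as an assumption.

```python
import asyncio
import functools
import threading
import time


class FakeRBDImage:
    """Hypothetical stand-in for an object whose methods block in C code."""

    def read(self, offset, length):
        time.sleep(0.05)  # simulates a blocking C library call
        return threading.get_ident(), b"\0" * length


class ThreadProxy:
    """Illustrative proxy: runs the wrapped object's methods on worker threads."""

    def __init__(self, target):
        self._target = target

    def __getattr__(self, name):
        attr = getattr(self._target, name)
        if not callable(attr):
            return attr

        async def call(*args, **kwargs):
            loop = asyncio.get_running_loop()
            bound = functools.partial(attr, *args, **kwargs)
            # Awaiting the executor future yields control, so other
            # coroutines keep running while the blocking call is in progress.
            return await loop.run_in_executor(None, bound)

        return call


async def heartbeat(ticks):
    """Records a timestamp each time the scheduler gives this task control."""
    while True:
        ticks.append(time.monotonic())
        await asyncio.sleep(0.01)


async def demo():
    ticks = []
    beat = asyncio.create_task(heartbeat(ticks))
    image = ThreadProxy(FakeRBDImage())
    worker_thread, data = await image.read(0, 4096)
    beat.cancel()
    return worker_thread, len(data), len(ticks)


worker_thread, size, tick_count = asyncio.run(demo())
```

Calling `FakeRBDImage().read()` directly inside a coroutine would stall the event loop for the whole duration of the call, which is the analogue of the greenthread starvation described above. Going through the proxy lets the heartbeat task keep ticking while the read is in flight.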

Changed in mos:
status: In Progress → Fix Committed
Revision history for this message
Dmitry (dtsapikov) wrote :

Verified on 9.2+mu8

Changed in mos:
status: Fix Committed → Fix Released
Revision history for this message
Denis Meltsaykin (dmeltsaykin) wrote :

Tempest tests started to fail, so the fix has been reverted: https://review.fuel-infra.org/#/c/39335/

Changed in mos:
status: Fix Released → Confirmed
milestone: 9.2-mu-8 → 9.2-mu-9
Revision history for this message
Denis Meltsaykin (dmeltsaykin) wrote :

The issue cannot be fixed as is; the workaround is to run more cinder-backup instances.

Changed in mos:
status: Confirmed → Won't Fix
milestone: 9.2-mu-9 → 9.x-updates