Cinder-backup service reports as down during backup of large volumes
Affects | Status | Importance | Assigned to | Milestone |
---|---|---|---|---|
Cinder | Fix Released | Undecided | Gorka Eguileor | |
Bug Description
Description of problem:
When performing a backup of a large Cinder volume, the cinder-backup service shows as down in 'cinder service-list' even though the backup succeeds. This causes monitoring software to incorrectly report a problem.
Version-Release number of selected component (if applicable):
openstack-
How reproducible:
Every time a backup is made of a large volume that takes more than a few minutes to complete
Steps to Reproduce:
1. cinder backup-create <uuid> --name test --force
2. watch cinder service-list
Actual results:
Cinder backup service will report down until the backup is complete
Expected results:
Cinder backup service will remain up throughout the backup task
zheng yin (yin-zheng) wrote : | #1 |
Changed in openstack-vmwareapi-team: | |
assignee: | nobody → zheng yin (yin-zheng) |
affects: | openstack-vmwareapi-team → cinder |
Changed in cinder: | |
assignee: | zheng yin (yin-zheng) → nobody |
assignee: | nobody → zheng yin (yin-zheng) |
Changed in cinder: | |
status: | New → In Progress |
OpenStack Infra (hudson-openstack) wrote : Fix proposed to cinder (master) | #2 |
Fix proposed to branch: master
Review: https:/
Changed in cinder: | |
assignee: | zheng yin (yin-zheng) → Gorka Eguileor (gorka) |
OpenStack Infra (hudson-openstack) wrote : Fix merged to cinder (master) | #3 |
Reviewed: https:/
Committed: https:/
Submitter: Zuul
Branch: master
commit af0f00bc52f79d9
Author: Gorka Eguileor <email address hidden>
Date: Wed Sep 13 19:46:17 2017 +0200
Run backup compression on native thread
Backup data compression is a CPU bound operation that will not yield to
other greenthreads, so given enough simultaneous backup operations they
will lead to other threads' starvation.
This is really problematic for DB connections, since starvation will
lead to connections getting dropped with errors such as "Lost connection
to MySQL server during query".
Detailed information on why these connections get dropped can be found
in comment "[31 Aug 2007 9:21] Magnus Blåudd" on this MySQL bug [1].
These DB issues may result in backups unnecessary ending in an "error"
state.
This patch fixes this by moving the compression to a native thread so
the cooperative multitasking in Cinder Backup can continue switching
threads.
[1] https:/
Closes-Bug: #1692775
Closes-Bug: #1719580
Change-Id: I1946dc0ad9cb7a
Changed in cinder: | |
status: | In Progress → Fix Released |
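The starvation described in the commit message above can be reproduced in miniature. The following is a minimal sketch, not the patch itself: Cinder runs on eventlet greenthreads, while this example uses the standard library's asyncio as a stand-in for cooperative multitasking and a thread-pool executor as the "native thread". The names heartbeat, compress_chunks, and run are hypothetical and exist only for illustration.

```python
import asyncio
import contextlib
import os
import zlib


async def heartbeat(interval, ticks):
    """Stand-in for the periodic status report a service sends to the DB."""
    while True:
        ticks.append(asyncio.get_running_loop().time())
        await asyncio.sleep(interval)


async def compress_chunks(chunks, offload):
    """Compress every chunk, either inline or on a native worker thread."""
    loop = asyncio.get_running_loop()
    compressed = []
    for chunk in chunks:
        if offload:
            # The event loop keeps scheduling other tasks while a worker
            # thread performs the CPU-bound compression.
            compressed.append(
                await loop.run_in_executor(None, zlib.compress, chunk)
            )
        else:
            # CPU-bound call with no await: nothing else runs until it returns.
            compressed.append(zlib.compress(chunk))
    return compressed


async def run(offload, chunk_count=8, chunk_size=2 * 1024 * 1024):
    """Return (heartbeat count, compressed chunks) for one simulated backup."""
    ticks = []
    task = asyncio.create_task(heartbeat(0.005, ticks))
    await asyncio.sleep(0)  # let the heartbeat record its first tick
    chunks = [os.urandom(chunk_size) for _ in range(chunk_count)]
    compressed = await compress_chunks(chunks, offload)
    task.cancel()
    with contextlib.suppress(asyncio.CancelledError):
        await task
    return len(ticks), compressed
```

Calling `asyncio.run(run(offload=False))` records only the initial heartbeat, because the inline compression never yields; with `offload=True` the heartbeat keeps firing while the chunks are compressed. In an eventlet program the comparable tool is `eventlet.tpool.execute`; whether the patch uses exactly that call is not shown in this thread.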
OpenStack Infra (hudson-openstack) wrote : Fix proposed to cinder (stable/pike) | #4 |
Fix proposed to branch: stable/pike
Review: https:/
OpenStack Infra (hudson-openstack) wrote : Fix merged to cinder (stable/pike) | #5 |
Reviewed: https:/
Committed: https:/
Submitter: Zuul
Branch: stable/pike
commit 439f90da8e4c1cf
Author: Gorka Eguileor <email address hidden>
Date: Wed Sep 13 19:46:17 2017 +0200
Run backup compression on native thread
Backup data compression is a CPU bound operation that will not yield to
other greenthreads, so given enough simultaneous backup operations they
will lead to other threads' starvation.
This is really problematic for DB connections, since starvation will
lead to connections getting dropped with errors such as "Lost connection
to MySQL server during query".
Detailed information on why these connections get dropped can be found
in comment "[31 Aug 2007 9:21] Magnus Blåudd" on this MySQL bug [1].
These DB issues may result in backups unnecessary ending in an "error"
state.
This patch fixes this by moving the compression to a native thread so
the cooperative multitasking in Cinder Backup can continue switching
threads.
[1] https:/
Closes-Bug: #1692775
Closes-Bug: #1719580
Change-Id: I1946dc0ad9cb7a
(cherry picked from commit af0f00bc52f79d9
tags: | added: in-stable-pike |
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/cinder 12.0.0.0b1 | #6 |
This issue was fixed in the openstack/cinder 12.0.0.0b1 development milestone.
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/cinder 11.0.1 | #7 |
This issue was fixed in the openstack/cinder 11.0.1 release.
Chhavi Agarwal (chhagarw) wrote : | #8 |
I am still hitting the same issue even with the given fix:
def _prepare_
if self.compressor is None:
return 'none', data
# Execute compression in native thread so it doesn't prevent
# cooperative greenthread switching.
Started the backup of a 50 GB volume:
[root@pvc180 cinder]# cinder --service-type volume backup-list
+------
| ID | Volume ID | Status | Name | Size | Object Count | Container |
+------
| 4d2a21e4-
+------
cinder service-list shows the cinder-backup service as down:
+------
| Binary | Host | Zone | Status | State | Updated_at | Disabled Reason |
+------
| cinder-backup | pvc180.
| cinder-conductor | pvc180.
| cinder-health | pvc180.
| cinder-scheduler | pvc180.
| cinder-volume | evtds8870 | nova | enabled | up | 2017-11-
| cinder-volume | y0121v3700b | nova | enabled | up | 2017-11-
+------
Once the backup is complete and available, the cinder-backup service is back up.
[root@pvc180 cinder]# cinder --service-type volume backup-list
+------
| ID | Volume ID | Status | Name | Size | Object Count | Container |
+------
OpenStack Infra (hudson-openstack) wrote : Fix proposed to cinder (driverfixes/ocata) | #9 |
Fix proposed to branch: driverfixes/ocata
Review: https:/
OpenStack Infra (hudson-openstack) wrote : Fix proposed to cinder (driverfixes/mitaka) | #10 |
Fix proposed to branch: driverfixes/mitaka
Review: https:/
OpenStack Infra (hudson-openstack) wrote : Fix proposed to cinder (driverfixes/newton) | #11 |
Fix proposed to branch: driverfixes/newton
Review: https:/
OpenStack Infra (hudson-openstack) wrote : Fix proposed to cinder (master) | #12 |
Fix proposed to branch: master
Review: https:/
OpenStack Infra (hudson-openstack) wrote : Fix merged to cinder (driverfixes/newton) | #13 |
Reviewed: https:/
Committed: https:/
Submitter: Zuul
Branch: driverfixes/newton
commit 41754fd57f27bba
Author: Gorka Eguileor <email address hidden>
Date: Wed Sep 13 19:46:17 2017 +0200
Run backup compression on native thread
Backup data compression is a CPU bound operation that will not yield to
other greenthreads, so given enough simultaneous backup operations they
will lead to other threads' starvation.
This is really problematic for DB connections, since starvation will
lead to connections getting dropped with errors such as "Lost connection
to MySQL server during query".
Detailed information on why these connections get dropped can be found
in comment "[31 Aug 2007 9:21] Magnus Blåudd" on this MySQL bug [1].
These DB issues may result in backups unnecessary ending in an "error"
state.
This patch fixes this by moving the compression to a native thread so
the cooperative multitasking in Cinder Backup can continue switching
threads.
[1] https:/
Closes-Bug: #1692775
Closes-Bug: #1719580
Change-Id: I1946dc0ad9cb7a
(cherry picked from commit af0f00bc52f79d9
(cherry picked from commit 439f90da8e4c1cf
(cherry picked from commit b241f93267646a6
tags: | added: in-driverfixes-newton |
OpenStack Infra (hudson-openstack) wrote : Fix merged to cinder (master) | #14 |
Reviewed: https:/
Committed: https:/
Submitter: Zuul
Branch: master
commit dd556fa755adca1
Author: Chhavi Agarwal <email address hidden>
Date: Tue Nov 7 07:05:49 2017 -0500
Run backup-restore operations on native thread
During huge backup file read write operations holds the CPU which
leads to thread starvation, and cause cinder backup service to
report down, as DB operations are impacted.
Proposed changes are to run CPU and file sensitive operations like
read, write, compress, decompress on a native thread.
Change-Id: I1f1d9c0d6e3f04
Closes-Bug: #1692775
Co-Authored-By: Gorka Eguileor <email address hidden>
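The second change extends the same idea from compression to file reads, writes, and decompression. As a rough sketch under the same assumptions as before (asyncio standing in for eventlet, Python 3.9+ for `asyncio.to_thread`), every blocking step of a chunked backup and restore can be pushed to a worker thread. ThreadedFile, backup, restore, and the length-prefixed chunk layout are hypothetical illustrations, not Cinder's actual classes or on-disk format.

```python
import asyncio
import io
import zlib


class ThreadedFile:
    """Hypothetical wrapper that forwards read/write to a worker thread."""

    def __init__(self, fileobj):
        self._file = fileobj

    async def read(self, size=-1):
        return await asyncio.to_thread(self._file.read, size)

    async def write(self, data):
        return await asyncio.to_thread(self._file.write, data)


async def backup(src, dst, chunk_size=64 * 1024):
    """Read, compress, and write chunk by chunk; return the chunk count."""
    source, target = ThreadedFile(src), ThreadedFile(dst)
    count = 0
    while True:
        chunk = await source.read(chunk_size)
        if not chunk:
            break
        packed = await asyncio.to_thread(zlib.compress, chunk)
        # Prefix each chunk with its length so restore can split the stream.
        await target.write(len(packed).to_bytes(4, "big") + packed)
        count += 1
    return count


async def restore(src, dst):
    """Reverse of backup: read each chunk, decompress, and write it out."""
    source, target = ThreadedFile(src), ThreadedFile(dst)
    while True:
        header = await source.read(4)
        if not header:
            break
        packed = await source.read(int.from_bytes(header, "big"))
        await target.write(await asyncio.to_thread(zlib.decompress, packed))
```

Because every read, write, compress, and decompress call is awaited, the event loop regains control between steps and other tasks, such as a status report, are not starved. The in-memory `io.BytesIO` objects used to try this out can be swapped for real file objects opened in binary mode.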
OpenStack Infra (hudson-openstack) wrote : Fix proposed to cinder (stable/pike) | #15 |
Fix proposed to branch: stable/pike
Review: https:/
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/cinder 12.0.0.0b3 | #16 |
This issue was fixed in the openstack/cinder 12.0.0.0b3 development milestone.
OpenStack Infra (hudson-openstack) wrote : Fix merged to cinder (stable/pike) | #17 |
Reviewed: https:/
Committed: https:/
Submitter: Zuul
Branch: stable/pike
commit 60bd878c3761b69
Author: Chhavi Agarwal <email address hidden>
Date: Tue Nov 7 07:05:49 2017 -0500
Run backup-restore operations on native thread
During huge backup file read write operations holds the CPU which
leads to thread starvation, and cause cinder backup service to
report down, as DB operations are impacted.
Proposed changes are to run CPU and file sensitive operations like
read, write, compress, decompress on a native thread.
Change-Id: I1f1d9c0d6e3f04
Closes-Bug: #1692775
Co-Authored-By: Gorka Eguileor <email address hidden>
(cherry picked from commit dd556fa755adca1
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/cinder 11.1.0 | #18 |
This issue was fixed in the openstack/cinder 11.1.0 release.
OpenStack Infra (hudson-openstack) wrote : Fix merged to cinder (driverfixes/ocata) | #19 |
Reviewed: https:/
Committed: https:/
Submitter: Zuul
Branch: driverfixes/ocata
commit b241f93267646a6
Author: Gorka Eguileor <email address hidden>
Date: Wed Sep 13 19:46:17 2017 +0200
Run backup compression on native thread
Backup data compression is a CPU bound operation that will not yield to
other greenthreads, so given enough simultaneous backup operations they
will lead to other threads' starvation.
This is really problematic for DB connections, since starvation will
lead to connections getting dropped with errors such as "Lost connection
to MySQL server during query".
Detailed information on why these connections get dropped can be found
in comment "[31 Aug 2007 9:21] Magnus Blåudd" on this MySQL bug [1].
These DB issues may result in backups unnecessary ending in an "error"
state.
This patch fixes this by moving the compression to a native thread so
the cooperative multitasking in Cinder Backup can continue switching
threads.
[1] https:/
Closes-Bug: #1692775
Closes-Bug: #1719580
Change-Id: I1946dc0ad9cb7a
(cherry picked from commit af0f00bc52f79d9
(cherry picked from commit 439f90da8e4c1cf
tags: | added: in-driverfixes-ocata |
OpenStack Infra (hudson-openstack) wrote : Change abandoned on cinder (master) | #20 |
Change abandoned by Eric Harney (<email address hidden>) on branch: master
Review: https:/
OpenStack Infra (hudson-openstack) wrote : Fix proposed to cinder (stable/ocata) | #21 |
Fix proposed to branch: stable/ocata
Review: https:/
OpenStack Infra (hudson-openstack) wrote : Fix merged to cinder (driverfixes/mitaka) | #22 |
Reviewed: https:/
Committed: https:/
Submitter: Zuul
Branch: driverfixes/mitaka
commit 0137bc6b0c0e875
Author: Gorka Eguileor <email address hidden>
Date: Wed Sep 13 19:46:17 2017 +0200
Run backup compression on native thread
Backup data compression is a CPU bound operation that will not yield to
other greenthreads, so given enough simultaneous backup operations they
will lead to other threads' starvation.
This is really problematic for DB connections, since starvation will
lead to connections getting dropped with errors such as "Lost connection
to MySQL server during query".
Detailed information on why these connections get dropped can be found
in comment "[31 Aug 2007 9:21] Magnus Blåudd" on this MySQL bug [1].
These DB issues may result in backups unnecessary ending in an "error"
state.
This patch fixes this by moving the compression to a native thread so
the cooperative multitasking in Cinder Backup can continue switching
threads.
[1] https:/
Closes-Bug: #1692775
Closes-Bug: #1719580
Change-Id: I1946dc0ad9cb7a
(cherry picked from commit af0f00bc52f79d9
(cherry picked from commit 439f90da8e4c1cf
(cherry picked from commit b241f93267646a6
(cherry picked from commit 41754fd57f27bba
tags: | added: in-driverfixes-mitaka |
OpenStack Infra (hudson-openstack) wrote : Fix merged to cinder (stable/ocata) | #23 |
Reviewed: https:/
Committed: https:/
Submitter: Zuul
Branch: stable/ocata
commit 173d4d0a4686a40
Author: Gorka Eguileor <email address hidden>
Date: Wed Sep 13 19:46:17 2017 +0200
Run backup compression on native thread
Backup data compression is a CPU bound operation that will not yield to
other greenthreads, so given enough simultaneous backup operations they
will lead to other threads' starvation.
This is really problematic for DB connections, since starvation will
lead to connections getting dropped with errors such as "Lost connection
to MySQL server during query".
Detailed information on why these connections get dropped can be found
in comment "[31 Aug 2007 9:21] Magnus Blåudd" on this MySQL bug [1].
These DB issues may result in backups unnecessary ending in an "error"
state.
This patch fixes this by moving the compression to a native thread so
the cooperative multitasking in Cinder Backup can continue switching
threads.
[1] https:/
Closes-Bug: #1692775
Closes-Bug: #1719580
Change-Id: I1946dc0ad9cb7a
(cherry picked from commit af0f00bc52f79d9
(cherry picked from commit 439f90da8e4c1cf
tags: | added: in-stable-ocata |
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/cinder 10.0.7 | #24 |
This issue was fixed in the openstack/cinder 10.0.7 release.
This bug originated from https://bugzilla.redhat.com/show_bug.cgi?id=1403948