volume backup tests timing out since 1/23

Bug #1745168 reported by Matt Riedemann
8
This bug affects 1 person
Affects Status Importance Assigned to Milestone
Cinder
Fix Released
High
Gorka Eguileor
Matt Riedemann (mriedem)
Changed in cinder:
status: New → Confirmed
importance: Undecided → High
Jay Bryant (jsbryant)
Changed in cinder:
assignee: nobody → Gorka Eguileor (gorka)
Revision history for this message
Matt Riedemann (mriedem) wrote :

(9:48:27 AM) geguileo: Jan 24 15:04:16.031978 ubuntu-xenial-citycloud-sto2-0002119447 cinder-backup[11788]: Traceback (most recent call last):
(9:48:28 AM) geguileo: Jan 24 15:04:16.032126 ubuntu-xenial-citycloud-sto2-0002119447 cinder-backup[11788]: File "/usr/local/lib/python2.7/dist-packages/eventlet/hubs/hub.py", line 457, in fire_timers
(9:48:30 AM) geguileo: Jan 24 15:04:16.032267 ubuntu-xenial-citycloud-sto2-0002119447 cinder-backup[11788]: timer()
(9:48:32 AM) geguileo: Jan 24 15:04:16.032404 ubuntu-xenial-citycloud-sto2-0002119447 cinder-backup[11788]: File "/usr/local/lib/python2.7/dist-packages/eventlet/hubs/timer.py", line 58, in __call__
(9:48:35 AM) geguileo: Jan 24 15:04:16.032559 ubuntu-xenial-citycloud-sto2-0002119447 cinder-backup[11788]: cb(*args, **kw)
(9:48:36 AM) geguileo: Jan 24 15:04:16.032703 ubuntu-xenial-citycloud-sto2-0002119447 cinder-backup[11788]: File "/usr/local/lib/python2.7/dist-packages/eventlet/semaphore.py", line 145, in _do_acquire
(9:48:38 AM) geguileo: Jan 24 15:04:16.032840 ubuntu-xenial-citycloud-sto2-0002119447 cinder-backup[11788]: waiter.switch()
(9:48:40 AM) geguileo: Jan 24 15:04:16.032990 ubuntu-xenial-citycloud-sto2-0002119447 cinder-backup[11788]: error: cannot switch to a different thread
(9:48:44 AM) geguileo: jungleboyj: ^ that's happening because a native thread withing a greenthread is logging
(9:48:56 AM) geguileo: jungleboyj: I'll try to figure out now where is this happening
(9:49:22 AM) openstackgerrit: Aseel Awwad proposed openstack/cinder master: Add policy check for complete attachment API action https://review.openstack.org/536119
(9:49:23 AM) mriedem: geguileo: nice spot, because this merged the same day https://github.com/openstack/cinder/commit/1a569e15c35d772a8ca236d75cd1840a415983d3
(9:49:30 AM) mriedem: geguileo: something is logging where it shouldn't be

Revision history for this message
Jay Bryant (jsbryant) wrote :

Gorka found that a native thread within a greenthread is logging. Shouldn't be doing that.

Jan 24 15:04:16.031978 ubuntu-xenial-citycloud-sto2-0002119447 cinder-backup[11788]: Traceback (most recent call last):
9:48 AM Jan 24 15:04:16.032126 ubuntu-xenial-citycloud-sto2-0002119447 cinder-backup[11788]: File "/usr/local/lib/python2.7/dist-packages/eventlet/hubs/hub.py", line 457, in fire_timers
9:48 AM Jan 24 15:04:16.032267 ubuntu-xenial-citycloud-sto2-0002119447 cinder-backup[11788]: timer()
9:48 AM Jan 24 15:04:16.032404 ubuntu-xenial-citycloud-sto2-0002119447 cinder-backup[11788]: File "/usr/local/lib/python2.7/dist-packages/eventlet/hubs/timer.py", line 58, in __call__
9:48 AM Jan 24 15:04:16.032559 ubuntu-xenial-citycloud-sto2-0002119447 cinder-backup[11788]: cb(*args, **kw)
9:48 AM Jan 24 15:04:16.032703 ubuntu-xenial-citycloud-sto2-0002119447 cinder-backup[11788]: File "/usr/local/lib/python2.7/dist-packages/eventlet/semaphore.py", line 145, in _do_acquire
9:48 AM Jan 24 15:04:16.032840 ubuntu-xenial-citycloud-sto2-0002119447 cinder-backup[11788]: waiter.switch()
9:48 AM Jan 24 15:04:16.032990 ubuntu-xenial-citycloud-sto2-0002119447 cinder-backup[11788]: error: cannot switch to a different thread

This change merged the same day: https://github.com/openstack/cinder/commit/1a569e15c35d772a8ca236d75cd1840a415983d3

Gorka is investigating.

Changed in cinder:
status: Confirmed → In Progress
Revision history for this message
Matt Riedemann (mriedem) wrote :
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to cinder (master)

Reviewed: https://review.openstack.org/537437
Committed: https://git.openstack.org/cgit/openstack/cinder/commit/?id=c6cb84bd63135d62ccd7ed9b23b245c41e9f105c
Submitter: Zuul
Branch: master

commit c6cb84bd63135d62ccd7ed9b23b245c41e9f105c
Author: Gorka Eguileor <email address hidden>
Date: Wed Jan 24 17:00:19 2018 +0100

    Remove logging on Swift backup obj writer

    Since change I1f1d9c0d6e3f04f1ecd5ef7c5d813005ee116409 we are running
    parts of the backups on native threads, which due to an eventlet bug [1]
    have bad interactions with greenthreads, so we have to avoid any logging
    when executing code in a native thread.

    This patch removes the MD5 logging on the SwiftObjectWriter close
    method and adds comments and docstring referring to this limitation.

    [1] https://github.com/eventlet/eventlet/issues/432

    Closes-Bug: #1745168
    Change-Id: I0857cecd7d8ab0ee7e3e9bd6e15f4987ede4d653

Changed in cinder:
status: In Progress → Fix Released
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to cinder (stable/pike)

Fix proposed to branch: stable/pike
Review: https://review.openstack.org/537840

Revision history for this message
Matt Riedemann (mriedem) wrote :
Revision history for this message
Matt Riedemann (mriedem) wrote :

This is still failing because swiftclient is logging:

http://logs.openstack.org/05/266605/33/check/neutron-tempest-dvr/8a5dd5f/logs/screen-c-bak.txt.gz#_Jan_25_19_26_01_896832

Jan 25 19:26:01.790939 ubuntu-xenial-rax-ord-0002156812 cinder-backup[661]: DEBUG cinder.backup.chunkeddriver [None req-e4242d75-d81d-4da0-ac09-61b6a9cd40c7 tempest-VolumesBackupsV39Test-401577712 None] Compressed 25165824 bytes of data to 24470 bytes using zlib. {{(pid=661) _prepare_output_data /opt/stack/new/cinder/cinder/backup/chunkeddriver.py:398}}
Jan 25 19:26:01.791312 ubuntu-xenial-rax-ord-0002156812 cinder-backup[661]: DEBUG cinder.backup.chunkeddriver [None req-e4242d75-d81d-4da0-ac09-61b6a9cd40c7 tempest-VolumesBackupsV39Test-401577712 None] About to put_object {{(pid=661) _backup_chunk /opt/stack/new/cinder/cinder/backup/chunkeddriver.py:359}}
Jan 25 19:26:01.896832 ubuntu-xenial-rax-ord-0002156812 cinder-backup[661]: DEBUG swiftclient [-] REQ: curl -i https://162.242.166.135:8080/v1/AUTH_373e3faa49bb451f96f52e66255b9bc3/volumebackups/volume_b2a53f19-15f8-4856-b10c-6356f9108b5c/20180125192508/az_nova_backup_98628883-0fad-4937-b084-d1c8a4878077-00021 -X PUT -H "Content-Length: 24470" -H "X-Auth-Token: gAAAAABaai8He4Rw..." {{(pid=661) http_log /usr/local/lib/python2.7/dist-packages/swiftclient/client.py:165}}
Jan 25 19:26:01.897489 ubuntu-xenial-rax-ord-0002156812 cinder-backup[661]: Traceback (most recent call last):
Jan 25 19:26:01.897808 ubuntu-xenial-rax-ord-0002156812 cinder-backup[661]: File "/usr/local/lib/python2.7/dist-packages/eventlet/hubs/hub.py", line 457, in fire_timers
Jan 25 19:26:01.898109 ubuntu-xenial-rax-ord-0002156812 cinder-backup[661]: timer()
Jan 25 19:26:01.898391 ubuntu-xenial-rax-ord-0002156812 cinder-backup[661]: File "/usr/local/lib/python2.7/dist-packages/eventlet/hubs/timer.py", line 58, in __call__
Jan 25 19:26:01.898670 ubuntu-xenial-rax-ord-0002156812 cinder-backup[661]: cb(*args, **kw)
Jan 25 19:26:01.898958 ubuntu-xenial-rax-ord-0002156812 cinder-backup[661]: File "/usr/local/lib/python2.7/dist-packages/eventlet/semaphore.py", line 145, in _do_acquire
Jan 25 19:26:01.899237 ubuntu-xenial-rax-ord-0002156812 cinder-backup[661]: waiter.switch()
Jan 25 19:26:01.899542 ubuntu-xenial-rax-ord-0002156812 cinder-backup[661]: error: cannot switch to a different thread

That's happening within the SwiftObjectWriter.close() method.

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to cinder (master)

Fix proposed to branch: master
Review: https://review.openstack.org/538027

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to cinder (stable/pike)

Fix proposed to branch: stable/pike
Review: https://review.openstack.org/538230

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to cinder (master)

Reviewed: https://review.openstack.org/538027
Committed: https://git.openstack.org/cgit/openstack/cinder/commit/?id=e4989c103ff23cbf937faa8cc12c39e9720f9fd7
Submitter: Zuul
Branch: master

commit e4989c103ff23cbf937faa8cc12c39e9720f9fd7
Author: Matt Riedemann <email address hidden>
Date: Thu Jan 25 15:50:34 2018 -0500

    backup: set default swiftclient log level to WARN

    Similar to change I0857cecd7d8ab0ee7e3e9bd6e15f4987ede4d653, the
    swift backup driver cannot be logging stuff while in a thread during
    read and write operations. swiftclient logs requests and responses
    at DEBUG level, which can cause a thread switch and break the backup
    operation.

    This sets a default log level of WARN for swiftclient for the
    cinder-backup service.

    Change-Id: I67ac11276715290839c39fc1d0e333c61a38c461
    Closes-Bug: #1745168

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/cinder 12.0.0.0b3

This issue was fixed in the openstack/cinder 12.0.0.0b3 development milestone.

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to cinder (stable/pike)

Reviewed: https://review.openstack.org/537840
Committed: https://git.openstack.org/cgit/openstack/cinder/commit/?id=a1084c237a8b025353abaf586c05672840b6ec6d
Submitter: Zuul
Branch: stable/pike

commit a1084c237a8b025353abaf586c05672840b6ec6d
Author: Gorka Eguileor <email address hidden>
Date: Wed Jan 24 17:00:19 2018 +0100

    Remove logging on Swift backup obj writer

    Since change I1f1d9c0d6e3f04f1ecd5ef7c5d813005ee116409 we are running
    parts of the backups on native threads, which due to an eventlet bug [1]
    have bad interactions with greenthreads, so we have to avoid any logging
    when executing code in a native thread.

    This patch removes the MD5 logging on the SwiftObjectWriter close
    method and adds comments and docstring referring to this limitation.

    [1] https://github.com/eventlet/eventlet/issues/432

    Closes-Bug: #1745168
    Change-Id: I0857cecd7d8ab0ee7e3e9bd6e15f4987ede4d653
    (cherry picked from commit c6cb84bd63135d62ccd7ed9b23b245c41e9f105c)

tags: added: in-stable-pike
Revision history for this message
OpenStack Infra (hudson-openstack) wrote :

Reviewed: https://review.openstack.org/538230
Committed: https://git.openstack.org/cgit/openstack/cinder/commit/?id=112ee583e0c4b6364e99fe15c00d82ec50ca0d6d
Submitter: Zuul
Branch: stable/pike

commit 112ee583e0c4b6364e99fe15c00d82ec50ca0d6d
Author: Matt Riedemann <email address hidden>
Date: Thu Jan 25 15:50:34 2018 -0500

    backup: set default swiftclient log level to WARN

    Similar to change I0857cecd7d8ab0ee7e3e9bd6e15f4987ede4d653, the
    swift backup driver cannot be logging stuff while in a thread during
    read and write operations. swiftclient logs requests and responses
    at DEBUG level, which can cause a thread switch and break the backup
    operation.

    This sets a default log level of WARN for swiftclient for the
    cinder-backup service.

    Change-Id: I67ac11276715290839c39fc1d0e333c61a38c461
    Closes-Bug: #1745168
    (cherry picked from commit e4989c103ff23cbf937faa8cc12c39e9720f9fd7)

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/cinder 11.1.0

This issue was fixed in the openstack/cinder 11.1.0 release.

Revision history for this message
Matt Riedemann (mriedem) wrote :

This is showing up again:

http://status.openstack.org/elastic-recheck/#1745168

Regression?

Revision history for this message
Matt Riedemann (mriedem) wrote :

Per comment 17, the other hits are bug 1813217.

To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.