Sharded OpWQ drops suicide_grace after waiting for work

Bug #1840348 reported by Kellen Renshaw
14
This bug affects 1 person
Affects Status Importance Assigned to Milestone
Ubuntu Cloud Archive
Fix Released
Medium
Dan Hill
Queens
In Progress
Medium
Kellen Renshaw
Rocky
Won't Fix
Medium
Dan Hill
Stein
Won't Fix
Medium
Dan Hill
Train
Fix Released
Medium
Dan Hill
ceph (Ubuntu)
Fix Released
Medium
Dan Hill
Bionic
Fix Committed
Medium
Kellen Renshaw
Eoan
Won't Fix
Medium
Dan Hill
Focal
Fix Released
Medium
Dan Hill

Bug Description

[Impact]
The Sharded OpWQ will opportunistically wait for more work when processing an empty queue. While waiting, the heartbeat timeout and suicide_grace values are modified. The `threadpool_default_timeout` grace is left applied and suicide_grace is disabled.

After finding work, the original work queue grace/suicide_grace values are not re-applied. This can result in hung operations that do not trigger an OSD suicide recovery.

The missing suicide recovery was observed on Luminous 12.2.11. The environment was consistently hitting a known authentication race condition (issue#37778 [0]) due to repeated OSD service restarts on a node exhibiting MCEs from a faulty DIMM.

The auth race condition would stall pg operations. In some cases, the hung ops would persist for hours without suicide recovery.

[Test Case]
I have not identified a reliable reproducer. Currently testing the fix by exercising I/O.

Recommend letting this bake upstream before considering a back-port.

[Regression Potential]
This fix improves suicide_grace coverage of the Sharded OpWq.

This change is made in a critical code path that drives client I/O. An OSD suicide will trigger a service restart and repeated restarts (flapping) will adversely impact cluster performance.

The fix mitigates risk by keeping the applied suicide_grace value consistent with the value applied before entering `OSD::ShardedOpWQ::_process()`. The fix is also restricted to the empty queue edge-case that drops the suicide_grace timeout. The suicide_grace value is only re-applied when work is found after waiting on an empty queue.

- In-Progress -
Opened upstream tracker for issue#45076 [1] and fix pr#34575 [2]

[0] https://tracker.ceph.com/issues/37778
[1] https://tracker.ceph.com/issues/45076
[2] https://github.com/ceph/ceph/pull/34575

Eric Desrochers (slashd)
tags: added: sts
Dan Hill (hillpd)
Changed in ceph (Ubuntu):
status: New → Triaged
assignee: nobody → Dan Hill (hillpd)
importance: Undecided → Medium
Revision history for this message
James Page (james-page) wrote :

@hillpd any update on this bug?

Dan Hill (hillpd)
summary: - Ceph 12.2.11-0ubuntu0.18.04.2 doesn't honor suicide_grace
+ Sharded OpWQ drops suicide_grace after waiting for work
Revision history for this message
Dan Hill (hillpd) wrote :

There are two edge-cases in 12.2.11 where a worker thread's suicide_grace value gets dropped:
[0] In the Threadpool context, Threadpool:worker() drops suicide_grace while waiting on an empty work queue.
[1] In the ShardedThreadpool context, OSD::ShardedOpWQ::_process() drops suicide_grace while opportunistically waiting for more work (to prevent additional lock contention).

The Threadpool context always re-assigns suicide_grace before driving any work. The ShardedThreadpool context does not follow this pattern. After delaying to find additional work, the default sharded work queue timeouts are not re-applied.

This oversight exists in Luminous on-wards. Mimic, and Nautilus have each reworked the ShardedOpWQ code path, but did not address the problem.

[0] https://github.com/ceph/ceph/blob/v12.2.11/src/common/WorkQueue.cc#L137
[1] https://github.com/ceph/ceph/blob/v12.2.11/src/osd/OSD.cc#L10476

description: updated
description: updated
Revision history for this message
Dan Hill (hillpd) wrote :

Attaching the proposed fix for 12.2.13 that I am testing.

Dan Hill (hillpd)
Changed in ceph (Ubuntu Bionic):
status: New → Confirmed
assignee: nobody → Dan Hill (hillpd)
Changed in ceph (Ubuntu Eoan):
assignee: nobody → Dan Hill (hillpd)
Changed in ceph (Ubuntu Bionic):
importance: Undecided → Medium
Changed in ceph (Ubuntu Eoan):
importance: Undecided → Medium
status: New → Confirmed
Changed in ceph (Ubuntu Focal):
status: Triaged → Confirmed
description: updated
Revision history for this message
Ubuntu Foundations Team Bug Bot (crichton) wrote :

The attachment "ceph_12.2.13-0ubuntu0.18.04.1+20200409sf00238701b1.debdiff" seems to be a debdiff. The ubuntu-sponsors team has been subscribed to the bug report so that they can review and hopefully sponsor the debdiff. If the attachment isn't a patch, please remove the "patch" flag from the attachment, remove the "patch" tag, and if you are member of the ~ubuntu-sponsors, unsubscribe the team.

[This is an automated message performed by a Launchpad user owned by ~brian-murray, for any issue please contact him.]

tags: added: patch
Dan Hill (hillpd)
description: updated
Dan Hill (hillpd)
description: updated
Revision history for this message
Brian Murray (brian-murray) wrote :

The Eoan Ermine has reached end of life, so this bug will not be fixed for that release

Changed in ceph (Ubuntu Eoan):
status: Confirmed → Won't Fix
Dan Hill (hillpd)
Changed in ceph (Ubuntu Focal):
status: Confirmed → Fix Released
Revision history for this message
Dan Hill (hillpd) wrote :

The SRUs for 15.2.5 [0] and 14.2.11 [1] have been released and contain a fix for this issue.

We are currently evaluating the need for a fix in Luminous (Bionic/Queens) and Mimic (Stein).

[0] https://bugs.launchpad.net/ubuntu/+source/ceph/+bug/1898200
[1] https://bugs.launchpad.net/cloud-archive/+bug/1891077

Mathew Hodson (mhodson)
Changed in ceph (Ubuntu):
status: Confirmed → Fix Released
Dan Hill (hillpd)
Changed in cloud-archive:
assignee: nobody → Dan Hill (hillpd)
Dan Hill (hillpd)
Changed in ceph (Ubuntu Bionic):
status: Confirmed → In Progress
Changed in cloud-archive:
status: New → Fix Released
Dan Hill (hillpd)
Changed in cloud-archive:
importance: Undecided → Medium
Revision history for this message
Dan Bungert (dbungert) wrote :

@hillpd - should this still be in the sponsor queue? Are we still trying to SRU this to Bionic?

Revision history for this message
Kellen Renshaw (krenshaw) wrote :

@dbungert - Taking over this one for hillpd, we would like to pursue this backport to Bionic/Queens (dropping Stein).

Originally held back on it since there isn't a viable reproducer. Now, since the change has been in upstream since May 1, 2020, there is confidence that it doesn't introduce a regression.

Changed in ceph (Ubuntu Bionic):
assignee: Dan Hill (hillpd) → Kellen Renshaw (krenshaw)
Revision history for this message
Dan Streetman (ddstreet) wrote :

unsubscribing ubuntu sponsors team, as sponsorship for this is being requested from the openstack team

Revision history for this message
Brian Murray (brian-murray) wrote : Please test proposed package

Hello Kellen, or anyone else affected,

Accepted ceph into bionic-proposed. The package will build now and be available at https://launchpad.net/ubuntu/+source/ceph/12.2.13-0ubuntu0.18.04.9 in a few hours, and then in the -proposed repository.

Please help us by testing this new package. See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Your feedback will aid us getting this update out to other Ubuntu users.

If this package fixes the bug for you, please add a comment to this bug, mentioning the version of the package you tested, what testing has been performed on the package and change the tag from verification-needed-bionic to verification-done-bionic. If it does not fix the bug for you, please add a comment stating that, and change the tag to verification-failed-bionic. In either case, without details of your testing we will not be able to proceed.

Further information regarding the verification process can be found at https://wiki.ubuntu.com/QATeam/PerformingSRUVerification . Thank you in advance for helping!

N.B. The updated package will be released to -updates after the bug(s) fixed by this package have been verified and the package has been in -proposed for a minimum of 7 days.

Changed in ceph (Ubuntu Bionic):
status: In Progress → Fix Committed
tags: added: verification-needed verification-needed-bionic
Revision history for this message
Kellen Renshaw (krenshaw) wrote :

@dbungert Apologies, got sidetracked and this version has been superceded. Can we rebase this debdiff on the 12.2.13-0ubuntu0.18.04.10 version?

To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.