backport GC AIO to Luminous

Bug #1838858 reported by Jesse Williamson
This bug affects 1 person
Affects               Status        Importance  Assigned to       Milestone
Ubuntu Cloud Archive  Invalid       Undecided   Unassigned
Pike                  Won't Fix     High        James Page
Queens                Fix Released  High        James Page
ceph (Ubuntu)         Invalid       Undecided   Jesse Williamson
Bionic                Fix Released  High        James Page

Bug Description

[Impact]
In RGW deployments with large volumes of deleted objects, the garbage collector can easily fall behind on cleaning up those objects, eventually resulting in OSD devices reaching full storage capacity.

[Test Case]
Deploy Ceph + RADOS Gateway
Create millions of objects in an iterative loop
Start deleting objects in parallel at a high rate
Available storage capacity will gradually shrink over time as RGW garbage collection falls behind.

[Tricky to reproduce without significant scale of objects + deletion but proposed fix has been tested in the field.]

[Regression Potential]
Low - fix has been accepted into Ceph Luminous upstream and will form part of the next point release. Change to use AIO for GC is also in later Ceph releases.

[Original Bug Report]
SRU for backport of the GC AIO feature to Ceph Luminous.

https://github.com/ceph/ceph/pull/28784
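
To illustrate why issuing removals asynchronously helps GC keep up, here is a toy sketch in Python. It is not the Ceph implementation and does not reflect the code in the pull request above: `fake_remove`, `gc_serial`, `gc_concurrent` and the `max_inflight` limit are all hypothetical names, and the "removal" is just a sleep standing in for a round trip to an OSD.

```python
import asyncio


async def fake_remove(obj, delay=0.001):
    # Hypothetical stand-in for one object removal; the sleep models I/O latency.
    await asyncio.sleep(delay)
    return obj


async def gc_serial(objs, delay=0.001):
    # One removal at a time: total time grows linearly with the backlog.
    removed = []
    for obj in objs:
        removed.append(await fake_remove(obj, delay))
    return removed


async def gc_concurrent(objs, max_inflight=4, delay=0.001):
    # Keep up to max_inflight removals outstanding at once (assumed limit).
    sem = asyncio.Semaphore(max_inflight)
    state = {"inflight": 0, "peak": 0}

    async def worker(obj):
        async with sem:
            state["inflight"] += 1
            state["peak"] = max(state["peak"], state["inflight"])
            try:
                return await fake_remove(obj, delay)
            finally:
                state["inflight"] -= 1

    # gather() returns results in the same order as the inputs.
    removed = await asyncio.gather(*(worker(obj) for obj in objs))
    return list(removed), state["peak"]
```

Calling `asyncio.run(gc_concurrent(names, max_inflight=4))` returns the removed names together with the peak number of removals that were in flight at the same time; the serial variant never has more than one outstanding.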

James Page (james-page) wrote :

This will be part of 12.2.13 - what's the urgency around this SRU?

We can cherry pick the fix OR wait for the point release.

Changed in ceph (Ubuntu):
status: New → Invalid
Changed in ceph (Ubuntu Bionic):
status: New → Triaged
importance: Undecided → High
Changed in cloud-archive:
status: New → Invalid
James Page (james-page)
description: updated
description: updated
James Page (james-page)
Changed in ceph (Ubuntu Bionic):
assignee: nobody → James Page (james-page)
James Page (james-page)
Changed in ceph (Ubuntu Bionic):
status: Triaged → In Progress
Łukasz Zemczak (sil2100) wrote : Please test proposed package

Hello Jesse, or anyone else affected,

Accepted ceph into bionic-proposed. The package will build now and be available at https://launchpad.net/ubuntu/+source/ceph/12.2.12-0ubuntu0.18.04.3 in a few hours, and then in the -proposed repository.

Please help us by testing this new package. See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Your feedback will aid us getting this update out to other Ubuntu users.

If this package fixes the bug for you, please add a comment to this bug, mentioning the version of the package you tested and change the tag from verification-needed-bionic to verification-done-bionic. If it does not fix the bug for you, please add a comment stating that, and change the tag to verification-failed-bionic. In either case, without details of your testing we will not be able to proceed.

Further information regarding the verification process can be found at https://wiki.ubuntu.com/QATeam/PerformingSRUVerification . Thank you in advance for helping!

N.B. The updated package will be released to -updates after the bug(s) fixed by this package have been verified and the package has been in -proposed for a minimum of 7 days.

Changed in ceph (Ubuntu Bionic):
status: In Progress → Fix Committed
tags: added: verification-needed verification-needed-bionic
Jesse Williamson (chardan) wrote :

There is an additional related upstream PR that we want to include at least parts of:
    https://github.com/ceph/ceph/pull/26601/files
    https://tracker.ceph.com/issues/38454

Jesse Williamson (chardan) wrote :
Dan Hill (hillpd) wrote :

The issues and fixes discussed in comments #3 and #4 are being tracked under lp#1843085.

They have AIO GC as a dependency, but are not required for this SRU.

James Page (james-page) wrote :
Jesse Williamson (chardan) wrote :

I've verified that AIO GC is working on bionic. First, I configured radosgw with:
rgw gc obj min wait = 5
rgw gc processor period = 60
rgw gc processor max time = 60

...and then tested with the Hot Sauce S3 tool, configured to create 100 10MB objects and
then request deletion, triggering the GC code:
./hsbench -a $akey -s $skey -u http://127.0.0.1:70 -z 10MB -n 100 -t 2 -b 1 -m ipd

In another window, I watched the GC list and saw that it grew when deletion was requested, and
indeed evicted objects until the count was once again 0.
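
A minimal sketch of how the GC list could be watched from a script rather than by hand. It assumes that `radosgw-admin gc list --include-all` prints a JSON array with one element per pending GC entry; the function names, polling interval and poll limit are hypothetical choices for illustration, not part of the verification described above.

```python
import json
import subprocess
import time


def count_gc_entries(gc_list_json):
    # Assumes the command output is a JSON array, one element per GC entry.
    return len(json.loads(gc_list_json))


def watch_gc(interval_s=5.0, max_polls=120):
    # Poll the GC list; return True once it has grown and then drained to 0.
    seen_backlog = False
    for _ in range(max_polls):
        out = subprocess.run(
            ["radosgw-admin", "gc", "list", "--include-all"],
            check=True,
            capture_output=True,
            text=True,
        ).stdout
        pending = count_gc_entries(out)
        print(f"pending GC entries: {pending}")
        if pending > 0:
            seen_backlog = True
        elif seen_backlog:
            return True
        time.sleep(interval_s)
    return False
```

Running `watch_gc()` on a node with admin credentials while the deletion workload runs would print the pending count on each poll and stop once the backlog has been collected.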

tags: added: verification-done-bionic
removed: verification-needed verification-needed-bionic
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package ceph - 12.2.12-0ubuntu0.18.04.3

---------------
ceph (12.2.12-0ubuntu0.18.04.3) bionic; urgency=medium

  [ James Page ]
  * d/p/ceph-volume-wait-for-lvs.patch: Cherry pick inflight fix to
    ensure that required wal and db devices are present before
    activating OSD's (LP: #1828617).

  [ Jesse Williamson ]
  * d/p/civetweb-755-1.8-somaxconn-configurable*.patch: Backport changes
    to civetweb to allow tuning of SOMAXCONN in Ceph RADOS Gateway
    deployments (LP: #1838109).

  [ James Page ]
  * d/p/rgw-gc-use-aio.patch: Cherry pick fix to switch to using AIO for
    garbage collection of objects in the Ceph RADOS Gateway
    (LP: #1838858).

  [ Eric Desrochers ]
  * Ensure that daemons are not automatically restarted during package
    upgrades (LP: #1840347):
    - d/rules: Use "--no-restart-after-upgrade" and "--no-stop-on-upgrade"
      instead of "--no-restart-on-upgrade".
    - d/rules: Drop exclusion for ceph-[osd,mon,mds] for restarts.

 -- James Page <email address hidden> Fri, 30 Aug 2019 10:11:09 +0100

Changed in ceph (Ubuntu Bionic):
status: Fix Committed → Fix Released
Łukasz Zemczak (sil2100) wrote : Update Released

The verification of the Stable Release Update for ceph has completed successfully and the package is now being released to -updates. Subsequently, the Ubuntu Stable Release Updates Team is being unsubscribed and will not receive messages about this bug report. In the event that you encounter a regression using the package from -updates please report a new bug using ubuntu-bug and tag the bug report regression-update so we can easily find any regressions.

James Page (james-page) wrote : Please test proposed package

Hello Jesse, or anyone else affected,

Accepted ceph into queens-proposed. The package will build now and be available in the Ubuntu Cloud Archive in a few hours, and then in the -proposed repository.

Please help us by testing this new package. To enable the -proposed repository:

  sudo add-apt-repository cloud-archive:queens-proposed
  sudo apt-get update

Your feedback will aid us getting this update out to other Ubuntu users.

If this package fixes the bug for you, please add a comment to this bug, mentioning the version of the package you tested, and change the tag from verification-queens-needed to verification-queens-done. If it does not fix the bug for you, please add a comment stating that, and change the tag to verification-queens-failed. In either case, details of your testing will help us make a better decision.

Further information regarding the verification process can be found at https://wiki.ubuntu.com/QATeam/PerformingSRUVerification . Thank you in advance!

tags: added: verification-queens-needed
tags: added: verification-queens-done
removed: verification-queens-needed
James Page (james-page) wrote : Update Released

The verification of the Stable Release Update for ceph has completed successfully and the package has now been released to -updates. In the event that you encounter a regression using the package from -updates please report a new bug using ubuntu-bug and tag the bug report regression-update so we can easily find any regressions.

James Page (james-page) wrote :

This bug was fixed in the package ceph - 12.2.12-0ubuntu0.18.04.3~cloud0
---------------

 ceph (12.2.12-0ubuntu0.18.04.3~cloud0) xenial-queens; urgency=medium
 .
   * New update for the Ubuntu Cloud Archive.
 .
 ceph (12.2.12-0ubuntu0.18.04.3) bionic; urgency=medium
 .
   [ James Page ]
   * d/p/ceph-volume-wait-for-lvs.patch: Cherry pick inflight fix to
     ensure that required wal and db devices are present before
     activating OSD's (LP: #1828617).
 .
   [ Jesse Williamson ]
   * d/p/civetweb-755-1.8-somaxconn-configurable*.patch: Backport changes
     to civetweb to allow tuning of SOMAXCONN in Ceph RADOS Gateway
     deployments (LP: #1838109).
 .
   [ James Page ]
   * d/p/rgw-gc-use-aio.patch: Cherry pick fix to switch to using AIO for
     garbage collection of objects in the Ceph RADOS Gateway
     (LP: #1838858).
 .
   [ Eric Desrochers ]
   * Ensure that daemons are not automatically restarted during package
     upgrades (LP: #1840347):
     - d/rules: Use "--no-restart-after-upgrade" and "--no-stop-on-upgrade"
       instead of "--no-restart-on-upgrade".
     - d/rules: Drop exclusion for ceph-[osd,mon,mds] for restarts.
