[SRU] ceph-osd takes all memory at boot
| Affects | Status | Importance | Assigned to | Milestone |
| --- | --- | --- | --- | --- |
| Ubuntu Cloud Archive | Invalid | Undecided | Unassigned | |
| Queens | Invalid | Undecided | Unassigned | |
| Ussuri | Fix Released | High | Unassigned | |
| Wallaby | Invalid | Undecided | Unassigned | |
| Xena | Invalid | Undecided | Unassigned | |
| Yoga | Invalid | Undecided | Unassigned | |
| ceph (Ubuntu) | Fix Released | Undecided | Unassigned | |
| Bionic | Invalid | Undecided | Unassigned | |
| Focal | Fix Released | High | nikhil kshirsagar | |
| Jammy | Invalid | Undecided | Unassigned | |
| Kinetic | Invalid | Undecided | Unassigned | |
Bug Description
[Impact]
The OSD will fail to trim pg log dup entries, which can result in millions of dup entries for a PG when it should hold at most 3000 (controlled by the option osd_pg_log_dups_tracked).
This can cause the OSD to run out of memory and crash, and it may then be unable to start up again because it has to load those millions of dup entries at boot. This can happen to multiple OSDs at the same time (as also reported by many community users), so a cluster can become completely unusable when it hits this issue.
The currently known trigger for this problem is pg split, because the whole set of dup entries is copied during a pg split. The reason this was not observed as often before is that pg autoscaling was not previously enabled by default; it is on by default since Octopus.
Note that there is also no way to check the number of dups in a PG online.
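The dups can, however, be inspected offline with ceph-objectstore-tool while the OSD is stopped. A minimal sketch, assuming the data path and pgid below (both illustrative) match your deployment:

# Dump the pg log, including dup entries, from a stopped OSD.
ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-0 \
    --pgid 2.0 --op log > pg_2.0_log.json
# Count the dups; a healthy PG should stay near osd_pg_log_dups_tracked (3000).
jq '.pg_log_t.dups | length' pg_2.0_log.json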
[Test Plan]
To see the problem, follow this approach on a test cluster with, e.g., 3 OSDs:
#ps -eaf | grep osd
root 334891 1 0 Sep21 ? 00:42:03 /home/nikhil/
root 335541 1 0 Sep21 ? 00:40:20 /home/nikhil/
kill all OSDs, so they're down,
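One way to do that on a cluster like this one, where the OSD binaries were started by hand (a sketch; on a systemd-managed host you would stop the units instead):

# Kill the manually started ceph-osd processes.
pkill -f ceph-osd
# On a systemd-managed host this would instead be:
#   systemctl stop ceph-osd.target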
root@focal-
2022-09-
2022-09-
cluster:
id: 9e7c0a82-
health: HEALTH_WARN
2 osds down
1 host (3 osds) down
1 root (3 osds) down
Reduced data availability: 169 pgs stale
services:
mon: 3 daemons, quorum a,b,c (age 3s)
mgr: x(active, since 28h)
mds: a:1 {0=a=up:active}
osd: 3 osds: 0 up (since 83m), 2 in (since 91m)
rgw: 1 daemon active (8000)
task status:
data:
pools: 7 pools, 169 pgs
objects: 255 objects, 9.5 KiB
usage: 4.1 GiB used, 198 GiB / 202 GiB avail
pgs: 255/765 objects degraded (33.333%)
105 stale+active+
64 stale+active+
Then inject dups into all OSDs using this JSON,
root@nikhil-
[
{"reqid": "client.4177.0:0",
"version": "3'0",
"user_version": "0",
"generate": "500000",
"return_code": "0"}
]
Use the ceph-objectstore-tool to inject the dups into each OSD (see the sketch after these commands):
root@focal-
root@focal-
root@focal-
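A sketch of what these injection commands look like, assuming the pg-log-inject-dups operation of ceph-objectstore-tool and vstart-style data paths; the paths, pgid, and file name are illustrative:

# Inject the dups from dups.json into one PG on each (stopped) OSD.
for osd in 0 1 2; do
    ceph-objectstore-tool --data-path dev/osd$osd \
        --op pg-log-inject-dups --pgid 2.0 --file dups.json
done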
Then set the osd debug level to 20 (this is the log line that actually does the trimming: https:/
Set debug osd=20 under [global] in ceph.conf:
root@focal-
debug osd=20
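A minimal ceph.conf fragment for this, assuming the [global] section as stated above:

[global]
        debug osd = 20

After restarting an OSD, the active level can be confirmed with: ceph daemon osd.0 config get debug_osd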
Then bring up the OSDs
/home/nikhil/
/home/nikhil/
/home/nikhil/
Run some IO on the OSDs. Wait at least a few hours.
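A hedged example of generating that IO with rbd bench-write, assuming a pool named testpool and a freshly created image (both names illustrative):

# Create a test image and drive sustained writes to it.
rbd create testpool/benchimg --size 10G
rbd bench-write testpool/benchimg --io-size 4096 --io-threads 16 --io-total 2G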
Then take the OSDs down (so the command below can be run), and run,
root@focal-
At the end of that output, in the file op.log, you will see that the number of dups is still what it was when they were injected (no trimming has taken place); a one-liner to count them follows the excerpt below:
{
},
{
}
]
},
"pg_missing_t": {
"missing": [],
}
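To count them from that output, a sketch using jq, assuming the pg_log_t.dups structure shown above:

jq '.pg_log_t.dups | length' op.log

With 500000 dups injected per PG and no trimming, the count should still be of that order.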
To verify the patch:
With the patch in place, once the dups are injected, verify them in the output of ./bin/ceph-objectstore-tool as above.
Then bring up the OSDs and start IO using rbd bench-write; leave the IO running for a few hours, until these logs appear (https:/
root@focal-
2022-09-
...
...
2022-09-
# grep -ri "trim dup " *.log | grep 4177 | wc -l
390001 <-- total across all OSDs; with 3 OSDs, for example, this should be roughly 3x what is seen in the output below (dups trimmed up to 130001). This count of trimmed dup log entries is from all OSDs combined.
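A per-OSD breakdown can be obtained the same way; the log file names here are illustrative:

for f in osd.0.log osd.1.log osd.2.log; do
    echo -n "$f: "; grep "trim dup " "$f" | grep -c 4177
done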
And the output of ./bin/ceph-objectstore-tool will show:
"dups": [
{
},
{
},
This verifies that the dups are being trimmed by the patch and that it works correctly. And of course, the OSDs should not go OOM at boot time!
[Where problems could occur]
This is not a clean cherry-pick, due to differences between the Octopus and master codebases related to RocksDBStore and ObjectStore (see https:/
Also, an earlier attempt to fix this issue upstream was reverted, as discussed at https:/
While this fix has been tested and validated after building it into the upstream 15.2.17 release (see the [Test Plan] section), we still need to proceed with extreme caution: allow some time for any problems to surface before going ahead with this SRU, and run our QA tests on the packages that carry this fix on top of 15.2.17 before releasing it to the customers who await this fix on Octopus.
[Other Info]
The fix is for PGLog to trim duplicates by the number of entries rather than by version, which prevents unbounded duplicate growth.
Reported upstream at https:/
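For clusters already hit by this, where OSDs cannot boot, newer ceph-objectstore-tool builds also provide an offline trim operation. A sketch, assuming the trim-pg-log-dups op is available; the path, pgid, and limits are illustrative:

# With the OSD stopped, trim the accumulated dup entries offline.
ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-0 \
    --op trim-pg-log-dups --pgid 2.0 \
    --osd_max_pg_log_entries=100 --osd_pg_log_dups_tracked=100 \
    --osd_pg_log_trim_max=500000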
tags: added: seg
description: updated (×5)
tags: added: verification-done-focal; removed: verification-failed-focal
tags: added: verification-ussuri-done; removed: verification-ussuri-needed
Changed in ceph (Ubuntu Bionic): status: New → Invalid
Upstream reverted https://github.com/ceph/ceph/pull/45529 and https://github.com/ceph/ceph/pull/46253 (see https://github.com/ceph/ceph/pull/46610 and https://github.com/ceph/ceph/pull/46611).
The proper fix now is through https://github.com/ceph/ceph/pull/47046