[SRU] accept undecodable multi-block bluefs transactions on log

Bug #1945555 reported by gerald.yang
This bug affects 1 person
Affects        Status       Importance  Assigned to  Milestone
ceph (Ubuntu)  In Progress  Undecided   gerald.yang
Bionic         In Progress  Undecided   gerald.yang

Bug Description

[Impact]
A multi-block bluefs transaction can be left incomplete on disk after an unexpected power-down.
In this case, log replay should stop at the undecodable transaction instead of throwing an unrecoverable error.

[Test Case]
It is too difficult to simulate a power outage during a multi-block transaction on disk, so I tested this patch by simulating a multi-block transaction and triggering a decode error in this block:
    try {
      auto p = bl.cbegin();
      decode(t, p);
      seen_recs = true;
    }

Add the following line right after decode(t, p) to throw an error:
    throw buffer::malformed_input("error test");

According to the patch description (https://github.com/ceph/ceph/pull/42830), this error is treated as a normal bluefs log replay stop condition and will *not* prevent the OSD from starting.
After the test error is triggered, the OSD still starts normally.
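
The behaviour being tested can be sketched outside of Ceph as follows. This is only an illustration, not the patch itself: Record, decode_record, replay_log and the local malformed_input type are hypothetical stand-ins, and the rule that a decode failure is still fatal for a single-block transaction is an assumption made for the sketch.

```cpp
#include <stdexcept>
#include <vector>

// Stand-in for buffer::malformed_input; this is NOT the Ceph type.
struct malformed_input : std::runtime_error {
  using std::runtime_error::runtime_error;
};

// Hypothetical log record: `multi_block` marks a transaction that spans
// more than one block, `corrupt` marks one that cannot be decoded.
struct Record {
  bool multi_block;
  bool corrupt;
};

// Hypothetical decoder that fails the same way the injected throw does.
void decode_record(const Record& r) {
  if (r.corrupt)
    throw malformed_input("error test");
}

// Returns the number of records replayed, or -1 if the log is unrecoverable.
int replay_log(const std::vector<Record>& log) {
  int replayed = 0;
  for (const Record& r : log) {
    try {
      decode_record(r);
    } catch (const malformed_input&) {
      if (r.multi_block)
        break;     // possibly a torn write: treat as the end of the log
      return -1;   // assumed for this sketch: other decode failures stay fatal
    }
    ++replayed;
  }
  return replayed;
}
```

In this sketch a log whose second record is an undecodable multi-block transaction replays one record and then stops cleanly, which corresponds to the OSD still starting after the injected error.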

[Where problems could occur]
The upstream PR was created 2 months ago, and Luminous has been EOL upstream for a while, so upstream has neither backported nor tested this change for Luminous.
Backporting this commit also requires backporting some of its dependencies.

[Other Info]
upstream tracker: https://tracker.ceph.com/issues/52079
PR: https://github.com/ceph/ceph/pull/42830

Changed in ceph (Ubuntu Bionic):
assignee: nobody → gerald.yang (gerald-yang-tw)
Changed in ceph (Ubuntu):
assignee: nobody → gerald.yang (gerald-yang-tw)
Changed in ceph (Ubuntu Bionic):
status: New → In Progress
Changed in ceph (Ubuntu):
status: New → In Progress
gerald.yang (gerald-yang-tw) wrote :

first patch

tags: added: sts sts-sru-needed verification-needed-bionic
gerald.yang (gerald-yang-tw) wrote :

second patch

gerald.yang (gerald-yang-tw) wrote :

This PR https://github.com/ceph/ceph/pull/42830 needs an additional dependency (bionic1.patch)

Ubuntu Foundations Team Bug Bot (crichton) wrote :

The attachment "bionic1.patch" seems to be a patch. If it isn't, please remove the "patch" flag from the attachment, remove the "patch" tag, and if you are a member of the ~ubuntu-reviewers, unsubscribe the team.

[This is an automated message performed by a Launchpad user owned by ~brian-murray, for any issues please contact him.]

tags: added: patch
gerald.yang (gerald-yang-tw) wrote :

Update test case

description: updated
description: updated