Prevent race condition when printing Inode in ll_sync_inode

Bug #2078906 reported by Chengen Du
This bug affects 2 people
Affects                 Status        Importance  Assigned to  Milestone
ceph (Ubuntu)           In Progress   Medium      Chengen Du
ceph (Ubuntu Focal)     Won't Fix     Undecided   Chengen Du
ceph (Ubuntu Jammy)     Fix Released  Medium      Chengen Du
ceph (Ubuntu Noble)     In Progress   Medium      Chengen Du
ceph (Ubuntu Oracular)  Won't Fix     Medium      Chengen Du
ceph (Ubuntu Plucky)    In Progress   Undecided   Unassigned

Bug Description

[Impact]
In the ll_sync_inode function, the entire Inode structure is printed without holding the client lock, which can race with concurrent modification and lead to the following crash backtrace:
#0 __pthread_kill_implementation (no_tid=0, signo=6, threadid=140705682900544) at ./nptl/pthread_kill.c:44
#1 __pthread_kill_internal (signo=6, threadid=140705682900544) at ./nptl/pthread_kill.c:78
#2 __GI___pthread_kill (threadid=140705682900544, signo=signo@entry=6) at ./nptl/pthread_kill.c:89
#3 0x00007ffa92094476 in __GI_raise (sig=sig@entry=6) at ../sysdeps/posix/raise.c:26
#4 0x00007ffa9207a7f3 in __GI_abort () at ./stdlib/abort.c:79
#5 0x00007ffa910783c3 in ceph::__ceph_assert_fail (assertion=<optimized out>, file=<optimized out>, line=<optimized out>, func=<optimized out>) at ./src/common/assert.cc:75
#6 0x00007ffa91078525 in ceph::__ceph_assert_fail (ctx=...) at ./src/common/assert.cc:80
#7 0x00007ffa7049f602 in xlist<ObjectCacher::Object*>::size (this=0x7ffa20734638, this=0x7ffa20734638) at ./src/include/xlist.h:87
#8 operator<< (os=..., out=warning: RTTI symbol not found for class 'StackStringStream<4096ul>'
...) at ./src/osdc/ObjectCacher.h:760
#9 operator<< (out=warning: RTTI symbol not found for class 'StackStringStream<4096ul>'
..., in=...) at ./src/client/Inode.cc:80
#10 0x00007ffa7045545f in Client::ll_sync_inode (this=0x55958b8a5c60, in=in@entry=0x7ffa20734270, syncdataonly=syncdataonly@entry=false) at ./src/client/Client.cc:14717
#11 0x00007ffa703d0f75 in ceph_ll_sync_inode (cmount=cmount@entry=0x55958b0bd0d0, in=in@entry=0x7ffa20734270, syncdataonly=syncdataonly@entry=0) at ./src/libcephfs.cc:1865
#12 0x00007ffa9050ddc5 in fsal_ceph_ll_setattr (creds=<optimized out>, mask=<optimized out>, stx=0x7ff8983f25a0, i=<optimized out>, cmount=<optimized out>)
    at ./src/FSAL/FSAL_CEPH/statx_compat.h:209
#13 ceph_fsal_setattr2 (obj_hdl=0x7fecc8fefbe0, bypass=<optimized out>, state=<optimized out>, attrib_set=0x7ff8983f2830) at ./src/FSAL/FSAL_CEPH/handle.c:2410
#14 0x00007ffa92371da0 in mdcache_setattr2 (obj_hdl=0x7fecc9e98778, bypass=<optimized out>, state=0x7fef0d64c9b0, attrs=0x7ff8983f2830)
    at ../FSAL/Stackable_FSALs/FSAL_MDCACHE/./src/FSAL/Stackable_FSALs/FSAL_MDCACHE/mdcache_handle.c:1012
#15 0x00007ffa922b2bbc in fsal_setattr (obj=0x7fecc9e98778, bypass=<optimized out>, state=0x7fef0d64c9b0, attr=0x7ff8983f2830) at ./src/FSAL/fsal_helper.c:573
#16 0x00007ffa9234c7bd in nfs4_op_setattr (op=0x7fecad7ac510, data=0x7fecac314a10, resp=0x7fecad1be200) at ../Protocols/NFS/./src/Protocols/NFS/nfs4_op_setattr.c:212
#17 0x00007ffa9232e413 in process_one_op (data=data@entry=0x7fecac314a10, status=status@entry=0x7ff8983f2a2c) at ../Protocols/NFS/./src/Protocols/NFS/nfs4_Compound.c:920
#18 0x00007ffa9232f9e0 in nfs4_Compound (arg=<optimized out>, req=0x7fecad491620, res=0x7fecac054580) at ../Protocols/NFS/./src/Protocols/NFS/nfs4_Compound.c:1327
#19 0x00007ffa922cb0ff in nfs_rpc_process_request (reqdata=0x7fecad491620) at ./src/MainNFSD/nfs_worker_thread.c:1508
#20 0x00007ffa92029be7 in svc_request (xprt=0x7fed640504d0, xdrs=<optimized out>) at ./src/svc_rqst.c:1202
#21 0x00007ffa9202df9a in svc_rqst_xprt_task_recv (wpe=<optimized out>) at ./src/svc_rqst.c:1183
#22 0x00007ffa9203344d in svc_rqst_epoll_loop (wpe=0x559594308e60) at ./src/svc_rqst.c:1564
#23 0x00007ffa920389e1 in work_pool_thread (arg=0x7feeb802ea10) at ./src/work_pool.c:184
#24 0x00007ffa920e6b43 in start_thread (arg=<optimized out>) at ./nptl/pthread_create.c:442
#25 0x00007ffa92178a00 in clone3 () at ../sysdeps/unix/sysv/linux/x86_64/clone3.S:81

Further analysis of the core with GDB shows that both the _front and _back member variables of the xlist<ObjectCacher::Object*> are zero, yet the assertion failure was still triggered.
(gdb) frame 7
#7 0x00007ffa7049f602 in xlist<ObjectCacher::Object*>::size (this=0x7ffa20734638, this=0x7ffa20734638) at ./src/include/xlist.h:87
87 ./src/include/xlist.h: No such file or directory.
(gdb) p *this
$1 = {_front = 0x0, _back = 0x0, _size = 0}
(gdb) frame 6
#6 0x00007ffa91078525 in ceph::__ceph_assert_fail (ctx=...) at ./src/common/assert.cc:80
80 ./src/common/assert.cc: No such file or directory.
(gdb) p ctx
$2 = (const ceph::assert_data &) @0x7ffa70587900: {assertion = 0x7ffa70530598 "(bool)_front == (bool)_size", file = 0x7ffa705305b4 "./src/include/xlist.h", line = 87,
  function = 0x7ffa7053b410 "size_t xlist<T>::size() const [with T = ObjectCacher::Object*; size_t = long unsigned int]"}

Note that in the dump (bool)_front == (bool)_size actually holds (both are zero): the failing check must therefore have observed a transient, inconsistent state produced by a concurrent writer, i.e. a race condition when evaluating the assertion.
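
To illustrate the mechanism, here is a self-contained toy program (not Ceph code; ToyList and its fields are stand-ins for xlist's _front/_size). It deliberately contains a data race, so whether it actually aborts depends on timing, compiler, and CPU; the interleaving is the point: an unlocked reader can trip an invariant of this shape even though every quiescent state, including the one captured in the core, is consistent.

// toy_race.cc -- illustrative only; build with: g++ -O0 -pthread toy_race.cc
// Mimics the xlist<T>::size() invariant "(bool)_front == (bool)_size".
#include <cassert>
#include <cstddef>
#include <thread>

struct ToyList {
  void*  front = nullptr;  // stand-in for xlist::_front
  size_t size_ = 0;        // stand-in for xlist::_size

  // The writer keeps the invariant at rest, but updates the two fields
  // non-atomically, so intermediate states are visible to racing readers.
  void push(void* p) { front = p; size_ = 1; }
  void pop()         { size_ = 0; front = nullptr; }

  size_t size() const {
    assert((bool)front == (bool)size_);  // the check that fired at xlist.h:87
    return size_;
  }
};

int main() {
  ToyList list;
  int dummy = 0;
  std::thread writer([&] {
    for (int i = 0; i < 10000000; ++i) { list.push(&dummy); list.pop(); }
  });
  std::thread reader([&] {               // reads without any lock, like the
    for (int i = 0; i < 10000000; ++i)   // Inode operator<< in ll_sync_inode
      (void)list.size();                 // may abort here
  });
  writer.join();
  reader.join();
  return 0;
}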

[Fix]
It may not be necessary to print the entire Inode structure; simply printing the inode number should be sufficient.
There is an upstream commit that fixes this issue:
commit 2b78a5b3147d4e97be332ca88d286aec0ce44dc3
Author: Chengen Du <email address hidden>
Date: Mon Aug 12 18:17:37 2024 +0800

    client: Prevent race condition when printing Inode in ll_sync_inode

    In the ll_sync_inode function, the entire Inode structure is printed without
    holding a lock. This can lead to a race condition when evaluating the assertion
    in xlist<ObjectCacher::Object*>::size(), resulting in abnormal behavior.

    Fixes: https://tracker.ceph.com/issues/67491

    Co-authored-by: dongdong tao <email address hidden>
    Signed-off-by: Chengen Du <email address hidden>

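The change itself is essentially a one-line log fix. A sketch of its shape (abbreviated and paraphrased; the exact diff is in the commit above):

-  ldout(cct, 3) << "ll_sync_inode " << *in << " " << dendl;      // streams the whole Inode without client_lock
+  ldout(cct, 3) << "ll_sync_inode " << in->ino << " " << dendl;  // inode number only

Streaming *in invokes operator<<(std::ostream&, const Inode&) (src/client/Inode.cc:80 in the trace), which walks ObjectCacher state that is only safe to read under client_lock; printing the inode number needs no such traversal.
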
[Test Plan]
The race condition might be challenging to reproduce, but we can test to ensure that the normal call path functions correctly.
1. Create a Manila share and mount it locally
openstack share type create nfs_share_type False --description "NFS share type"
openstack share create --share-type nfs_share_type --name my_share NFS 1
openstack share list
openstack share access create my_share ip XX.XX.XX.XX/XX
openstack share show my_share
sudo mount -t nfs <export_location_path> <mountpoint>
2. Create a file and change its permissions, ensuring everything works without errors; chmod over NFS exercises the setattr path from the backtrace above (nfs4_op_setattr → ceph_fsal_setattr2 → ceph_ll_sync_inode). Optionally, stress it with the loop shown after these commands.
touch test
chmod 755 test
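
Optionally, the same path can be stressed in a loop to raise the odds of hitting the original race (illustrative; the file names and iteration count are arbitrary):

cd <mountpoint>
for i in $(seq 1 10000); do
  touch "f$i"
  chmod 755 "f$i"    # each chmod drives the setattr -> ll_sync_inode path
done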

[Where problems could occur]
The patch only modifies a log statement so that the race-prone unlocked read no longer occurs.
However, if there are any issues with the patch, it could disrupt Ceph's ll_sync_inode functionality, which is utilized when setting attributes on a manila share via the NFS protocol.

Revision history for this message
Chengen Du (chengendu) wrote :

The problematic code is already wrapped by the lock in Focal, so no fix is needed.

Changed in ceph (Ubuntu Focal):
status: New → Won't Fix
assignee: nobody → Chengen Du (chengendu)
Revision history for this message
Chengen Du (chengendu) wrote :

Debdiff for Jammy

Revision history for this message
Chengen Du (chengendu) wrote :

Debdiff for Noble

Revision history for this message
Chengen Du (chengendu) wrote :

Debdiff for Oracular

Changed in ceph (Ubuntu Jammy):
assignee: nobody → Chengen Du (chengendu)
Changed in ceph (Ubuntu Noble):
assignee: nobody → Chengen Du (chengendu)
Changed in ceph (Ubuntu Oracular):
assignee: nobody → Chengen Du (chengendu)
Changed in ceph (Ubuntu Jammy):
status: New → In Progress
Changed in ceph (Ubuntu Noble):
status: New → In Progress
Changed in ceph (Ubuntu Oracular):
status: New → In Progress
Revision history for this message
Ubuntu Foundations Team Bug Bot (crichton) wrote :

The attachment "lp2078906-ceph-jammy.debdiff" seems to be a debdiff. The ubuntu-sponsors team has been subscribed to the bug report so that they can review and hopefully sponsor the debdiff. If the attachment isn't a patch, please remove the "patch" flag from the attachment, remove the "patch" tag, and if you are member of the ~ubuntu-sponsors, unsubscribe the team.

[This is an automated message performed by a Launchpad user owned by ~brian-murray, for any issue please contact him.]

tags: added: patch
Revision history for this message
Mauricio Faria de Oliveira (mfo) wrote (last edit ):

Hi Chengen,

Thanks for the bug report and congrats on the upstream fix!

In Ubuntu, Ceph is owned by Openstack and Ceph Engineering team.

The usual policy for Ubuntu stable releases for Ceph (as far as I know/recall) is to prefer upstream point releases (so as to at least save on testing time with all bug-fixes/changes together), but I'm aware there are exceptions, especially more recently when upstream point releases and their adoption into Ubuntu don't happen as often.

So, please, let's first get a confirmation from James Page on which approach they would prefer:

A) Upstream backports (i.e., the master tracker has backports trackers open for Quincy/Reef/Squid; and Quincy would meet the needs/bug-tasks down to Jammy), then wait for and SRU the next upstream point releases;

B) Independently of upstream backports/point releases, SRU this individual patch.

I'll ask James to review this for input.

Thanks,
Mauricio

Changed in ceph (Ubuntu Oracular):
status: In Progress → Incomplete
Changed in ceph (Ubuntu Noble):
status: In Progress → Incomplete
Changed in ceph (Ubuntu Jammy):
status: In Progress → Incomplete
Changed in ceph (Ubuntu Oracular):
importance: Undecided → Medium
Changed in ceph (Ubuntu Noble):
importance: Undecided → Medium
Changed in ceph (Ubuntu Jammy):
importance: Undecided → Medium
Revision history for this message
Mauricio Faria de Oliveira (mfo) wrote :

Chengen,

> [Test Plan]
> The race condition might be challenging to reproduce, [...]

You may want to try synchronizing threads with GDB;
please see bug 1994002 comment 21 for an example with QEMU, and the sketch below.
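
A minimal sketch of that technique for this bug (the breakpoint site and thread id are hypothetical; pick the actual reader/writer locations from the backtrace):

(gdb) break src/client/Inode.cc:80   # hold the unlocked reader mid-print
(gdb) run
# ...once a thread stops at the breakpoint:
(gdb) set scheduler-locking on       # continuing now runs only the current thread
(gdb) thread <writer-thread-id>      # switch to a thread that mutates the xlist
(gdb) continue                       # let the writer run past the stopped reader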

Hope this helps,
Mauricio

Revision history for this message
Chengen Du (chengendu) wrote :

Kindly provide an update on this bug if there have been any developments.

Revision history for this message
Dan Bungert (dbungert) wrote :

Untagging sponsors, please resubscribe when ready.

Revision history for this message
Chengen Du (chengendu) wrote :

A quick status update: we're currently waiting for the upstream backport work to be completed.
https://github.com/ceph/ceph/pull/59621
https://github.com/ceph/ceph/pull/59620
https://github.com/ceph/ceph/pull/59619

Revision history for this message
Ponnuvel Palaniyappan (pponnuvel) wrote :

https://github.com/ceph/ceph/pull/59621
https://github.com/ceph/ceph/pull/59620

have been merged already for Squid and Reef respectively.

https://github.com/ceph/ceph/pull/59619 (for Quincy) has been rejected by Ceph upstream because Quincy has reached EOL.

But we'll backport the fix in Ubuntu packages - currently planned to be done as part of 17.2.9 point release which is tracked under https://bugs.launchpad.net/bugs/2100625.

Revision history for this message
Ural Tunaboyu (uralt) wrote :

Ubuntu 24.10 (Oracular Oriole) has reached end of life, so this bug will not be fixed for that specific release.

Changed in ceph (Ubuntu Oracular):
status: Incomplete → Won't Fix
Revision history for this message
Chris Halse Rogers (raof) wrote :

Noble, Plucky, and Questing will be fixed with 19.2.3 - currently being processed through https://bugs.launchpad.net/ubuntu/+source/ceph/+bug/2121484

Changed in ceph (Ubuntu):
status: Incomplete → In Progress
Changed in ceph (Ubuntu Noble):
status: Incomplete → In Progress
Revision history for this message
Chris Halse Rogers (raof) wrote : Please test proposed package

Hello Chengen, or anyone else affected,

Accepted ceph into jammy-proposed. The package will build now and be available at https://launchpad.net/ubuntu/+source/ceph/17.2.9-0ubuntu0.22.04.1 in a few hours, and then in the -proposed repository.

Please help us by testing this new package. See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Your feedback will aid us getting this update out to other Ubuntu users.

If this package fixes the bug for you, please add a comment to this bug, mentioning the version of the package you tested, what testing has been performed on the package and change the tag from verification-needed-jammy to verification-done-jammy. If it does not fix the bug for you, please add a comment stating that, and change the tag to verification-failed-jammy. In either case, without details of your testing we will not be able to proceed.

Further information regarding the verification process can be found at https://wiki.ubuntu.com/QATeam/PerformingSRUVerification . Thank you in advance for helping!

N.B. The updated package will be released to -updates after the bug(s) fixed by this package have been verified and the package has been in -proposed for a minimum of 7 days.

Changed in ceph (Ubuntu Jammy):
status: Incomplete → Fix Committed
tags: added: verification-needed verification-needed-jammy
Revision history for this message
Chris Halse Rogers (raof) wrote :

Note that while this has been accepted to Jammy and will shortly be ready for testing, releasing from jammy-proposed will need to wait until the subsequent releases (Noble, Plucky, Questing) also have a fix.

Changed in ceph (Ubuntu Plucky):
status: New → In Progress
Revision history for this message
Seyeong Kim (seyeongkim) wrote (last edit ):

As the symptom could not be reliably reproduced in the operating environment, the following verification was performed.

# juju status
Model Controller Cloud/Region Version SLA Timestamp
ceph maas-default maas/default 2.9.52 unsupported 12:10:29Z

App Version Status Scale Charm Channel Rev Exposed Message
ceph-fs 17.2.7 active 1 ceph-fs quincy/stable 194 no Unit is ready
ceph-mon 17.2.7 active 1 ceph-mon quincy/stable 388 no Unit is ready and clustered
ceph-osd 17.2.7 active 3 ceph-osd quincy/stable 753 no Unit is ready (1 OSD)
ubuntu active 3 ubuntu stable 26 no

Unit Workload Agent Machine Public address Ports Message
ceph-fs/0* active idle 7 10.0.0.142 Unit is ready
ceph-mon/0* active idle 3 10.0.0.138 Unit is ready and clustered
ceph-osd/0 active idle 0 10.0.0.128 Unit is ready (1 OSD)
ceph-osd/1* active idle 1 10.0.0.129 Unit is ready (1 OSD)
ceph-osd/2 active idle 2 10.0.0.130 Unit is ready (1 OSD)
ubuntu/0* active idle 4 10.0.0.139
ubuntu/1 active idle 5 10.0.0.140
ubuntu/2 active idle 6 10.0.0.141

Machine State Address Inst id Series AZ Message
0 started 10.0.0.128 node-28 jammy default Deployed
1 started 10.0.0.129 node-29 jammy default Deployed
2 started 10.0.0.130 node-30 jammy default Deployed
3 started 10.0.0.138 node-38 jammy default Deployed
4 started 10.0.0.139 node-39 jammy default Deployed
5 started 10.0.0.140 node-40 jammy default Deployed
6 started 10.0.0.141 node-41 jammy default Deployed
7 started 10.0.0.142 node-42 jammy default Deployed

The ceph-mon/0 unit was running both the Ceph monitor and an NFS-Ganesha setup.

On ubuntu/0,1,2, the NFS-Ganesha export was mounted at /mnt/nfs.

root@node-39:/mnt/nfs# mount
...
10.0.0.138:/cephfs on /mnt/nfs type nfs4 (rw,relatime,vers=4.1,rsize=1048576,wsize=1048576,namlen=255,hard,proto=tcp,timeo=600,retrans=2,sec=sys,clientaddr=10.0.0.139,local_lock=none,addr=10.0.0.138)
root@node-39:/mnt/nfs# ls
stress.sh stressdir test

After running a large number of operations (touch, chmod, and others [1]), ubuntu/0,1,2 became unresponsive after several minutes. The `ls` command no longer worked under /mnt/nfs on any of the nodes. I had to reboot 3 nodes.

I then upgraded Ceph using the proposed packages. Because ceph-mon and NFS-Ganesha were running on the same node, I upgraded all Ceph components together.

I then ran the same script for over 15 minutes, and it worked without issues.

I believe this can be considered a proper verification.

ce...

tags: added: verification-done verification-done-jammy
removed: verification-needed verification-needed-jammy
Revision history for this message
Julian Andres Klode (juliank) wrote : Update Released

The verification of the Stable Release Update for ceph has completed successfully and the package is now being released to -updates. Subsequently, the Ubuntu Stable Release Updates Team is being unsubscribed and will not receive messages about this bug report. In the event that you encounter a regression using the package from -updates please report a new bug using ubuntu-bug and tag the bug report regression-update so we can easily find any regressions.

Revision history for this message
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package ceph - 17.2.9-0ubuntu0.22.04.1

---------------
ceph (17.2.9-0ubuntu0.22.04.1) jammy; urgency=medium

  * New upstream point release (LP: #2100625):
    - d/p/lp2078906-inode-print.patch: Prevent race condition when
      printing inode in ll_sync_inode. (LP: #2078906)
    - d/p/fix-lvm-devices.patch: Drop, included in release.
    - d/p/fix-snapmapper-ftbfs.patch: Moved from hunk in
      d/p/fix-lvm-devices.patch to its own file.

 -- Luciano Lo Giudice <email address hidden> Thu, 21 Aug 2025 11:33:49 +0000

Changed in ceph (Ubuntu Jammy):
status: Fix Committed → Fix Released