hfs: concurrent create/unlink can trip -EEXIST on non-existent files

Bug #2056451 reported by Colin Ian King
This bug affects 1 person
Affects          Status     Importance  Assigned to  Milestone
Linux            Confirmed  Medium
linux (Ubuntu)   New        Undecided   Unassigned

Bug Description

Summary:

Create an hfs file system, loop-back mount it, and run the stress-ng filename
stressor to exercise filename create/stat/unlink. The stressor reports unexpected
-EEXIST errors. This can be worked around by adding a sync() call after the
unlink() to ensure metadata is synced.

Kernel: 6.8.0-11-generic

Test case:
sudo apt-get install hfsprogs

dd if=/dev/zero of=fs.img bs=1M count=2048
mkfs.hfs fs.img
sudo mount fs.img /mnt
sudo mkdir /mnt/x
sudo stress-ng --temp-path /mnt/x --filename 8 --filename-opts posix -t 20
stress-ng: info: [132412] setting to a 20 secs run per stressor
stress-ng: info: [132412] dispatching hogs: 8 filename
stress-ng: fail: [132424] filename: open failed on file of length 1 bytes, errno=17 (File exists)
stress-ng: fail: [132428] filename: open failed on file of length 20 bytes, errno=17 (File exists)
stress-ng: fail: [132423] filename: open failed on file of length 30 bytes, errno=17 (File exists)
stress-ng: fail: [132421] filename: open failed on file of length 30 bytes, errno=17 (File exists)
stress-ng: fail: [132428] filename: open failed on file of length 30 bytes, errno=17 (File exists)
stress-ng: fail: [132426] filename: open failed on file of length 23 bytes, errno=17 (File exists)
stress-ng: fail: [132425] filename: open failed on file of length 30 bytes, errno=17 (File exists)
stress-ng: fail: [132428] filename: open failed on file of length 1 bytes, errno=17 (File exists)
stress-ng: fail: [132423] filename: open failed on file of length 7 bytes, errno=17 (File exists)
stress-ng: fail: [132423] filename: open failed on file of length 11 bytes, errno=17 (File exists)
stress-ng: fail: [132426] filename: open failed on file of length 24 bytes, errno=17 (File exists)

Adding a sync() call in the stress-ng stressor fixes the issue:

git diff
diff --git a/stress-filename.c b/stress-filename.c
index a64898fb1..b8266f91e 100644
--- a/stress-filename.c
+++ b/stress-filename.c
@@ -308,6 +308,7 @@ static void stress_filename_test(
                VOID_RET(int, shim_stat(filename, &buf));

                (void)shim_unlink(filename);
+               (void)sync();
        }

        /* exercise dcache lookup of non-existent filename */

sudo stress-ng --temp-path /mnt/x --filename 8 --filename-opts posix -t 20
stress-ng: info: [132461] setting to a 20 secs run per stressor
stress-ng: info: [132461] dispatching hogs: 8 filename
stress-ng: info: [132461] skipped: 0
stress-ng: info: [132461] passed: 8: filename (8)
stress-ng: info: [132461] failed: 0
stress-ng: info: [132461] metrics untrustworthy: 0
stress-ng: info: [132461] successful run completed in 20.05 secs

The sync should not be required, by the way; I added it only to illustrate that there is a racy metadata sync issue in hfs.
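
For anyone who wants to poke at this without stress-ng, here is a minimal sketch of the create/stat/unlink cycle described above. It is not the stress-ng code: churn_names() is a hypothetical helper, the use of creat() and the "churn-" file naming are assumptions, and it has not been confirmed to reproduce the failure. Each name is private to the calling process and is unlinked before it is reused, so any EEXIST it counts would be unexpected.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper (not part of stress-ng): repeatedly create, stat and
 * unlink files in `dir`, cycling through 8 names private to this process.
 * Returns the number of creates that failed with EEXIST, or -1 on any
 * other error. If do_sync is non-zero, sync() is called after each
 * unlink(), mirroring the workaround in the diff above.
 */
int churn_names(const char *dir, int iterations, int do_sync)
{
	char path[4096];
	struct stat st;
	int eexist = 0;

	assert(dir != NULL);

	for (int i = 0; i < iterations; i++) {
		/* The pid keeps names private; i % 8 forces reuse after unlink */
		int n = snprintf(path, sizeof(path), "%s/churn-%ld-%d",
				 dir, (long)getpid(), i % 8);
		if (n < 0 || (size_t)n >= sizeof(path))
			return -1;

		int fd = creat(path, S_IRUSR | S_IWUSR);
		if (fd < 0) {
			if (errno == EEXIST) {
				/* The name was unlinked earlier, so this is the bug */
				eexist++;
				continue;
			}
			return -1;
		}
		(void)close(fd);

		(void)stat(path, &st);
		(void)unlink(path);
		if (do_sync)
			sync();
	}
	return eexist;
}
```

The failing run above used 8 stressor instances, so a fair comparison would call churn_names("/mnt/x", N, 0) from several processes at once and then repeat with do_sync set to 1. On a file system without the problem the return value should be 0 in both cases.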

summary: - hfs: concurrent create/unlink can trip -EEXIST on files
+ hfs: concurrent create/unlink can trip -EEXIST on non-existent files
Changed in linux:
importance: Unknown → Medium
status: Unknown → Confirmed
In , colin.i.king (colin.i.king-linux-kernel-bugs) wrote :

Coarse bisect: 4.4 to 5.15 OK, 6.5.0 onwards fail, so this looks like a regression.

In , colin.i.king (colin.i.king-linux-kernel-bugs) wrote :

Correction, I meant to say: coarse bisect, 5.15 OK, 6.5.0 onwards fail, so this looks like a regression.
