fsck.xfs doesn't call xfs_repair in force mode from systemd unit during the system startup

Bug #2071474 reported by Marcin Wilk
This bug affects 1 person
Affects            Status       Importance  Assigned to  Milestone
xfsprogs (Ubuntu)  In Progress  Undecided   gerald.yang

Bug Description

Ubuntu 20.04.6 LTS
xfsprogs 5.3.0-1ubuntu2

According to the fsck.xfs(8) man page [1], an administrator can force xfs_repair to run on every boot by adding "fsck.mode=force" to the kernel boot options. This does not work. The culprit is this check in the xfs_fsck.sh file [2]. It is one of the checks meant to prevent fsck.xfs from running in force mode from an interactive session, but it apparently also blocks its execution from the systemd unit.

According to the systemd-fsck@.service man page [3], a separate service is instantiated for each device that is configured for file system checking. That service calls /lib/systemd/systemd-fsck (/usr/lib/systemd/systemd-fsck), which is part of the systemd package and is built from the src/fsck/fsck.c source [4]. The tool calls fsck.xfs with -f if fsck.mode=force is present on the kernel command line [5][6].
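
To make the interaction concrete, here is a minimal sketch of the decision described above. It is a simplified model, not the real xfs_fsck.sh: the function name decide_force is hypothetical, and only the -f option and the guard quoted in step 10 are modelled.

```shell
#!/bin/sh
# Simplified model of the force decision; NOT the real xfs_fsck.sh.
# decide_force is a hypothetical name used only for this illustration.
decide_force() {
    force=false
    for arg in "$@"; do
        case "$arg" in
            -f) force=true ;;   # systemd-fsck passes -f for fsck.mode=force
        esac
    done
    # Guard from the report, written as two tests joined by ||, which is
    # equivalent to [ -n "$PS1" -o -t 0 ]: a non-empty PS1 or a terminal
    # on stdin is taken to mean "interactive", and the force request is
    # dropped.
    if [ -n "$PS1" ] || [ -t 0 ]; then
        force=false
    fi
    echo "$force"
}

decide_force "$@"
```

Called as `PS1= decide_force -f /dev/vdb1 </dev/null` it prints true; with a non-empty PS1 or a terminal on stdin it prints false even though -f was given. Which of the two conditions holds inside the systemd unit is not established in this report.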

Steps to reproduce:
1. install Ubuntu Focal with two disks (one for the system, the other for test xfs partition)
2. create partition and xfs file system on the second drive (in my case it is /dev/vdb (the disk), /dev/vdb1 (the xfs partition))
3. create a mount point dir (e.g. /data) and add an appropriate line to /etc/fstab, e.g.:
/dev/vdb1 /data xfs defaults 0 2
4. sudo mount --all to test the above, next: sudo umount /data
5. break the xfs fs on /dev/vdb1:
sudo xfs_db -x -c blockget -c "blocktrash -s 512109 -n 1000" /dev/vdb1
6. verify it's broken:
sudo mount --all
7. add "fsck.mode=force" to the kernel boot options in grub (/etc/default/grub):
GRUB_CMDLINE_LINUX_DEFAULT="console=tty1 console=ttyS0 fsck.mode=force"
8. sudo update-grub
9. sudo reboot; # watch console during the startup for messages similar to the following:

         Starting File System Check on /dev/vdb1...
[ OK ] Started File System Check Daemon to report status.
[ OK ] Finished File System Check on /dev/vdb1.
         Mounting /data...
[ 2.592015] XFS (vdb1): Metadata CRC error detected at xfs_inobt_read_verify+0x1a/0xc0 [xfs], xfs_inobt block 0x8
[ 2.593023] XFS (vdb1): Unmount and run xfs_repair
[ 2.593579] XFS (vdb1): First 128 bytes of corrupted metadata buffer:
[ 2.594195] 00000000: b9 be bd cc 0f 00 00 01 ff ff ff ff ff ff ff ff ................
[ 2.596159] 00000010: 00 00 00 00 00 00 00 08 00 00 00 00 00 00 00 00 ................
[ 2.596929] 00000020: 70 ef 5b b8 a3 e5 4e 12 a0 f3 45 c9 0a 20 11 3f p.[...N...E.. .?
[ 2.597807] 00000030: 00 00 00 00 28 a7 30 1d 00 00 00 80 00 00 40 3d ....(.0.......@=
[ 2.598661] 00000040: ff ff ff ff ff ff ff f8 00 00 00 00 00 00 00 00 ................
[ 2.599451] 00000050: 00 00 00 00 00 00 00 00 00 c0 ff 07 00 00 00 00 ................
[ 2.600244] 00000060: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
[ 2.601034] 00000070: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
[ 2.601877] XFS (vdb1): metadata I/O error in "xfs_trans_read_buf_map" at daddr 0x8 len 8 error 74
[FAILED] Failed to mount /data.

10. comment out lines 61-63 in /sbin/fsck.xfs (in the diff below, ~/fsck.xfs is a copy of the original):

diff /sbin/fsck.xfs ~/fsck.xfs
61,63c61,63
< #if [ -n "$PS1" -o -t 0 ]; then
< # FORCE=false
< #fi
---
> if [ -n "$PS1" -o -t 0 ]; then
> FORCE=false
> fi

11. repeat step 9. (reboot). The system will repair /dev/vdb1 and mount it automatically:

         Starting File System Check on /dev/vdb1...
[ OK ] Started File System Check Daemon to report status.
[ OK ] Mounted Mount unit for core20, revision 2318.
[ OK ] Mounted Mount unit for lxd, revision 24061.
[ OK ] Mounted Mount unit for snapd, revision 21759.
[ OK ] Finished File System Check on /dev/vdb1.
[ OK ] Reached target Mounted snaps.
         Mounting /data...
[ OK ] Mounted /data.
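
When repeating steps 7-9, it can help to confirm that the option actually reached the kernel after the reboot. A minimal sketch, assuming the command line is readable from /proc/cmdline; has_fsck_force is a hypothetical helper name:

```shell
#!/bin/sh
# has_fsck_force is a hypothetical helper: it succeeds when the given
# kernel command line contains the exact token fsck.mode=force.
has_fsck_force() {
    # Pad with spaces so only a whole, space-separated token matches.
    case " $1 " in
        *" fsck.mode=force "*) return 0 ;;
    esac
    return 1
}

# On a running system the kernel command line is exposed in /proc/cmdline.
if [ -r /proc/cmdline ] && has_fsck_force "$(cat /proc/cmdline)"; then
    echo "fsck.mode=force is active"
else
    echo "fsck.mode=force is not active"
fi
```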

journalctl confirms that xfs_repair ran on the disk:
date; journalctl --no-pager -u <email address hidden> --since "5 minutes ago"
Fri Jun 28 14:37:19 UTC 2024
-- Logs begin at Fri 2024-06-28 07:53:49 UTC, end at Fri 2024-06-28 14:36:24 UTC. --
Jun 28 14:34:12 xfs systemd[1]: <email address hidden>: Succeeded.
Jun 28 14:34:12 xfs systemd[1]: Stopped File System Check on /dev/vdb1.
-- Reboot --
Jun 28 14:34:28 xfs systemd[1]: Starting File System Check on /dev/vdb1...
Jun 28 14:34:28 xfs systemd-fsck[564]: Phase 1 - find and verify superblock...
Jun 28 14:34:28 xfs systemd-fsck[564]: Phase 2 - using internal log
Jun 28 14:34:28 xfs systemd-fsck[564]: - zero log...
Jun 28 14:34:28 xfs systemd-fsck[564]: - scan filesystem freespace and inode maps...
Jun 28 14:34:28 xfs systemd-fsck[564]: Metadata CRC error detected at 0x564c48e69d0d, xfs_bnobt block 0x20/0x1000
Jun 28 14:34:28 xfs systemd-fsck[564]: btree block 0/4 is suspect, error -74
.....
Jun 28 14:34:28 xfs systemd-fsck[564]: UUID mismatch on inode 191
Jun 28 14:34:28 xfs systemd-fsck[564]: cleared inode 191
Jun 28 14:34:28 xfs systemd-fsck[564]: - agno = 1
Jun 28 14:34:28 xfs systemd-fsck[564]: - agno = 2
Jun 28 14:34:28 xfs systemd-fsck[564]: - agno = 3
Jun 28 14:34:28 xfs systemd-fsck[564]: - process newly discovered inodes...
Jun 28 14:34:28 xfs systemd-fsck[564]: Phase 4 - check for duplicate blocks...
Jun 28 14:34:28 xfs systemd-fsck[564]: - setting up duplicate extent list...
Jun 28 14:34:28 xfs systemd-fsck[564]: - check for inodes claiming duplicate blocks...
Jun 28 14:34:28 xfs systemd-fsck[564]: - agno = 0
Jun 28 14:34:28 xfs systemd-fsck[564]: Cleared next_unlinked in inode 128
Jun 28 14:34:28 xfs systemd-fsck[564]: - agno = 1
Jun 28 14:34:28 xfs systemd-fsck[564]: - agno = 2
Jun 28 14:34:28 xfs systemd-fsck[564]: - agno = 3
Jun 28 14:34:28 xfs systemd-fsck[564]: Phase 5 - rebuild AG headers and trees...
Jun 28 14:34:28 xfs systemd-fsck[564]: - reset superblock...
Jun 28 14:34:28 xfs systemd-fsck[564]: Phase 6 - check inode connectivity...
Jun 28 14:34:28 xfs systemd-fsck[564]: reinitializing root directory
Jun 28 14:34:28 xfs systemd-fsck[564]: reinitializing realtime bitmap inode
Jun 28 14:34:28 xfs systemd-fsck[564]: reinitializing realtime summary inode
Jun 28 14:34:28 xfs systemd-fsck[564]: - resetting contents of realtime bitmap and summary inodes
Jun 28 14:34:28 xfs systemd-fsck[564]: - traversing filesystem ...
Jun 28 14:34:28 xfs systemd-fsck[564]: - traversal finished ...
Jun 28 14:34:28 xfs systemd-fsck[564]: - moving disconnected inodes to lost+found ...
Jun 28 14:34:28 xfs systemd-fsck[564]: Phase 7 - verify and correct link counts...
Jun 28 14:34:28 xfs systemd-fsck[564]: done
Jun 28 14:34:28 xfs systemd[1]: Finished File System Check on /dev/vdb1.
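
The same check can be scripted. This is only a sketch: repair_ran is a hypothetical helper that looks for the first xfs_repair phase banner from the log above, and it selects messages by the systemd-fsck syslog identifier rather than by unit name.

```shell
#!/bin/sh
# repair_ran is a hypothetical helper: it reads log text on stdin and
# succeeds if the "Phase 1" banner printed by xfs_repair is present.
repair_ran() {
    grep -q "Phase 1 - find and verify superblock"
}

# Scan the current boot's messages logged under the systemd-fsck
# identifier; skipped entirely when journalctl is not available.
if command -v journalctl >/dev/null 2>&1; then
    if journalctl -b -t systemd-fsck --no-pager 2>/dev/null | repair_ran; then
        echo "xfs_repair output found for this boot"
    else
        echo "no xfs_repair output found for this boot"
    fi
fi
```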

[1] https://manpages.ubuntu.com/manpages/focal/man8/fsck.xfs.8.html
[2] https://git.launchpad.net/ubuntu/+source/xfsprogs/tree/fsck/xfs_fsck.sh?h=ubuntu/focal#n61
[3] https://manpages.ubuntu.com/manpages/focal/man8/systemd-fsck-root.service.8.html
[4] https://git.launchpad.net/~ubuntu-core-dev/ubuntu/+source/systemd/tree/meson.build?h=ubuntu-focal#n2573
[5] https://git.launchpad.net/~ubuntu-core-dev/ubuntu/+source/systemd/tree/src/fsck/fsck.c?h=ubuntu-focal#n84
[6] https://git.launchpad.net/~ubuntu-core-dev/ubuntu/+source/systemd/tree/src/fsck/fsck.c?h=ubuntu-focal#n387

Marcin

Marcin Wilk (wilkmarcin)
description: updated
Dan Hill (hillpd)
Changed in xfsprogs (Ubuntu):
status: New → Triaged
Dan Hill (hillpd)
Changed in xfsprogs (Ubuntu):
status: Triaged → Confirmed
Changed in xfsprogs (Ubuntu):
assignee: nobody → gerald.yang (gerald-yang-tw)
status: Confirmed → In Progress
gerald.yang (gerald-yang-tw) wrote :

I've sent a patch to the linux-xfs mailing list:
https://marc.info/?l=linux-xfs&m=172353406613913&w=2

Once it's accepted and merged into upstream xfsprogs, I will SRU it back to Ubuntu xfsprogs.
