fstrim corrupts ocfs2 filesystems when clustered
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
util-linux (Ubuntu) | Expired | Undecided | Unassigned |
Bug Description
We recently upgraded from trusty to xenial and found that our ocfs2 filesystems, which are mounted on several nodes simultaneously, became corrupt over the weekend:
[Sun Apr 9 06:46:35 2017] OCFS2: ERROR (device dm-2): ocfs2_validate_
[Sun Apr 9 06:46:35 2017] On-disk corruption discovered. Please run fsck.ocfs2 once the filesystem is unmounted.
[Sun Apr 9 06:46:35 2017] OCFS2: File system is now read-only.
[Sun Apr 9 06:46:35 2017] (fstrim,
[Sun Apr 9 06:46:35 2017] OCFS2: ERROR (device dm-3): ocfs2_validate_
[Sun Apr 9 06:46:36 2017] On-disk corruption discovered. Please run fsck.ocfs2 once the filesystem is unmounted.
[Sun Apr 9 06:46:36 2017] OCFS2: File system is now read-only.
[Sun Apr 9 06:46:36 2017] (fstrim,
We found that the cron.weekly job runs very close to that time:
47 6 * * 7 root test -x /usr/sbin/anacron || ( cd / && run-parts --report /etc/cron.weekly )
# cat /etc/cron.
#!/bin/sh
# trim all mounted file systems which support it
/sbin/fstrim --all || true
We have disabled this job on all of our servers running clustered ocfs2 filesystems. I think either the utility or the cron job should skip ocfs2 (gfs too?) filesystems.
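As a rough illustration of what "ignore ocfs2" could look like on the cron side, here is a minimal sketch of a replacement script. It is not the packaged job and has not been tested on a cluster. The skip list (`ocfs2 gfs gfs2`), the function names, and the `--run` guard are all assumptions made for this example. It lists mounts with `findmnt` and calls `fstrim` once per mount point rather than using `--all`, so unlike `--all` it does not de-duplicate devices, and errors from filesystems that do not support discard are simply discarded. Mount points containing whitespace are not handled.

```shell
#!/bin/sh
# Hypothetical weekly trim job that skips cluster filesystems.
# Assumption: these are the filesystem types that must not be trimmed.
SKIP_FSTYPES="ocfs2 gfs gfs2"

# Return 0 if the given filesystem type is in the skip list.
is_skipped() {
    for t in $SKIP_FSTYPES; do
        [ "$1" = "$t" ] && return 0
    done
    return 1
}

# Read "TARGET FSTYPE" lines on stdin and print only the targets
# whose filesystem type is not in the skip list.
filter_targets() {
    while read -r target fstype; do
        is_skipped "$fstype" || printf '%s\n' "$target"
    done
}

# Only trim when called with --run, so the functions above can be
# exercised without touching any filesystem.
if [ "${1:-}" = "--run" ]; then
    findmnt -rn -o TARGET,FSTYPE | filter_targets | while read -r target; do
        # Errors (e.g. discard not supported) are ignored, like "|| true"
        # in the original job.
        fstrim "$target" 2>/dev/null || true
    done
fi
```

Called without arguments the script does nothing. With `--run` it trims every mounted filesystem whose type is not in `SKIP_FSTYPES`.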
Status changed to 'Confirmed' because the bug affects multiple users.