Comment 5 for bug 1767992

na (utsaxc) wrote :

I can confirm this affected me as well. My RAID-10 array is on a separate filesystem from the operating system's SSD; it is mounted at '/raid10' on my server. My application is installed on the non-mdadm SSD under '/opt', but it reads, writes, and executes files on the '/raid10' filesystem (made up of 6x8TB drives with 1 spare).

During a recovery from a failed drive, the array would periodically run into the same issue mentioned here. It happened mostly during write operations; it was less likely during read operations, but it still happened. I had to reboot about every 30-60 minutes because the hang stopped the rebuild from continuing. Once the server was back online, the rebuild would resume.

However, I got fed up with this because the rebuild/resync is supposed to take roughly 9 hours and it was only 29% complete. So on the last reboot I stopped my application and unmounted the RAID-10 (umount /raid10). Once I did that, the rebuild continued through the night without issue and completed the remaining 71%.
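
For anyone trying the same workaround, here is a minimal sketch for checking rebuild progress without mounting or otherwise touching the filesystem on the array. It only parses text in the usual /proc/mdstat layout (a line such as "recovery = 29.0% ..."). The function name mdstat_progress is hypothetical and not part of mdadm, and the sketch assumes the percentage appears as its own whitespace-separated field.

```shell
# Hypothetical helper: print the rebuild percentage(s) found in
# mdstat-formatted text read from stdin. Lines without a percentage
# (for example "resync=DELAYED") produce no output.
mdstat_progress() {
    awk '/recovery|resync/ {
        for (i = 1; i <= NF; i++) {
            if ($i ~ /%$/) {        # field such as "29.0%"
                sub(/%$/, "", $i)   # drop the trailing percent sign
                print $i
            }
        }
    }'
}
```

Usage would be something like "mdstat_progress < /proc/mdstat", run by hand or in a loop with sleep. Reading /proc/mdstat does not go through the filesystem on the array, so it should be compatible with keeping '/raid10' unmounted during the rebuild.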

... So it seems like you shouldn't interact with the RAID during a rebuild with the latest kernel/xfs/mdadm on Ubuntu 18.04. Here are the versions I'm currently running:

==========
root@server:~# uname -r
4.15.0-36-generic

root@server:~# dpkg -l | awk '/mdadm/ || /xfsprog/ {print $2,$3}' | column -t
mdadm 4.1~rc1-3~ubuntu18.04.1
xfsprogs 4.9.0+nmu1ubuntu2
==========

Previously, on Ubuntu 16.04 with the 4.4 kernel and the latest mdadm/xfsprogs for 16.04, I didn't have this issue.