Comment 5 for bug 1587686

LeetMiniWheat (white-phoenix) wrote :

Thanks for looking into this, I'll test that build tonight but I assume I'll see similar results.

In my previous tests with this commit applied I still occasionally ran into the error (albeit far less often; sometimes not at all, sometimes rather quickly), along with some other traces regarding pthreads (from what I understand, the current 0.6.5-release uses pthreads, ASSERTs and other things somewhat incorrectly, and there's a lot of upstream work being done on this in master). Lowlatency kernels seem to fail faster on it too, which is a bit confusing. I still think there are a lot of corner-case bugs in ZFS and ztest. Fully fixing ztest/ZFS/SPL for 0.6.5-release would likely be way too invasive to backport, and it looks like band-aids such as this only delay the inevitable failure.

After much cherry picking, trial & error, and commenting on some upstream commits, I don't believe ztest was intended for end users or as a reliable long-term stress tool, nor does it get as much developer attention for releases, since it's not a real-world test. One upstream developer/maintainer even commented that ztest is intended for ZFS developers (implying end users shouldn't be using it?), which makes me question why it's even included in zfsutils-linux if it's fundamentally broken on release versions. If it's this unreliable, it will create many more false positives for others looking to test the stability of Ubuntu's ZFS, resulting in people thinking ZFS or Ubuntu's ZFS implementation is broken when in fact it may be perfectly fine under real-world workloads.

ztest still works as a short-term test for ZFS functions though, and this commit probably did belong in release (they've marked it for the 0.6.5.8 milestone), but as mentioned above there are many other outstanding issues this tool brings to light (whether false positives or not).

On a side note, I'd be interested in seeing ZFS run under AFL (american fuzzy lop, the fuzzer whose recent use for filesystem fuzzing discovered many upstream bugs in existing kernel filesystems). Many corner-case bugs were found in current filesystems, with fixes incoming for backport to 4.4.13, 4.5.7, and 4.6.2. However, Oracle's AFL presentation at the Linux Foundation event only included the most commonly used in-kernel filesystems.

Sorry if this is noise, but hopefully this will bring more awareness to this issue, which may not even be an issue; the correct fix may be to move ztest to another (dev or debug?) package.