Comment 4 for bug 1912224

Max Reitz (xanclic) wrote :

Hi,

As I said on IRC, I’m not sure this additional block_status argument would be a good idea, because the hole offset needs to be reset when the file is written to (at least on zero writes; if we additionally stored a data offset, then that would need to be reset on all writes). Technically, mirror can do that, because all writes should go through it, but mirror doesn’t seem like the right place for this cache. Furthermore, depending on how often writes occur, this cache may end up not doing much.

We could place it in file-posix instead (i.e., it would store the last offset where SEEK_HOLE/DATA was invoked and the last offset that they returned, so if a block_status request arrives within that range, it can be answered without doing a SEEK_HOLE/DATA), but that might suffer from the same problem of having to invalidate the cache too often.
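To make the single-range idea concrete, here is a rough sketch in Python (not QEMU code, and not how file-posix is implemented). The class name SeekCache and all of its methods are made up for illustration. It assumes the cache holds exactly one range and conservatively drops it on any overlapping write, whether data or zeroes.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class SeekCache:
    """Hypothetical cache of the most recent SEEK_HOLE/SEEK_DATA result."""
    start: int = 0          # offset where the lookup was started
    end: int = 0            # offset the lookup returned (next transition)
    is_data: bool = False   # whether [start, end) was reported as data
    valid: bool = False

    def store(self, start: int, end: int, is_data: bool) -> None:
        """Remember the outcome of a real SEEK_HOLE/SEEK_DATA call."""
        self.start, self.end = start, end
        self.is_data, self.valid = is_data, True

    def lookup(self, offset: int) -> Optional[Tuple[bool, int]]:
        """Return (is_data, bytes until the cached end) on a hit, else None."""
        if self.valid and self.start <= offset < self.end:
            return self.is_data, self.end - offset
        return None

    def invalidate_on_write(self, offset: int, length: int) -> None:
        """Assumption: any write overlapping the cached range drops it."""
        if self.valid and offset < self.end and offset + length > self.start:
            self.valid = False
```

A block_status request would first call lookup() and only fall back to the real seek on None. The drawback mentioned above is visible here: a single write anywhere in the cached range throws the whole entry away.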

Though OTOH, as I also admitted on IRC, perhaps we should just try and see what happens.

As an afterthought, it might be cool to have file-posix use bitmaps to cache this status. In the simplest case, we could have one bitmap that tells whether the block status is known (0 = known, 1 = unknown); this bitmap is active, so that writes would automatically invalidate the affected blocks. And then we have another bitmap that, for the blocks of known status, tells us whether they contain data or only zeroes. This solution wouldn’t suffer from a complete cache invalidation on every write.

(Fine-tuning it, we could instead have both bitmaps be inactive, so that file-posix itself needs to update them on writes, so that all writes would give their respective blocks a known status, with data writes making them contain data, and zero writes making them contain zeroes.)
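A toy model of the two-bitmap idea, again in Python and purely illustrative: plain lists stand in for real dirty bitmaps, and the names BlockStatusCache, record, on_write and track_writes are all invented here. It assumes block-aligned writes. With track_writes=False, writes only mark blocks as unknown (the "active bitmap" variant); with track_writes=True, writes set a known status themselves (the fine-tuned variant).

```python
from typing import List, Optional


class BlockStatusCache:
    """Hypothetical per-block cache of data/zero status using two bitmaps."""

    def __init__(self, num_blocks: int, track_writes: bool = False) -> None:
        # First bitmap: False (0) = status known, True (1) = status unknown.
        self.unknown: List[bool] = [True] * num_blocks
        # Second bitmap: only meaningful where the status is known.
        self.is_data: List[bool] = [False] * num_blocks
        self.track_writes = track_writes

    def query(self, block: int) -> Optional[bool]:
        """Return True for data, False for zeroes, None if not cached."""
        if self.unknown[block]:
            return None
        return self.is_data[block]

    def record(self, first: int, count: int, is_data: bool) -> None:
        """Store the result of a real SEEK_HOLE/SEEK_DATA lookup."""
        for b in range(first, first + count):
            self.unknown[b] = False
            self.is_data[b] = is_data

    def on_write(self, first: int, count: int, zero_write: bool = False) -> None:
        """Update only the written blocks; all other blocks stay cached."""
        for b in range(first, first + count):
            if self.track_writes:
                # Fine-tuned variant: the write itself defines the status.
                self.unknown[b] = False
                self.is_data[b] = not zero_write
            else:
                # Simple variant: the write just invalidates the block.
                self.unknown[b] = True
```

Either way, a write touches only the bits for the blocks it covers, which is the point of the bitmap approach compared to a single cached range.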

(Perhaps we could consider offering this as a GSoC project?)