Comment 63 for bug 1023755

Revision history for this message
Stefan Bader (smb) wrote :

Thiago, this plays a bit into the issue, but mainly this is a result of changes in the writeback code, which now does some complex throttling based on memory limits and estimated drive speeds. That has problems with the setup used here, because we now have two backing devices that are not independent and that require further memory to make progress.
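To make the throttling idea concrete, here is a toy model of it. This is purely illustrative and not the kernel's actual balance_dirty_pages() logic; the function name, the page-based units, and the numbers are all assumptions for the sketch:

```python
# Toy model of bandwidth-based writeback throttling (illustrative only,
# NOT the kernel's real algorithm; names and numbers are made up).

def throttle_pause(dirty_pages, dirty_limit, est_bandwidth_pps):
    """Seconds a writer should sleep before dirtying more pages.

    dirty_pages:        pages currently dirty against this backing dev
    dirty_limit:        allowed dirty pages (derived from memory limits)
    est_bandwidth_pps:  estimated device write bandwidth, pages/second
    """
    if dirty_pages <= dirty_limit:
        return 0.0                      # under the limit: no throttling
    if est_bandwidth_pps <= 0:
        return float("inf")             # no bandwidth estimate: stall
    excess = dirty_pages - dirty_limit
    return excess / est_bandwidth_pps   # wait for writeback to catch up

# A slow (or never-measured) device makes the pause blow up:
print(throttle_pause(1200, 1000, 100))  # 2.0 seconds
print(throttle_pause(1200, 1000, 0))    # inf
```

The point of the model: once the bandwidth estimate for one of the stacked devices is (or stays) zero, the writer is throttled indefinitely.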
When dd writes onto the snapshot, every chunk-sized run of blocks creates an exception. That is, for the first block touching a new chunk, the whole chunk gets copied to the snapshot COW storage and the mapping table (on the same device) needs to be updated. Then, when the requests are processed by the loop thread, further related IO is generated (access-time updates, journal writes).
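The write amplification from those exceptions can be sketched like this. The chunk size and the bookkeeping are simplified assumptions, not the real dm-snapshot code:

```python
# Toy model of dm-snapshot copy-on-write exceptions (illustrative only;
# CHUNK_BLOCKS is an assumed chunk size, not dm's default).

CHUNK_BLOCKS = 8  # blocks per snapshot chunk (assumption)

def cow_io_generated(block_writes):
    """Count extra IO a stream of block writes triggers on a snapshot.

    Returns (chunk_copies, table_updates): the first write into each
    not-yet-copied chunk copies the whole chunk to the COW store and
    updates the on-disk exception table on the same device.
    """
    copied = set()
    copies = 0
    for block in block_writes:
        chunk = block // CHUNK_BLOCKS
        if chunk not in copied:
            copied.add(chunk)
            copies += 1        # chunk copied to COW storage
    return copies, copies      # one table update per new exception

# A sequential dd over 32 blocks touches 4 chunks -> 4 copies, 4 updates;
# rewriting an already-copied chunk costs nothing extra:
print(cow_io_generated(range(32)))  # (4, 4)
print(cow_io_generated([0, 0, 1]))  # (1, 1)
```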
This can be observed even with the deadline elevator and more recent kernel versions. 3.8 seemed much better but still showed it, though that could just be different timing while filling buffers, so maybe the dd process normally gets blocked sooner. It's hard to pin down specific changes there.
Right now I am trying to get to a stage that would make Precise at least usable. There is one change that seems to at least keep the whole system from hanging, but the dd process still does not complete: the loop0 thread responsible for moving the requests from the LV side to the fs is blocked because the backing dev is assumed to have no bandwidth, and it has no bandwidth because it never gets something to do...
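That circular dependency can be reduced to a few lines. Again a toy model under stated assumptions, not the kernel code: work is dispatched only when the bandwidth estimate is non-zero, while the estimate only grows from completed work.

```python
# Toy model of the loop0 deadlock described above (illustrative only):
# writeback dispatches IO to the backing device only if its bandwidth
# estimate is non-zero, but the estimate only rises when IO completes.

def run_writeback(rounds, initial_estimate=0):
    estimate = initial_estimate  # pages/sec the device is believed to do
    completed = 0
    for _ in range(rounds):
        if estimate > 0:
            completed += estimate  # issue work sized by the estimate
            estimate = completed   # completions raise the estimate
        # estimate == 0: nothing dispatched, so nothing ever completes
    return completed

print(run_writeback(10))                      # 0 -- stuck forever
print(run_writeback(10, initial_estimate=1))  # nonzero once kick-started
```

With a zero starting estimate the loop never makes progress, which matches the "no bandwidth because it never gets something to do" behaviour; any non-zero seed breaks the cycle.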