Comment 54 for bug 1277589

Barry Warsaw (barry) wrote : Re: [Bug 1277589] Re: Better protection against concurrent access

On Mar 03, 2014, at 08:44 AM, Alan Pope ㋛ wrote:

>I have triggered it again on 216 upgrading to 217.
>
>To be clear, I'm *not* upgrading, but starting the upgrade then pressing
>back, then going back in later, and then the error appears. Here's a
>video:-
>
>https://www.youtube.com/watch?v=YmD6cGYvIAI

One difference from the previous incarnation is that you're doing manual
downloads. Your video actually helped quite a bit as I can reproduce this
fairly consistently now by flashing to 215 and upgrading to whatever is the
latest image.

Log file analysis leads me to think there's a race condition between u-d-m
(ubuntu-download-manager) doing its atomic renames and its sending the
'finished' signal for the group download. If I add some logging right after
the finished signal is received and list the directories containing the
destination files, I see that the files have .tmp.tmp suffixes. The first .tmp
is put there by s-i (system-image, which still has its own atomic-rename
workaround), but the second .tmp is put there by u-d-m for *its* atomic rename
operation.

It should not be possible for s-i to see the .tmp.tmp files. This can only
happen if it's seeing the finished signal before u-d-m does its atomic
rename.
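
To make the suspected ordering concrete, here is a minimal Python sketch. It
is not u-d-m's or s-i's actual code: download(), observe(), the signal_early
flag and the file name are all hypothetical stand-ins. It only shows that a
listener which runs before the downloader's rename sees the doubled suffix,
while one that runs after it does not.

```python
import os
import tempfile


def download(dest, payload, on_finished, signal_early=False):
    """Hypothetical downloader: write to dest + '.tmp', then rename into place."""
    partial = dest + '.tmp'
    with open(partial, 'wb') as fp:
        fp.write(payload)
    if signal_early:
        # Suspected buggy ordering: the listener runs before the rename.
        on_finished()
        os.rename(partial, dest)
    else:
        # Expected ordering: rename first, then announce completion.
        os.rename(partial, dest)
        on_finished()


def observe(signal_early):
    """Return the directory listing the client sees when 'finished' fires."""
    seen = []
    with tempfile.TemporaryDirectory() as tmpdir:
        # The client asks for a destination that already carries its own
        # .tmp suffix (its atomic-rename workaround).
        dest = os.path.join(tmpdir, 'image.tar.xz.tmp')
        download(dest, b'data',
                 lambda: seen.extend(sorted(os.listdir(tmpdir))),
                 signal_early)
    return seen
```

With the expected ordering the listener sees only image.tar.xz.tmp; with the
early signal it sees image.tar.xz.tmp.tmp, which matches what the logs show.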

I tried the following experiment: at the point where the finished signal is
received, I log the data as explained above, and then I sleep(2) before
continuing on. With the sleep in there, it doesn't crash for me.
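
The experiment looks roughly like the sketch below. This is an illustration
under assumptions, not the code from s-i: on_finished(),
list_destination_dirs(), the logger name and the delay parameter are made up
for the example.

```python
import logging
import os
import time

log = logging.getLogger('finished-debug')  # hypothetical logger name


def list_destination_dirs(destinations):
    """Map each directory holding a destination file to its sorted contents."""
    directories = sorted({os.path.dirname(os.path.abspath(path))
                          for path in destinations})
    return {directory: sorted(os.listdir(directory))
            for directory in directories}


def on_finished(destinations, delay=2.0):
    """Debugging handler: log what is on disk, wait, then report what exists."""
    for directory, names in list_destination_dirs(destinations).items():
        log.info('%s: %s', directory, names)
    # Diagnostic only: the pause gives a pending rename time to complete.
    # It narrows the window but does not close the race.
    time.sleep(delay)
    return [path for path in destinations if os.path.exists(path)]
```

The sleep is only useful for confirming the hypothesis. A real fix has to
guarantee that the rename happens before the signal is emitted.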

This isn't definitive proof of what's going on, but it seems plausible.
Manuel is investigating u-d-m, and then he'll provide a package for me to test
on my device.