Inconsistent Verify: Invalid data - SHA1 hash mismatch : ErrorReturnCode=21
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
Duplicity | Confirmed | Medium | Unassigned |
Bug Description
Short story:
When I run duplicity in verify mode on a given set, it sometimes passes and sometimes fails, reporting:
Invalid data - SHA1 hash mismatch for file:
SOME_VOLUME.
Calculated hash: SOME_HASH
Manifest hash: ANOTHER_HASH
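What the error above amounts to is duplicity recomputing a SHA1 digest over a volume and comparing it against the hash recorded in the set's manifest. A minimal sketch of that comparison (the function names here are hypothetical, and a real duplicity manifest has its own format; this only illustrates the check itself):

```python
import hashlib

def sha1_of_file(path, chunk_size=1 << 20):
    """Stream the file in chunks and return its SHA1 hex digest."""
    h = hashlib.sha1()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()

def verify_volume(path, manifest_hash):
    """Raise if the on-disk digest differs from the recorded one."""
    calculated = sha1_of_file(path)
    if calculated != manifest_hash:
        raise ValueError(
            "Invalid data - SHA1 hash mismatch for file: %s\n"
            "Calculated hash: %s\nManifest hash: %s"
            % (path, calculated, manifest_hash)
        )
```

The point being: a mismatch only proves that the bytes read *at that moment* hashed differently from what the manifest says, not that the bytes on the platter are wrong.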
If I run it again, SOME_VOLUME gets through and ANOTHER_VOLUME reports the error. If I restart the server, then verify passes (sometimes).
Yes, the hard drive where the volumes are stored is healthy (as reported by SMART), and when diffed against its off-site mirror it reports no differences.
The details:
I have several backup sets {Projects, Documents, Code, etc...} on Ubuntu 10.04 which I back up with duplicity 0.6.17 to an internal hard drive--the primary backup target.
Once a month I run duplicity in verify mode to make sure that my primary backup is still a backup. This month two sets failed. This was not a problem, as I have an offsite mirror that was synced 15 days ago. I ran `diff primary_backup mirror_backup` and the supposedly corrupt volumes were identical; as expected, the only differences were the backup volumes generated during the last 15 days. Weird, right? So I ran duplicity verify on the primary target again, and another volume, deeper in the set, reported a hash mismatch. Weirder, right? Then I ran duplicity verify on the mirror: it passed.
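For what it's worth, the comparison `diff -r` performs can also be done at the digest level, which has the side benefit of forcing a full read of every volume on both sides. A rough sketch of such a tree comparison (nothing duplicity-specific; `primary` and `mirror` are just two directory roots):

```python
import hashlib
import os

def tree_digests(root):
    """Map each file's path (relative to root) to its SHA1 hex digest."""
    digests = {}
    for dirpath, _, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            rel = os.path.relpath(full, root)
            h = hashlib.sha1()
            with open(full, "rb") as f:
                for chunk in iter(lambda: f.read(1 << 20), b""):
                    h.update(chunk)
            digests[rel] = h.hexdigest()
    return digests

def compare_backups(primary, mirror):
    """Return relative paths whose contents differ or exist on one side only."""
    a, b = tree_digests(primary), tree_digests(mirror)
    return sorted(p for p in set(a) | set(b) if a.get(p) != b.get(p))
```

If this reports the trees identical while duplicity verify keeps flagging different volumes, the data on disk is consistent and the mismatch is happening somewhere between the disk and the hash computation.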
Somehow my primary backup got messed up, right? I clobbered it, replaced it with its older mirror, and ran duplicity verify to make sure all was good. It failed; the primary_backup failed. I ran another `diff primary_backup mirror_backup` and they were still identical. OK, so the drive that stores the primary_backup is failing--but wait: a SMART long test says it is perfectly healthy, with no bad blocks or read/write errors in its entire history (I run these once a week). I deleted the duplicity cache and temp files and reran duplicity verify on the primary target, and it failed: a new volume with a bad hash. Then I went brute force and ran duplicity verify 10 times, and each time a different volume showed a bad hash. Sometimes volume 3, sometimes volume 1000. What?
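A different volume failing on every run suggests the reads themselves are non-deterministic. One cheap way to test that hypothesis, independent of duplicity, is to hash the same file repeatedly and see whether the digest is even stable; a sketch:

```python
import hashlib

def digest_is_stable(path, runs=10):
    """Hash the file `runs` times; return (stable, digests_seen).

    On healthy hardware every pass yields the same SHA1. If more than
    one digest shows up, the read path (RAM, controller, cables) is
    corrupting data in flight, rather than the bytes on disk being bad.
    """
    seen = set()
    for _ in range(runs):
        h = hashlib.sha1()
        with open(path, "rb") as f:
            for chunk in iter(lambda: f.read(1 << 20), b""):
                h.update(chunk)
        seen.add(h.hexdigest())
    return len(seen) == 1, seen
```

Run this against one of the "corrupt" volumes: if it ever returns more than one digest for the same unchanged file, the machine, not the backup, is the suspect.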
I then rebooted the server--a few days had gone by and I had spent about 5 hrs looking at this--and ran duplicity verify on the primary_backup, and it passed. What? This is the exact data that had failed 10 times? Seriously? Never mind, this is what I expected and how things should be. Just to make sure all was good, I reran duplicity verify on all my sets and waited. What? Now two new sets report errors, but the ones that originally reported errors are fine.
Help! Can we trust verify? Why does it fail sometimes and pass other times? Can it be the hard drive even if diff and SMART pass? Is there a bug in duplicity or gpg?
no longer affects: deja-dup
So this is the same issue as old bug 487720. I've seen several complaints about this issue rearing its head again since that fix landed. It seems we didn't fully fix it back then.
See my comment https://bugs.launchpad.net/duplicity/+bug/487720/comments/54 for some things you can do to recover from this.