LVM snapshot causes kernel memory corruption
Affects | Status | Importance | Assigned to | Milestone | |
---|---|---|---|---|---|
linux-source-2.6.22 (Ubuntu) | Invalid | Undecided | Unassigned | | |
Bug Description
Binary package hint: linux-image-
I have seen strong evidence of kernel memory corruption some of the time when creating LVM snapshots. To reproduce this, I do:
rm -rf /tmp/test
mkdir /tmp/test
<put about 60MB of files into /tmp/test>
find /tmp/test -type f | xargs md5sum > /tmp/sum.pre
lvcreate --size 2G --snapshot /dev/dink/
find /tmp/test -type f | xargs md5sum > /tmp/sum.post
lvremove -f /dev/dink/
diff -u /tmp/sum.pre /tmp/sum.post
where /dev/dink/
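The checksum step in the recipe above can also be done without `find | xargs md5sum`, which splits file names containing whitespace. The following is only a minimal sketch in Python; the helper names `md5_tree` and `changed_files` are made up for illustration and are not part of the original report.

```python
import hashlib
import os


def md5_tree(root):
    """Return {relative_path: md5_hex} for every regular file under root."""
    sums = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            # Match `find -type f`: skip symlinks and anything that is not a regular file.
            if os.path.islink(path) or not os.path.isfile(path):
                continue
            digest = hashlib.md5()
            with open(path, "rb") as handle:
                # Hash in 1 MiB chunks so large files are not read into memory at once.
                for chunk in iter(lambda: handle.read(1 << 20), b""):
                    digest.update(chunk)
            sums[os.path.relpath(path, root)] = digest.hexdigest()
    return sums


def changed_files(pre, post):
    """Return the sorted paths whose checksum differs (or is missing) in post."""
    return sorted(path for path in pre if post.get(path) != pre[path])
```

One would call `md5_tree("/tmp/test")` before creating the snapshot and again afterwards, then pass both results to `changed_files`; an empty list means no file changed.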
The corruption is always a changed value of a single byte, always at offset 156 within a 1K block (different block each time). The incorrect value of the byte is always one less than the correct value. For example:
@@ -471431,7 +471431,7 @@
0731860: 4d46 6ae3 0252 6864 e634 15eb 7ac1 f0ee MFj..Rhd.4..z...
0731870: 9f2b 8d82 33e3 138b 31a2 8da5 4594 5648 .+..3...1...E.VH
0731880: 74fd 00e0 bc48 fe09 d557 f501 70a8 7dfd t....H...W..p.}.
-0731890: ea8f 5010 b963 e2ec 7b84 8ef7 e851 fdfa ..P..c..{....Q..
+0731890: ea8f 5010 b963 e2ec 7b84 8ef7 e751 fdfa ..P..c..{....Q..
07318a0: 6031 670b cd54 fe01 20d6 f3fb c662 dfc3 `1g..T.. ....b..
07318b0: 7605 acd2 1be6 3fee 54ff e15b bc60 77fa v.....?.T..[.`w.
07318c0: 368e 99f9 60a0 a1a2 fbdf ef0d 4bca a201 6...`.......K...
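The pattern in the hex dump above can be checked mechanically. In the changed row at 0x0731890 the differing byte is the 13th one (e8 became e7), so its file offset is 0x073189c, and 0x073189c modulo 1024 is 156. The sketch below is a hypothetical helper (the name `byte_diffs` is not from the original report) that lists every differing byte between a known-good copy and a corrupted copy, together with its offset inside a 1K block.

```python
def byte_diffs(before, after, block_size=1024):
    """Return (offset, offset_in_block, old, new) for each differing byte."""
    if len(before) != len(after):
        raise ValueError("inputs must have the same length")
    diffs = []
    for offset, (old, new) in enumerate(zip(before, after)):
        if old != new:
            diffs.append((offset, offset % block_size, old, new))
    return diffs


def matches_reported_pattern(diffs):
    """True if there is one changed byte, at block offset 156, decremented by one."""
    if len(diffs) != 1:
        return False
    _offset, in_block, old, new = diffs[0]
    return in_block == 156 and new == old - 1
```

Both functions take `bytes` objects, for example the contents of a corrupted file and of a pristine copy read with `open(path, "rb").read()`.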
/tmp is located in a different volume group than the volume I am snapshotting.
Some version information:
root@linux-
Linux linux-build-10 2.6.22-14-server #1 SMP Thu Jan 31 23:57:25 UTC 2008 x86_64 GNU/Linux
root@linux-
No LSB modules are available.
Distributor ID: Ubuntu
Description: Ubuntu 7.10
Release: 7.10
Codename: gutsy
root@linux-
Version: 2.02.26-1ubuntu4
root@linux-
PV /dev/sdb VG dink lvm2 [136.73 GB / 110.73 GB free]
PV /dev/sda5 VG LINUX-BUILD-
Total: 2 [204.85 GB] / in use: 2 [204.85 GB] / in no VG: 0 [0 ]
root@linux-
Reading all physical volumes. This may take a while...
Found volume group "dink" using metadata type lvm2
Found volume group "LINUX-
I ran into exactly this problem several months ago on Gutsy amd64, but had blamed it on hardware until ghudson reproduced it by doing the same operations. We are both working on the same project, but using completely different hardware. We’re now convinced that there is a kernel bug here.
I have since upgraded to Hardy amd64, and was not immediately able to reproduce the problem there. On a different Gutsy i386 machine, I eventually got lvcreate to segfault and lvremove to hang forever, but that’s less interesting. I’ll keep working on this.
The problem was also reported to linux-lvm: <http://www.redhat.com/archives/linux-lvm/2008-February/msg00074.html>.