Hi everyone,
Ceph: Jewel 10.2.6
We use Ceph as our OpenStack storage backend, together with Ceph snapshots. Recently we hit a problem with the reported snapshot usage:
[root@flexhcs_osd_2 /]# rbd du volumes/72786dd9-ee5d-4dab-b2c6-63bb58ee2d54
NAME PROVISIONED USED
72786dd9-ee5d-4dab-b2c6-63bb58ee2d54@snapshot-ef4c937f-c587-4039-9c66-16819c5351fa 2000G 14380M
72786dd9-ee5d-4dab-b2c6-63bb58ee2d54@snapshot-b0c7d4af-6fe7-48f2-8f94-49e445ba5bd9 2000G 176M
72786dd9-ee5d-4dab-b2c6-63bb58ee2d54@snapshot-804f9d90-f642-4e20-b261-0ee8b2b3392b 2000G 284M
72786dd9-ee5d-4dab-b2c6-63bb58ee2d54@snapshot-1dc897cb-be4b-4fa9-8ed5-8ec2561b5b7f 2000G 53248k
72786dd9-ee5d-4dab-b2c6-63bb58ee2d54@snapshot-58191be1-ef94-4860-b27f-5ba5513ba7b0 2000G 12288k
72786dd9-ee5d-4dab-b2c6-63bb58ee2d54@snapshot-41030624-5268-4249-9ca1-41a84bdacd3e 2000G 12288k
72786dd9-ee5d-4dab-b2c6-63bb58ee2d54@snapshot-0458fe2c-6e61-4874-81e1-afb28f80ebbb 2000G 12288k
72786dd9-ee5d-4dab-b2c6-63bb58ee2d54@snapshot-7e4693a4-646b-4ee0-926b-a1a97d417376 2000G 0
72786dd9-ee5d-4dab-b2c6-63bb58ee2d54@snapshot-7cecace0-2f93-48ae-902a-1eb9003c61d6 2000G 0
72786dd9-ee5d-4dab-b2c6-63bb58ee2d54@snapshot-cccec3f9-944c-4da6-97fc-07fcc6ab1eb4 2000G 0
72786dd9-ee5d-4dab-b2c6-63bb58ee2d54@snapshot-d87e3738-9d14-41ee-8b3e-4d1ad7485bbe 2000G 0
72786dd9-ee5d-4dab-b2c6-63bb58ee2d54@snapshot-5f9d5cc4-e8de-4722-a7bc-08a8b9288d65 2000G 0
72786dd9-ee5d-4dab-b2c6-63bb58ee2d54@snapshot-faae70ee-9712-4908-92f9-289ef742f181 2000G 0
72786dd9-ee5d-4dab-b2c6-63bb58ee2d54@snapshot-39e69a9a-3d49-4369-87c5-80ffa7a2515d 2000G 9668M
72786dd9-ee5d-4dab-b2c6-63bb58ee2d54@snapshot-06b27d8c-896a-47fb-87ba-f05a89002c73 2000G 0
72786dd9-ee5d-4dab-b2c6-63bb58ee2d54@snapshot-48d2c155-8f2a-467d-b531-2f1b6c986e0f 2000G 0
72786dd9-ee5d-4dab-b2c6-63bb58ee2d54@snapshot-zero-20180227 2000G 0
72786dd9-ee5d-4dab-b2c6-63bb58ee2d54@snapshot-287fdd44-06d8-4f10-af7f-2ae3d38a0aaa 2000G 244M
72786dd9-ee5d-4dab-b2c6-63bb58ee2d54@snapshot-a3194cc3-ccdb-4740-b627-f22741379172 2000G 0
72786dd9-ee5d-4dab-b2c6-63bb58ee2d54@snapshot-e239e7c0-2c07-435a-957b-ba0e352bc1cb 2000G 0
72786dd9-ee5d-4dab-b2c6-63bb58ee2d54@snapshot-0b9f3a67-7f77-4eb0-9b6c-3c53e3b8f0cc 2000G 0
72786dd9-ee5d-4dab-b2c6-63bb58ee2d54@snapshot-9ee3354b-6482-4670-9075-502881116658 2000G 0
72786dd9-ee5d-4dab-b2c6-63bb58ee2d54@snapshot-c568a0d4-33ca-44cf-ab86-5e1db076d8fc 2000G 0
72786dd9-ee5d-4dab-b2c6-63bb58ee2d54@snapshot-6e7fe680-fd95-4021-807a-b6dd4aa93638 2000G 0
72786dd9-ee5d-4dab-b2c6-63bb58ee2d54@snapshot-6fe77060-3587-41ea-9972-dc88f05fec62 2000G 0
72786dd9-ee5d-4dab-b2c6-63bb58ee2d54@snapshot-282363cd-3652-4a23-bde8-5bebf5df16b7 2000G 0
72786dd9-ee5d-4dab-b2c6-63bb58ee2d54@snapshot-1ac22c13-5945-4d05-bef7-14ca93795739 2000G 0
72786dd9-ee5d-4dab-b2c6-63bb58ee2d54@snapshot-9da8bc53-deec-4687-b1b3-02ccc7b48946 2000G 0
72786dd9-ee5d-4dab-b2c6-63bb58ee2d54@snapshot-0dffe4ad-02a4-4344-876c-af24532f356c 2000G 0
72786dd9-ee5d-4dab-b2c6-63bb58ee2d54@snapshot-6a733d71-7d24-4b23-8142-7d3f6c00d9ee 2000G 0
72786dd9-ee5d-4dab-b2c6-63bb58ee2d54@snapshot-165a93b7-f662-42e9-8946-f44307cfca54 2000G 0
72786dd9-ee5d-4dab-b2c6-63bb58ee2d54@snapshot-a28bd0d2-ae8a-4787-879f-969b623f70e4 2000G 0
72786dd9-ee5d-4dab-b2c6-63bb58ee2d54@snapshot-e8b95dc3-69c5-4ea9-a7fb-5fb7c5608b55 2000G 0
72786dd9-ee5d-4dab-b2c6-63bb58ee2d54@snapshot-33a0e603-6c91-48bf-896e-79b87a45ecc1 2000G 0
72786dd9-ee5d-4dab-b2c6-63bb58ee2d54@snapshot-574e89de-f7d1-4d64-905d-2c239603bbf5 2000G 0
72786dd9-ee5d-4dab-b2c6-63bb58ee2d54@snapshot-ae91ac52-2383-42a5-b1fb-5483d46660b9 2000G 0
72786dd9-ee5d-4dab-b2c6-63bb58ee2d54@snapshot-426f6994-cba1-4ce9-b939-bb427efa5c3c 2000G 0
72786dd9-ee5d-4dab-b2c6-63bb58ee2d54 2000G 0
<TOTAL> 2000G 24840M
Here you can see that volumes/72786dd9-ee5d-4dab-b2c6-63bb58ee2d54 itself (the last row before <TOTAL>) reports no changes (USED 0), but in fact we already made changes inside the virtual machine and wrote data to the virtual disk.
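A note on where those numbers come from: with the fast-diff and object-map features enabled (as they are on this image, see the rbd info output below), rbd du reads its usage figures from the object-map metadata rather than scanning the objects. One hedged check, which we have not yet confirmed applies here, is whether that metadata has gone stale and needs to be rebuilt:
# look for 'object map invalid' / 'fast diff invalid' in the flags line
rbd info volumes/72786dd9-ee5d-4dab-b2c6-63bb58ee2d54 | grep flags
# rebuild the object map for the head image (and, if needed, for each
# snapshot via the pool/image@snap form); this rescans the whole 2000G
# image, so it is not cheap
rbd object-map rebuild volumes/72786dd9-ee5d-4dab-b2c6-63bb58ee2d54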
I used the following steps to investigate this problem:
[root@flexhcs_osd_2 /]# rbd info volumes/72786dd9-ee5d-4dab-b2c6-63bb58ee2d54
rbd image '72786dd9-ee5d-4dab-b2c6-63bb58ee2d54':
size 2000 GB in 512000 objects
order 22 (4096 kB objects)
block_name_prefix: rbd_data.e3bee4366f29eb
format: 2
features: layering, exclusive-lock, object-map, fast-diff, deep-flatten
flags:
parent: volumes/72786dd9-ee5d-4dab-b2c6-63bb58ee2d54_1514318054178@snapshot-7070e07e-4aba-42ec-b4e9-3b32ce3ac8d0
overlap: 2000 GB
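For reference when reading the object names below: the image's data objects are named <block_name_prefix>.<object number in hex>, and with order 22 each object covers 4 MiB of the image, so an object number maps to a byte offset like this (a quick shell sketch, using an object number that shows up later):
# byte offset of data object 0x46afd inside the image: number * 2^22
echo $(( 0x46afd * 4194304 ))   # roughly 1.1 TiB into the 2000G image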
1) Capture the image's object list before and after making changes on the disk inside the VM:
[root@flexhcs_osd_2 /]# rados -p volumes ls | grep rbd_data.e3bee4366f29eb > ./1521.txt
(write data to the virtual disk inside the VM)
[root@flexhcs_osd_2 /]# rados -p volumes ls | grep rbd_data.e3bee4366f29eb > ./1523.txt
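(Since rados ls does not guarantee a stable listing order, a slightly more defensive version of this capture sorts both lists before diffing; the file names here are just placeholders:)
rados -p volumes ls | grep rbd_data.e3bee4366f29eb | sort > ./before.txt
# ... write data inside the VM ...
rados -p volumes ls | grep rbd_data.e3bee4366f29eb | sort > ./after.txt
diff ./before.txt ./after.txt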
2) Find the difference:
[root@flexhcs_osd_2 /]# diff ./1521.txt ./1523.txt
1500a1501
> rbd_data.e3bee4366f29eb.0000000000025554
2303a2305
> rbd_data.e3bee4366f29eb.0000000000046afe
2581a2584
> rbd_data.e3bee4366f29eb.0000000000046afd
3) Find which OSD and PG the object's data is stored on:
[root@flexhcs_osd_2 /]# ceph osd map volumes rbd_data.e3bee4366f29eb.0000000000046afd
osdmap e25385 pool 'volumes' (2) object 'rbd_data.e3bee4366f29eb.0000000000046afd' -> pg 2.1f22456e (2.16e) -> up ([2,5], p2) acting ([2,5], p2)
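(Reading that output: 0x1f22456e is the object name's hash, and masking it by the pool's PG count yields the PG. Assuming pg_num is 512 for this pool, a power of two, the mapping is a plain modulo and matches the output:)
printf '2.%x\n' $(( 0x1f22456e % 512 ))   # -> 2.16e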
4) On OSD 2, go into the PG 2.16e directory and find the "e3bee4366f29eb.0000000000046afd" data:
[root@flexhcs_osd_2 2.16e_head]# pwd
/var/lib/ceph/osd/ceph-2/current/2.16e_head
[root@flexhcs_osd_2 2.16e_head]# find ./ -name '*e3bee4366f29eb.0000000000046afd*'
./DIR_E/DIR_6/DIR_5/DIR_4/rbd\udata.e3bee4366f29eb.0000000000046afd__head_1F22456E__2
./DIR_E/DIR_6/DIR_5/DIR_4/rbd\udata.e3bee4366f29eb.0000000000046afd__487e_1F22456E__2
As you can see, "rbd\udata.e3bee4366f29eb.0000000000046afd__head_1F22456E__2" is the current (head) object, while "rbd\udata.e3bee4366f29eb.0000000000046afd__487e_1F22456E__2" is the snapshot "COW" clone data, kept for snap id 0x487e.
But in fact, snapshot id 0x487e (18558 in decimal) has already been deleted!
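(To double-check the conversion, and to ask the cluster which snapshots it thinks still reference clones of this object, something like the following should work; rados listsnaps prints the clone list for a single object:)
printf '%d\n' 0x487e
# -> 18558
rados -p volumes listsnaps rbd_data.e3bee4366f29eb.0000000000046afd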
[root@flexhcs_osd_2 current]# rbd snap ls volumes/72786dd9-ee5d-4dab-b2c6-63bb58ee2d54
SNAPID NAME SIZE
11098 snapshot-ef4c937f-c587-4039-9c66-16819c5351fa 2000 GB
14259 snapshot-b0c7d4af-6fe7-48f2-8f94-49e445ba5bd9 2000 GB
18492 snapshot-804f9d90-f642-4e20-b261-0ee8b2b3392b 2000 GB
18503 snapshot-1dc897cb-be4b-4fa9-8ed5-8ec2561b5b7f 2000 GB
18512 snapshot-58191be1-ef94-4860-b27f-5ba5513ba7b0 2000 GB
18548 snapshot-41030624-5268-4249-9ca1-41a84bdacd3e 2000 GB
18664 snapshot-0458fe2c-6e61-4874-81e1-afb28f80ebbb 2000 GB
18673 snapshot-7e4693a4-646b-4ee0-926b-a1a97d417376 2000 GB
18682 snapshot-7cecace0-2f93-48ae-902a-1eb9003c61d6 2000 GB
18718 snapshot-cccec3f9-944c-4da6-97fc-07fcc6ab1eb4 2000 GB
18729 snapshot-d87e3738-9d14-41ee-8b3e-4d1ad7485bbe 2000 GB
18776 snapshot-5f9d5cc4-e8de-4722-a7bc-08a8b9288d65 2000 GB
18788 snapshot-faae70ee-9712-4908-92f9-289ef742f181 2000 GB
18794 snapshot-39e69a9a-3d49-4369-87c5-80ffa7a2515d 2000 GB
18817 snapshot-06b27d8c-896a-47fb-87ba-f05a89002c73 2000 GB
18826 snapshot-48d2c155-8f2a-467d-b531-2f1b6c986e0f 2000 GB
18832 snapshot-zero-20180227 2000 GB
18836 snapshot-287fdd44-06d8-4f10-af7f-2ae3d38a0aaa 2000 GB
18874 snapshot-a3194cc3-ccdb-4740-b627-f22741379172 2000 GB
18884 snapshot-e239e7c0-2c07-435a-957b-ba0e352bc1cb 2000 GB
18896 snapshot-0b9f3a67-7f77-4eb0-9b6c-3c53e3b8f0cc 2000 GB
18932 snapshot-9ee3354b-6482-4670-9075-502881116658 2000 GB
18944 snapshot-c568a0d4-33ca-44cf-ab86-5e1db076d8fc 2000 GB
18972 snapshot-6e7fe680-fd95-4021-807a-b6dd4aa93638 2000 GB
18983 snapshot-6fe77060-3587-41ea-9972-dc88f05fec62 2000 GB
18992 snapshot-282363cd-3652-4a23-bde8-5bebf5df16b7 2000 GB
19028 snapshot-1ac22c13-5945-4d05-bef7-14ca93795739 2000 GB
19040 snapshot-9da8bc53-deec-4687-b1b3-02ccc7b48946 2000 GB
19050 snapshot-0dffe4ad-02a4-4344-876c-af24532f356c 2000 GB
19090 snapshot-6a733d71-7d24-4b23-8142-7d3f6c00d9ee 2000 GB
19100 snapshot-165a93b7-f662-42e9-8946-f44307cfca54 2000 GB
19128 snapshot-a28bd0d2-ae8a-4787-879f-969b623f70e4 2000 GB
19139 snapshot-e8b95dc3-69c5-4ea9-a7fb-5fb7c5608b55 2000 GB
19145 snapshot-33a0e603-6c91-48bf-896e-79b87a45ecc1 2000 GB
19146 snapshot-574e89de-f7d1-4d64-905d-2c239603bbf5 2000 GB
19150 snapshot-ae91ac52-2383-42a5-b1fb-5483d46660b9 2000 GB
19187 snapshot-426f6994-cba1-4ce9-b939-bb427efa5c3c 2000 GB
As you can see, snapshot id 18558 is absent from the list; it has already been removed.
So we think something may be wrong with volumes/72786dd9-ee5d-4dab-b2c6-63bb58ee2d54: new writes should be cloned against the latest snapshot seq id, not against the removed 0x487e snapshot seq id.
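(One more hedged check: whether snap 18558 is recorded in the pool's removed_snaps set; if it is, the leftover __487e_ clone should eventually be reclaimed by snap trimming. In Jewel that set is visible in the osdmap dump:)
ceph osd dump | grep -A1 "pool 2 'volumes'"
# the indented removed_snaps [...] line follows the pool entry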
Could anybody give us some suggestions? Thanks a lot in advance!
Hi zhengxiang,
This might be a question better asked on the upstream mailing list for Ceph; it will get seen by a much wider pool of Ceph users who may have seen your issue before.
I'd also recommend that you upgrade to the latest Jewel point release in Ubuntu (10.2.10) to see if the issue still exists.
Marking 'Incomplete' for now; pending any ML discussion and testing with the latest point release, we can either close out this bug or triage it as appropriate.