snapshot size 0 and image size 0

Bug #1752511 reported by zhengxiang
This bug affects 1 person
Affects: ceph (Ubuntu)
Status: Expired
Importance: Undecided
Assigned to: Unassigned
Milestone: (none)

Bug Description

Hi everyone,

Ceph version: Jewel 10.2.6

We use Ceph as the OpenStack storage backend and rely on Ceph snapshots. Recently we ran into a problem with the reported snapshot usage.

[root@flexhcs_osd_2 /]# rbd du volumes/72786dd9-ee5d-4dab-b2c6-63bb58ee2d54
NAME PROVISIONED USED
72786dd9-ee5d-4dab-b2c6-63bb58ee2d54@snapshot-ef4c937f-c587-4039-9c66-16819c5351fa 2000G 14380M
72786dd9-ee5d-4dab-b2c6-63bb58ee2d54@snapshot-b0c7d4af-6fe7-48f2-8f94-49e445ba5bd9 2000G 176M
72786dd9-ee5d-4dab-b2c6-63bb58ee2d54@snapshot-804f9d90-f642-4e20-b261-0ee8b2b3392b 2000G 284M
72786dd9-ee5d-4dab-b2c6-63bb58ee2d54@snapshot-1dc897cb-be4b-4fa9-8ed5-8ec2561b5b7f 2000G 53248k
72786dd9-ee5d-4dab-b2c6-63bb58ee2d54@snapshot-58191be1-ef94-4860-b27f-5ba5513ba7b0 2000G 12288k
72786dd9-ee5d-4dab-b2c6-63bb58ee2d54@snapshot-41030624-5268-4249-9ca1-41a84bdacd3e 2000G 12288k
72786dd9-ee5d-4dab-b2c6-63bb58ee2d54@snapshot-0458fe2c-6e61-4874-81e1-afb28f80ebbb 2000G 12288k
72786dd9-ee5d-4dab-b2c6-63bb58ee2d54@snapshot-7e4693a4-646b-4ee0-926b-a1a97d417376 2000G 0
72786dd9-ee5d-4dab-b2c6-63bb58ee2d54@snapshot-7cecace0-2f93-48ae-902a-1eb9003c61d6 2000G 0
72786dd9-ee5d-4dab-b2c6-63bb58ee2d54@snapshot-cccec3f9-944c-4da6-97fc-07fcc6ab1eb4 2000G 0
72786dd9-ee5d-4dab-b2c6-63bb58ee2d54@snapshot-d87e3738-9d14-41ee-8b3e-4d1ad7485bbe 2000G 0
72786dd9-ee5d-4dab-b2c6-63bb58ee2d54@snapshot-5f9d5cc4-e8de-4722-a7bc-08a8b9288d65 2000G 0
72786dd9-ee5d-4dab-b2c6-63bb58ee2d54@snapshot-faae70ee-9712-4908-92f9-289ef742f181 2000G 0
72786dd9-ee5d-4dab-b2c6-63bb58ee2d54@snapshot-39e69a9a-3d49-4369-87c5-80ffa7a2515d 2000G 9668M
72786dd9-ee5d-4dab-b2c6-63bb58ee2d54@snapshot-06b27d8c-896a-47fb-87ba-f05a89002c73 2000G 0
72786dd9-ee5d-4dab-b2c6-63bb58ee2d54@snapshot-48d2c155-8f2a-467d-b531-2f1b6c986e0f 2000G 0
72786dd9-ee5d-4dab-b2c6-63bb58ee2d54@snapshot-zero-20180227 2000G 0
72786dd9-ee5d-4dab-b2c6-63bb58ee2d54@snapshot-287fdd44-06d8-4f10-af7f-2ae3d38a0aaa 2000G 244M
72786dd9-ee5d-4dab-b2c6-63bb58ee2d54@snapshot-a3194cc3-ccdb-4740-b627-f22741379172 2000G 0
72786dd9-ee5d-4dab-b2c6-63bb58ee2d54@snapshot-e239e7c0-2c07-435a-957b-ba0e352bc1cb 2000G 0
72786dd9-ee5d-4dab-b2c6-63bb58ee2d54@snapshot-0b9f3a67-7f77-4eb0-9b6c-3c53e3b8f0cc 2000G 0
72786dd9-ee5d-4dab-b2c6-63bb58ee2d54@snapshot-9ee3354b-6482-4670-9075-502881116658 2000G 0
72786dd9-ee5d-4dab-b2c6-63bb58ee2d54@snapshot-c568a0d4-33ca-44cf-ab86-5e1db076d8fc 2000G 0
72786dd9-ee5d-4dab-b2c6-63bb58ee2d54@snapshot-6e7fe680-fd95-4021-807a-b6dd4aa93638 2000G 0
72786dd9-ee5d-4dab-b2c6-63bb58ee2d54@snapshot-6fe77060-3587-41ea-9972-dc88f05fec62 2000G 0
72786dd9-ee5d-4dab-b2c6-63bb58ee2d54@snapshot-282363cd-3652-4a23-bde8-5bebf5df16b7 2000G 0
72786dd9-ee5d-4dab-b2c6-63bb58ee2d54@snapshot-1ac22c13-5945-4d05-bef7-14ca93795739 2000G 0
72786dd9-ee5d-4dab-b2c6-63bb58ee2d54@snapshot-9da8bc53-deec-4687-b1b3-02ccc7b48946 2000G 0
72786dd9-ee5d-4dab-b2c6-63bb58ee2d54@snapshot-0dffe4ad-02a4-4344-876c-af24532f356c 2000G 0
72786dd9-ee5d-4dab-b2c6-63bb58ee2d54@snapshot-6a733d71-7d24-4b23-8142-7d3f6c00d9ee 2000G 0
72786dd9-ee5d-4dab-b2c6-63bb58ee2d54@snapshot-165a93b7-f662-42e9-8946-f44307cfca54 2000G 0
72786dd9-ee5d-4dab-b2c6-63bb58ee2d54@snapshot-a28bd0d2-ae8a-4787-879f-969b623f70e4 2000G 0
72786dd9-ee5d-4dab-b2c6-63bb58ee2d54@snapshot-e8b95dc3-69c5-4ea9-a7fb-5fb7c5608b55 2000G 0
72786dd9-ee5d-4dab-b2c6-63bb58ee2d54@snapshot-33a0e603-6c91-48bf-896e-79b87a45ecc1 2000G 0
72786dd9-ee5d-4dab-b2c6-63bb58ee2d54@snapshot-574e89de-f7d1-4d64-905d-2c239603bbf5 2000G 0
72786dd9-ee5d-4dab-b2c6-63bb58ee2d54@snapshot-ae91ac52-2383-42a5-b1fb-5483d46660b9 2000G 0
72786dd9-ee5d-4dab-b2c6-63bb58ee2d54@snapshot-426f6994-cba1-4ce9-b939-bb427efa5c3c 2000G 0
72786dd9-ee5d-4dab-b2c6-63bb58ee2d54 2000G 0
<TOTAL> 2000G 24840M

As you can see, the image volumes/72786dd9-ee5d-4dab-b2c6-63bb58ee2d54 itself reports 0 used, as if nothing had changed. In fact, we have already made changes inside the virtual machine and written data to the virtual disk.
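
As a cross-check of the listing above, the non-zero USED values of the snapshots can be summed and compared with the <TOTAL> line. This is only a minimal sketch: it assumes the k/M/G suffixes printed by rbd du are binary units (KiB, MiB, GiB), and the helper name to_mib is made up for illustration.

```python
# Non-zero USED values of the snapshots, copied from the `rbd du` output above.
# Assumption: k, M and G are binary units (KiB, MiB, GiB).
SNAPSHOT_USED = ["14380M", "176M", "284M", "53248k", "12288k",
                 "12288k", "12288k", "9668M", "244M"]
HEAD_USED = "0"  # the image itself (last row before <TOTAL>)

UNIT_TO_MIB = {"k": 1 / 1024, "M": 1, "G": 1024}


def to_mib(value):
    """Convert a size string such as '53248k' to MiB (hypothetical helper)."""
    if value[-1] in UNIT_TO_MIB:
        return float(value[:-1]) * UNIT_TO_MIB[value[-1]]
    # Assumption: a bare number is a byte count; only '0' occurs here.
    return float(value) / (1024 * 1024)


snap_total = sum(to_mib(v) for v in SNAPSHOT_USED)
head = to_mib(HEAD_USED)
print(f"snapshots: {snap_total:.0f} MiB, head: {head:.0f} MiB")
# prints "snapshots: 24840 MiB, head: 0 MiB"
```

The snapshot rows alone add up to the 24840M shown in <TOTAL>, so none of the reported usage is attributed to the image head.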

I used the following steps to investigate the problem:

[root@flexhcs_osd_2 /]# rbd info volumes/72786dd9-ee5d-4dab-b2c6-63bb58ee2d54
rbd image '72786dd9-ee5d-4dab-b2c6-63bb58ee2d54':
 size 2000 GB in 512000 objects
 order 22 (4096 kB objects)
 block_name_prefix: rbd_data.e3bee4366f29eb
 format: 2
 features: layering, exclusive-lock, object-map, fast-diff, deep-flatten
 flags:
 parent: volumes/72786dd9-ee5d-4dab-b2c6-63bb58ee2d54_1514318054178@snapshot-7070e07e-4aba-42ec-b4e9-3b32ce3ac8d0
 overlap: 2000 GB

1) List the image's data objects, once before and once after writing to the disk:
[root@flexhcs_osd_2 /]# rados -p volumes ls | grep rbd_data.e3bee4366f29eb > ./1521.txt
[root@flexhcs_osd_2 /]# rados -p volumes ls | grep rbd_data.e3bee4366f29eb > ./1523.txt

2) Find the differences:
[root@flexhcs_osd_2 /]# diff ./1521.txt ./1523.txt
1500a1501
> rbd_data.e3bee4366f29eb.0000000000025554
2303a2305
> rbd_data.e3bee4366f29eb.0000000000046afe
2581a2584
> rbd_data.e3bee4366f29eb.0000000000046afd
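
The same comparison can be done on sets, which does not depend on the order in which rados ls lists objects. The sketch below also converts each new object name into the image offset it would cover. It assumes default striping, where the hexadecimal suffix is the object number and each object is 4 MiB (order 22 in the rbd info output). The function names and the single "before" entry are illustrative placeholders, not the real contents of 1521.txt.

```python
OBJECT_SIZE_MIB = 4  # order 22 => 4096 kB objects, per `rbd info`


def new_objects(before, after):
    """Return object names present in `after` but not in `before`, sorted."""
    return sorted(set(after) - set(before))


def image_offset_mib(object_name):
    """Image offset of an object in MiB, assuming default striping."""
    object_number = int(object_name.rsplit(".", 1)[1], 16)
    return object_number * OBJECT_SIZE_MIB


# Placeholder for 1521.txt: one hypothetical pre-existing object.
before = ["rbd_data.e3bee4366f29eb.0000000000000000"]
# Placeholder for 1523.txt: the same list plus the three objects from the diff.
after = before + [
    "rbd_data.e3bee4366f29eb.0000000000025554",
    "rbd_data.e3bee4366f29eb.0000000000046afe",
    "rbd_data.e3bee4366f29eb.0000000000046afd",
]

for name in new_objects(before, after):
    print(name, image_offset_mib(name), "MiB")
```

Under these assumptions the three new objects sit at offsets 611664 MiB, 1158132 MiB and 1158136 MiB, all inside the 2000 GB image.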

3) Find which OSDs and PG hold the object's data:
[root@flexhcs_osd_2 /]# ceph osd map volumes rbd_data.e3bee4366f29eb.0000000000046afd
osdmap e25385 pool 'volumes' (2) object 'rbd_data.e3bee4366f29eb.0000000000046afd' -> pg 2.1f22456e (2.16e) -> up ([2,5], p2) acting ([2,5], p2)

4) Go to OSD 2, PG 2.16e, and find the files for object "rbd_data.e3bee4366f29eb.0000000000046afd":
[root@flexhcs_osd_2 2.16e_head]# pwd
/var/lib/ceph/osd/ceph-2/current/2.16e_head
[root@flexhcs_osd_2 2.16e_head]# find ./ -name *e3bee4366f29eb.0000000000046afd*
./DIR_E/DIR_6/DIR_5/DIR_4/rbd\udata.e3bee4366f29eb.0000000000046afd__head_1F22456E__2
./DIR_E/DIR_6/DIR_5/DIR_4/rbd\udata.e3bee4366f29eb.0000000000046afd__487e_1F22456E__2

As you can see, "rbd\udata.e3bee4366f29eb.0000000000046afd__487e_1F22456E__2" is the snapshot copy-on-write (COW) clone of the object.
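
To make the fields of these on-disk names explicit, here is a minimal parsing sketch. The field layout (name, key, snap, hash, namespace, pool, separated by "_", with "\u" standing for an underscore inside the object name) is an assumption inferred from the two file names shown above, and parse_filestore_name is a hypothetical helper, not a Ceph API. It handles only the escape that appears here.

```python
def parse_filestore_name(fname):
    """Split an on-disk object file name into its apparent fields.

    Assumed layout: name_key_snap_hash_namespace_pool, where key and
    namespace are empty in the names shown in this report.
    """
    fields = fname.split("_")
    if len(fields) != 6:
        raise ValueError(f"unexpected file name layout: {fname!r}")
    name, key, snap, hash_hex, namespace, pool = fields
    return {
        "object": name.replace("\\u", "_"),  # undo the underscore escape
        "key": key,
        "snap_id": None if snap == "head" else int(snap, 16),
        "hash": hash_hex,
        "namespace": namespace,
        "pool": int(pool),
    }


head = parse_filestore_name(
    r"rbd\udata.e3bee4366f29eb.0000000000046afd__head_1F22456E__2")
clone = parse_filestore_name(
    r"rbd\udata.e3bee4366f29eb.0000000000046afd__487e_1F22456E__2")

print(head["object"], head["snap_id"])    # snap_id None means the head object
print(clone["object"], clone["snap_id"])  # clone snap ID in decimal
```

With this reading, both files belong to the same object in pool 2, and the clone carries snap ID 0x487e (18558 decimal).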

But in fact, the snapshot with ID 0x487e has already been deleted:
0x487e = 18558

[root@flexhcs_osd_2 current]# rbd snap ls volumes/72786dd9-ee5d-4dab-b2c6-63bb58ee2d54
SNAPID NAME SIZE
 11098 snapshot-ef4c937f-c587-4039-9c66-16819c5351fa 2000 GB
 14259 snapshot-b0c7d4af-6fe7-48f2-8f94-49e445ba5bd9 2000 GB
 18492 snapshot-804f9d90-f642-4e20-b261-0ee8b2b3392b 2000 GB
 18503 snapshot-1dc897cb-be4b-4fa9-8ed5-8ec2561b5b7f 2000 GB
 18512 snapshot-58191be1-ef94-4860-b27f-5ba5513ba7b0 2000 GB
 18548 snapshot-41030624-5268-4249-9ca1-41a84bdacd3e 2000 GB
 18664 snapshot-0458fe2c-6e61-4874-81e1-afb28f80ebbb 2000 GB
 18673 snapshot-7e4693a4-646b-4ee0-926b-a1a97d417376 2000 GB
 18682 snapshot-7cecace0-2f93-48ae-902a-1eb9003c61d6 2000 GB
 18718 snapshot-cccec3f9-944c-4da6-97fc-07fcc6ab1eb4 2000 GB
 18729 snapshot-d87e3738-9d14-41ee-8b3e-4d1ad7485bbe 2000 GB
 18776 snapshot-5f9d5cc4-e8de-4722-a7bc-08a8b9288d65 2000 GB
 18788 snapshot-faae70ee-9712-4908-92f9-289ef742f181 2000 GB
 18794 snapshot-39e69a9a-3d49-4369-87c5-80ffa7a2515d 2000 GB
 18817 snapshot-06b27d8c-896a-47fb-87ba-f05a89002c73 2000 GB
 18826 snapshot-48d2c155-8f2a-467d-b531-2f1b6c986e0f 2000 GB
 18832 snapshot-zero-20180227 2000 GB
 18836 snapshot-287fdd44-06d8-4f10-af7f-2ae3d38a0aaa 2000 GB
 18874 snapshot-a3194cc3-ccdb-4740-b627-f22741379172 2000 GB
 18884 snapshot-e239e7c0-2c07-435a-957b-ba0e352bc1cb 2000 GB
 18896 snapshot-0b9f3a67-7f77-4eb0-9b6c-3c53e3b8f0cc 2000 GB
 18932 snapshot-9ee3354b-6482-4670-9075-502881116658 2000 GB
 18944 snapshot-c568a0d4-33ca-44cf-ab86-5e1db076d8fc 2000 GB
 18972 snapshot-6e7fe680-fd95-4021-807a-b6dd4aa93638 2000 GB
 18983 snapshot-6fe77060-3587-41ea-9972-dc88f05fec62 2000 GB
 18992 snapshot-282363cd-3652-4a23-bde8-5bebf5df16b7 2000 GB
 19028 snapshot-1ac22c13-5945-4d05-bef7-14ca93795739 2000 GB
 19040 snapshot-9da8bc53-deec-4687-b1b3-02ccc7b48946 2000 GB
 19050 snapshot-0dffe4ad-02a4-4344-876c-af24532f356c 2000 GB
 19090 snapshot-6a733d71-7d24-4b23-8142-7d3f6c00d9ee 2000 GB
 19100 snapshot-165a93b7-f662-42e9-8946-f44307cfca54 2000 GB
 19128 snapshot-a28bd0d2-ae8a-4787-879f-969b623f70e4 2000 GB
 19139 snapshot-e8b95dc3-69c5-4ea9-a7fb-5fb7c5608b55 2000 GB
 19145 snapshot-33a0e603-6c91-48bf-896e-79b87a45ecc1 2000 GB
 19146 snapshot-574e89de-f7d1-4d64-905d-2c239603bbf5 2000 GB
 19150 snapshot-ae91ac52-2383-42a5-b1fb-5483d46660b9 2000 GB
 19187 snapshot-426f6994-cba1-4ce9-b939-bb427efa5c3c 2000 GB

As you can see, snapshot ID 18558 is no longer in the list; it has been removed.
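
The check can be sketched in a few lines: convert the hexadecimal snap ID from the file name to decimal, test whether it is in the SNAPID column, and report the nearest remaining snapshots on either side. The list is copied from the rbd snap ls output above; the variable names are illustrative.

```python
import bisect

# SNAPID column copied from the `rbd snap ls` output above (already sorted).
SNAP_IDS = [
    11098, 14259, 18492, 18503, 18512, 18548, 18664, 18673, 18682, 18718,
    18729, 18776, 18788, 18794, 18817, 18826, 18832, 18836, 18874, 18884,
    18896, 18932, 18944, 18972, 18983, 18992, 19028, 19040, 19050, 19090,
    19100, 19128, 19139, 19145, 19146, 19150, 19187,
]

clone_snap_id = int("487e", 16)  # snap field of the clone's file name
exists = clone_snap_id in SNAP_IDS

# Nearest remaining snapshot IDs below and above the clone's ID.
pos = bisect.bisect_left(SNAP_IDS, clone_snap_id)
older = SNAP_IDS[pos - 1] if pos > 0 else None
newer = SNAP_IDS[pos] if pos < len(SNAP_IDS) else None

print(clone_snap_id, exists, older, newer)
# prints "18558 False 18548 18664"
```

So the clone's ID falls between the existing snapshots 18548 and 18664 and is well below the newest one, 19187.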

So we think something may be wrong with volumes/72786dd9-ee5d-4dab-b2c6-63bb58ee2d54: it should be using the latest snapshot seq ID, not the 0x487e snapshot seq ID.

Any suggestions would be appreciated. Thanks a lot in advance!

James Page (james-page) wrote :

Hi zhengxiang

This might be a question better asked on the upstream mailing list for Ceph; it will get seen by a much wider pool of Ceph users who may have seen your issue before.

I'd also recommend that you upgrade to the latest Jewel point release in Ubuntu (10.2.10) to see if the issue still exists.

Marking 'Incomplete' for now; pending any ML discussion and testing with the latest point release we can either close out this bug, or triage as appropriate.

Changed in ceph (Ubuntu):
status: New → Incomplete
Launchpad Janitor (janitor) wrote :

[Expired for ceph (Ubuntu) because there has been no activity for 60 days.]

Changed in ceph (Ubuntu):
status: Incomplete → Expired