Data written from Qemu disappears, NO_OBJECT_FOUND occured.

Bug #1456421 reported by Saeki Masaki on 2015-05-19
6
This bug affects 1 person
Affects Status Importance Assigned to Milestone
sheepdog
Critical
Unassigned

Bug Description

High load writing it went from Qemu.
Data object file has been deleted unintentionally.
Occurred only if vdi not pre-allocated.

If this is the problem, because data corruption occurs in the IO request from Qemu.

It shows excerpt log of when the problem occurred.

May 14 17:08:42 DEBUG [main] queue_request(486) CREATE_AND_WRITE_PEER, 1
May 14 17:08:42 DEBUG [io 22310] do_process_work(1923) a3, 6de4c40000001c, 1
May 14 17:08:42 DEBUG [main] queue_request(486) CREATE_AND_WRITE_PEER, 1
May 14 17:08:42 DEBUG [io 22311] do_process_work(1923) a3, 6de4c400000019, 1
May 14 17:08:42 DEBUG [main] queue_request(486) WRITE_OBJ, 1
May 14 17:08:42 DEBUG [gway 22972] do_process_work(1923) 3, 806de4c400000000, 1
May 14 17:08:42 DEBUG [main] queue_request(486) READ_OBJ, 1
May 14 17:08:42 DEBUG [gway 22986] do_process_work(1923) 2, 806de4c400000000, 1
May 14 17:08:42 DEBUG [main] queue_request(486) WRITE_OBJ, 1
May 14 17:08:42 DEBUG [gway 22987] do_process_work(1923) 3, 806de4c400000000, 1
May 14 17:08:42 DEBUG [main] queue_request(486) READ_OBJ, 1
May 14 17:08:42 DEBUG [gway 22978] do_process_work(1923) 2, 806de4c400000000, 1
May 14 17:08:42 DEBUG [main] queue_request(486) READ_OBJ, 1
May 14 17:08:42 DEBUG [gway 22977] do_process_work(1923) 2, 806de4c400000000, 1
May 14 17:08:42 DEBUG [main] queue_request(486) WRITE_OBJ, 1
May 14 17:08:42 DEBUG [gway 22970] do_process_work(1923) 3, 806de4c400000000, 1
May 14 17:08:42 DEBUG [main] gateway_op_done(99) a number of ongoing inode update request: 1
May 14 17:08:42 DEBUG [main] queue_request(486) READ_OBJ, 1
May 14 17:08:42 DEBUG [gway 22982] do_process_work(1923) 2, 806de4c400000000, 1
May 14 17:08:42 DEBUG [main] queue_request(486) WRITE_OBJ, 1
May 14 17:08:42 DEBUG [gway 22274] do_process_work(1923) 3, 6de4c400000020, 1
May 14 17:08:42 DEBUG [gway 22972] sd_dec_object_refcnt(495) 6de4c40000001e, 0, 0
May 14 17:08:42 DEBUG [main] queue_request(486) REMOVE_OBJ, 1
May 14 17:08:42 DEBUG [gway 22973] do_process_work(1923) 4, 6de4c40000001e, 1

look at the sheep.log, REMOVE_OBJ is sent after writing the object.

analysis of update data_vdi_id requests.
 new_vid - updated data_vdi_id value sent to the sheep from qemu.
 old_vid - previous data_vdi_id value read from inode object in sheep.

 idx | data_vdi_id req:1 | data_vdi_id req:2
 ----+-------------------+-------------------
     | old_vid | new_vid | old_vid | new_vid
 ... | | | |
 11 | 0 | 6de4c4 | |
 12 | 0 | 6de4c4 | 6de4c4 | 6de4c4
 13 | 0 | 0 | 0 | 0
 14 | 0 | 6de4c4 | 6de4c4 | 6de4c4
 15 | 0 | 6de4c4 | 6de4c4 | 6de4c4
 16 | 0 | 6de4c4 | 6de4c4 | 6de4c4
 17 | 0 | 0 | 0 | 0
 18 | 0 | 6de4c4 | 6de4c4 | 6de4c4
 19 | 0 | 6de4c4 | 6de4c4 | 6de4c4
 1a | 0 | 6de4c4 | 6de4c4 | 6de4c4
 1b | 0 | 0 | 0 | 0
 1c | 0 | 6de4c4 | 6de4c4 | 6de4c4
 1d | 0 | 6de4c4 | 6de4c4 | 6de4c4
 1e | 0 | 6de4c4 | 6de4c4 | 0 <= back to zero at req:2
 1f | | | 0 | 0
 20 | | | 0 | 6de4c4
 ...

In the case to be removed, it is further updated by 0 after updating data_vdi_id the idx in vid.
so the object is judged unnecessary in sd_dec_object_refcnt().

It has added a temporary debug log as 'ADD'.

May 14 17:08:42 DEBUG [gway 22984] do_process_work(1923) 1, 6de4c400000011, 1
May 14 17:08:42 DEBUG [gway 22970] do_process_work(1923) 1, 6de4c400000012, 1
May 14 17:08:42 DEBUG [io 22310] do_process_work(1923) a3, 6de4c400000011, 1
May 14 17:08:42 DEBUG [io 22311] do_process_work(1923) a3, 6de4c400000012, 1
May 14 17:08:42 DEBUG [gway 22969] do_process_work(1923) 1, 6de4c400000013, 1
May 14 17:08:42 DEBUG [gway 22987] do_process_work(1923) 1, 6de4c400000014, 1
May 14 17:08:42 DEBUG [gway 22971] do_process_work(1923) 1, 6de4c400000015, 1
May 14 17:08:42 DEBUG [gway 22274] do_process_work(1923) 1, 6de4c400000018, 1
May 14 17:08:42 DEBUG [io 22267] do_process_work(1923) a3, 6de4c400000018, 1
May 14 17:08:42 DEBUG [gway 22631] do_process_work(1923) 1, 6de4c400000019, 1
May 14 17:08:42 DEBUG [gway 22978] do_process_work(1923) 1, 6de4c40000001a, 1
May 14 17:08:42 DEBUG [gway 22977] do_process_work(1923) 1, 6de4c400000016, 1
May 14 17:08:42 DEBUG [gway 22981] do_process_work(1923) 1, 6de4c400000017, 1
May 14 17:08:42 DEBUG [gway 22284] do_process_work(1923) 1, 6de4c40000001b, 1
May 14 17:08:42 DEBUG [gway 22976] do_process_work(1923) 1, 6de4c40000001c, 1
May 14 17:08:42 DEBUG [io 22310] do_process_work(1923) a3, 6de4c40000001b, 1
May 14 17:08:42 DEBUG [gway 22975] do_process_work(1923) 3, 6de4c400000014, 1
May 14 17:08:42 DEBUG [gway 22982] do_process_work(1923) 1, 6de4c40000001d, 1
May 14 17:08:42 DEBUG [gway 22974] do_process_work(1923) 1, 6de4c40000001e, 1
May 14 17:08:42 DEBUG [gway 22980] do_process_work(1923) 1, 6de4c40000001f, 1
May 14 17:08:42 DEBUG [gway 22979] do_process_work(1923) 1, 6de4c400000020, 1
May 14 17:08:42 DEBUG [gway 22988] do_process_work(1923) 3, 6de4c40000001a, 1
May 14 17:08:42 DEBUG [gway 22985] do_process_work(1923) 3, 6de4c400000016, 1
May 14 17:08:42 DEBUG [gway 22973] do_process_work(1923) 3, 6de4c400000012, 1
May 14 17:08:42 DEBUG [io 22311] do_process_work(1923) a5, 6de4c400000012, 1
May 14 17:08:42 DEBUG [gway 22968] do_process_work(1923) 3, 6de4c40000001d, 1
May 14 17:08:42 DEBUG [gway 22983] do_process_work(1923) 3, 6de4c400000018, 1
May 14 17:08:42 DEBUG [io 22267] do_process_work(1923) a5, 6de4c400000018, 1
May 14 17:08:42 DEBUG [io 22310] do_process_work(1923) a3, 6de4c40000001c, 1
May 14 17:08:42 DEBUG [io 22311] do_process_work(1923) a3, 6de4c400000019, 1
May 14 17:08:42 DEBUG [gway 22972] do_process_work(1923) 3, 806de4c400000000, 1
May 14 17:08:42 DEBUG [gway 22986] do_process_work(1923) 2, 806de4c400000000, 1
May 14 17:08:42 DEBUG [gway 22987] do_process_work(1923) 3, 806de4c400000000, 1
May 14 17:08:42 DEBUG [gway 22978] do_process_work(1923) 2, 806de4c400000000, 1
May 14 17:08:42 DEBUG [gway 22977] do_process_work(1923) 2, 806de4c400000000, 1
May 14 17:08:42 DEBUG [gway 22987] update_obj_refcnt(595) ADD start idx:11
May 14 17:08:42 DEBUG [gway 22987] update_obj_refcnt(598) ADD i:0 old_vid:0 new_vid:6de4c4 refs.generation:0 refs.count:0
May 14 17:08:42 DEBUG [gway 22987] update_obj_refcnt(598) ADD i:1 old_vid:0 new_vid:6de4c4 refs.generation:0 refs.count:0
May 14 17:08:42 DEBUG [gway 22987] update_obj_refcnt(598) ADD i:2 old_vid:0 new_vid:0 refs.generation:0 refs.count:0
May 14 17:08:42 DEBUG [gway 22987] update_obj_refcnt(598) ADD i:3 old_vid:0 new_vid:6de4c4 refs.generation:0 refs.count:0
May 14 17:08:42 DEBUG [gway 22987] update_obj_refcnt(598) ADD i:4 old_vid:0 new_vid:6de4c4 refs.generation:0 refs.count:0
May 14 17:08:42 DEBUG [gway 22987] update_obj_refcnt(598) ADD i:5 old_vid:0 new_vid:6de4c4 refs.generation:0 refs.count:0
May 14 17:08:42 DEBUG [gway 22987] update_obj_refcnt(598) ADD i:6 old_vid:0 new_vid:0 refs.generation:0 refs.count:0
May 14 17:08:42 DEBUG [gway 22987] update_obj_refcnt(598) ADD i:7 old_vid:0 new_vid:6de4c4 refs.generation:0 refs.count:0
May 14 17:08:42 DEBUG [gway 22987] update_obj_refcnt(598) ADD i:8 old_vid:0 new_vid:6de4c4 refs.generation:0 refs.count:0
May 14 17:08:42 DEBUG [gway 22987] update_obj_refcnt(598) ADD i:9 old_vid:0 new_vid:6de4c4 refs.generation:0 refs.count:0
May 14 17:08:42 DEBUG [gway 22987] update_obj_refcnt(598) ADD i:10 old_vid:0 new_vid:0 refs.generation:0 refs.count:0
May 14 17:08:42 DEBUG [gway 22987] update_obj_refcnt(598) ADD i:11 old_vid:0 new_vid:6de4c4 refs.generation:0 refs.count:0
May 14 17:08:42 DEBUG [gway 22987] update_obj_refcnt(598) ADD i:12 old_vid:0 new_vid:6de4c4 refs.generation:0 refs.count:0
May 14 17:08:42 DEBUG [gway 22987] update_obj_refcnt(598) ADD i:13 old_vid:0 new_vid:6de4c4 refs.generation:0 refs.count:0
May 14 17:08:42 DEBUG [gway 22970] do_process_work(1923) 3, 806de4c400000000, 1
May 14 17:08:42 DEBUG [gway 22982] do_process_work(1923) 2, 806de4c400000000, 1
May 14 17:08:42 DEBUG [gway 22274] do_process_work(1923) 3, 6de4c400000020, 1
May 14 17:08:42 DEBUG [gway 22972] update_obj_refcnt(595) ADD start idx:12
May 14 17:08:42 DEBUG [gway 22972] update_obj_refcnt(598) ADD i:0 old_vid:6de4c4 new_vid:6de4c4 refs.generation:0 refs.count:0
May 14 17:08:42 DEBUG [gway 22972] update_obj_refcnt(598) ADD i:1 old_vid:0 new_vid:0 refs.generation:0 refs.count:0
May 14 17:08:42 DEBUG [gway 22972] update_obj_refcnt(598) ADD i:2 old_vid:6de4c4 new_vid:6de4c4 refs.generation:0 refs.count:0
May 14 17:08:42 DEBUG [gway 22972] update_obj_refcnt(598) ADD i:3 old_vid:6de4c4 new_vid:6de4c4 refs.generation:0 refs.count:0
May 14 17:08:42 DEBUG [gway 22972] update_obj_refcnt(598) ADD i:4 old_vid:6de4c4 new_vid:6de4c4 refs.generation:0 refs.count:0
May 14 17:08:42 DEBUG [gway 22972] update_obj_refcnt(598) ADD i:5 old_vid:0 new_vid:0 refs.generation:0 refs.count:0
May 14 17:08:42 DEBUG [gway 22972] update_obj_refcnt(598) ADD i:6 old_vid:6de4c4 new_vid:6de4c4 refs.generation:0 refs.count:0
May 14 17:08:42 DEBUG [gway 22972] update_obj_refcnt(598) ADD i:7 old_vid:6de4c4 new_vid:6de4c4 refs.generation:0 refs.count:0
May 14 17:08:42 DEBUG [gway 22972] update_obj_refcnt(598) ADD i:8 old_vid:6de4c4 new_vid:6de4c4 refs.generation:0 refs.count:0
May 14 17:08:42 DEBUG [gway 22972] update_obj_refcnt(598) ADD i:9 old_vid:0 new_vid:0 refs.generation:0 refs.count:0
May 14 17:08:42 DEBUG [gway 22972] update_obj_refcnt(598) ADD i:10 old_vid:6de4c4 new_vid:6de4c4 refs.generation:0 refs.count:0
May 14 17:08:42 DEBUG [gway 22972] update_obj_refcnt(598) ADD i:11 old_vid:6de4c4 new_vid:6de4c4 refs.generation:0 refs.count:0
May 14 17:08:42 DEBUG [gway 22972] update_obj_refcnt(598) ADD i:12 old_vid:6de4c4 new_vid:0 refs.generation:0 refs.count:0
May 14 17:08:42 DEBUG [gway 22973] do_process_work(1923) 4, 6de4c40000001e, 1
May 14 17:08:42 DEBUG [gway 22972] update_obj_refcnt(598) ADD i:13 vids:0 new_vids:0 refs.generation:0 refs.count:0
May 14 17:08:42 DEBUG [gway 22972] update_obj_refcnt(598) ADD i:14 vids:0 new_vids:6de4c4 refs.generation:0 refs.count:0

Changed in sheepdog-project:
status: New → In Progress
importance: Undecided → Critical
Vasiliy Tolstov (vase) wrote :

I think we can update this bug, because patch sended to qemu list.

To post a comment you must log in.
This report contains Public information  Edit
Everyone can see this information.

Other bug subscribers