Crash when removing a snapshot while alive nodes < number of copies

Bug #1389125 reported by sirio81
Affects: sheepdog
Status: Fix Committed
Importance: Undecided
Assigned to: Unassigned

Bug Description

Reproduced with
-c 2 and 1 node, cluster manager local
-c 3 and 2 nodes, cluster manager zookeeper

Sheepdog daemon version 0.9.0_1_g9d67dec

How to reproduce:

dog cluster format -c 2
Number of copies (2) is larger than number of nodes (1).
Are you sure you want to continue? [yes/no]: yes
using backend plain store

dog vdi create -P test 1G
dog vdi snapshot test

dd if=/dev/urandom bs=1M count=100 | dog vdi write test
100+0 records in
100+0 records out
104857600 bytes (105 MB) copied, 10.5425 s, 9.9 MB/s

dog vdi list
  Name   Id  Size    Used    Shared  Creation time     VDI id  Copies  Tag
s test    1  1.0 GB  1.0 GB  0.0 MB  2014-11-04 09:18  7c2b25       2
  test    0  1.0 GB  104 MB  920 MB  2014-11-04 09:18  7c2b26       2

dog vdi delete -s 1 test^C    (first attempt interrupted with Ctrl-C)

dog vdi delete -s 1 test
failed to read a response
Failed to write object 807c2b2500000000
failed to update inode for discarding objects: 807c2b2500000000
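
For reference, the object id in these error messages is the snapshot's inode object: with sheepdog's usual vid-to-oid mapping (the top bit marks a VDI object and the vdi id is shifted into the upper half of the 64-bit oid), vdi id 7c2b25 from the listing above yields exactly 807c2b2500000000. A minimal sketch of that mapping, assuming the standard constants:

    #include <inttypes.h>
    #include <stdint.h>
    #include <stdio.h>

    /* Assumed constants: the top bit marks a VDI (inode) object and the
     * vdi id occupies the upper half of the 64-bit object id. */
    #define VDI_BIT         (UINT64_C(1) << 63)
    #define VDI_SPACE_SHIFT 32

    static uint64_t vid_to_vdi_oid(uint32_t vid)
    {
            return VDI_BIT | ((uint64_t)vid << VDI_SPACE_SHIFT);
    }

    int main(void)
    {
            /* 7c2b25 is the snapshot's VDI id from `dog vdi list` above */
            printf("%" PRIx64 "\n", vid_to_vdi_oid(0x7c2b25));
            /* prints 807c2b2500000000, the oid in the error message */
            return 0;
    }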

sheep.log
Nov 04 09:18:03 INFO [main] md_add_disk(343) /mnt/sheep/0/obj, vdisk nr 50, total disk 1
Nov 04 09:18:03 INFO [main] send_join_request(1006) IPv4 ip:127.0.0.1 port:7000 going to join the cluster
Nov 04 09:18:03 NOTICE [main] nfs_init(607) nfs server service is not compiled
Nov 04 09:18:03 WARN [main] check_host_env(497) Allowed open files 1024 too small, suggested 6144000
Nov 04 09:18:03 INFO [main] main(951) sheepdog daemon (version 0.9.0_1_g9d67dec) started
Nov 04 09:18:06 INFO [main] rx_main(830) req=0x21a74d0, fd=19, client=127.0.0.1:59692, op=MAKE_FS, data=(not string)
Nov 04 09:18:06 INFO [main] tx_main(882) req=0x21a74d0, fd=19, client=127.0.0.1:59692, op=MAKE_FS, result=00
Nov 04 09:18:31 INFO [main] rx_main(830) req=0x21a74d0, fd=19, client=127.0.0.1:59700, op=MAKE_FS, data=(not string)
Nov 04 09:18:31 INFO [main] tx_main(882) req=0x21a74d0, fd=19, client=127.0.0.1:59700, op=MAKE_FS, result=00
Nov 04 09:18:38 INFO [main] rx_main(830) req=0x21a74d0, fd=15, client=127.0.0.1:59703, op=NEW_VDI, data=(not string)
Nov 04 09:18:38 INFO [main] tx_main(882) req=0x21a74d0, fd=15, client=127.0.0.1:59703, op=NEW_VDI, result=00
Nov 04 09:18:45 INFO [main] rx_main(830) req=0x21ac470, fd=15, client=127.0.0.1:59707, op=NEW_VDI, data=(not string)
Nov 04 09:18:46 INFO [main] tx_main(882) req=0x21ac470, fd=15, client=127.0.0.1:59707, op=NEW_VDI, result=00
Nov 04 09:20:08 EMERG [io 12419] oid_to_vnodes(80) PANIC: can't find a valid vnode
Nov 04 09:20:08 EMERG [io 12419] crash_handler(268) sheep exits unexpectedly (Aborted).
Nov 04 09:20:08 EMERG [io 12419] sd_backtrace(833) sheep.c:270: crash_handler
Nov 04 09:20:08 EMERG [io 12419] sd_backtrace(847) /lib/x86_64-linux-gnu/libpthread.so.0(+0xf09f) [0x7f17751ed09f]
Nov 04 09:20:08 EMERG [io 12419] sd_backtrace(847) /lib/x86_64-linux-gnu/libc.so.6(gsignal+0x34) [0x7f17747e1164]
Nov 04 09:20:08 EMERG [io 12419] sd_backtrace(847) /lib/x86_64-linux-gnu/libc.so.6(abort+0x17f) [0x7f17747e43df]
Nov 04 09:20:08 EMERG [io 12419] sd_backtrace(833) sheep.h:80: oid_to_vnodes
Nov 04 09:20:08 EMERG [io 12419] sd_backtrace(833) ops.c:1923: do_process_work
Nov 04 09:20:08 EMERG [io 12419] sd_backtrace(833) work.c:340: worker_routine
Nov 04 09:20:08 EMERG [io 12419] sd_backtrace(847) /lib/x86_64-linux-gnu/libpthread.so.0(+0x6b4f) [0x7f17751e4b4f]
Nov 04 09:20:08 EMERG [io 12419] sd_backtrace(847) /lib/x86_64-linux-gnu/libc.so.6(clone+0x6c) [0x7f177488b7bc]
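
The panic named in the backtrace (oid_to_vnodes, sheep.h:80) fires when an object's replicas cannot all be mapped onto distinct vnodes: with -c 2 but only one alive node, there is no second vnode for the second copy. Below is a toy model of that failure condition, under the assumption that each replica index must land in a distinct zone; it is not sheepdog's actual ring code, which hashes objects onto virtual nodes.

    #include <stdio.h>
    #include <stdlib.h>

    /* Toy model: replica index copy_idx must map to a vnode in a distinct
     * zone.  With only nr_zones alive zones, any copy_idx >= nr_zones has
     * nowhere to go, matching the "can't find a valid vnode" panic. */
    static int oid_to_vnode(int nr_zones, int copy_idx)
    {
            if (copy_idx >= nr_zones) {
                    fprintf(stderr, "PANIC: can't find a valid vnode\n");
                    abort();  /* the io worker takes the whole daemon down */
            }
            return copy_idx;  /* the copy_idx-th distinct zone */
    }

    int main(void)
    {
            int nr_zones = 1;   /* one alive node, as reproduced above */
            int nr_copies = 2;  /* cluster formatted with -c 2 */

            for (int idx = 0; idx < nr_copies; idx++)
                    oid_to_vnode(nr_zones, idx);  /* aborts at idx == 1 */
            return 0;
    }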

sirio81 (sirio81) wrote:

The same crash happens when using erasure coding:
-c 2:1 and 2 nodes

Nov 04 09:53:11 EMERG [io 8186] oid_to_vnodes(80) PANIC: can't find a valid vnode
Nov 04 09:53:11 EMERG [io 8186] crash_handler(268) sheep exits unexpectedly (Aborted).
Nov 04 09:53:11 EMERG [io 8186] sd_backtrace(833) sheep.c:270: crash_handler
Nov 04 09:53:11 EMERG [io 8186] sd_backtrace(847) /lib/x86_64-linux-gnu/libpthread.so.0(+0xf02f) [0x7fbdab64402f]
Nov 04 09:53:11 EMERG [io 8186] sd_backtrace(847) /lib/x86_64-linux-gnu/libc.so.6(gsignal+0x34) [0x7fbdaac39474]
Nov 04 09:53:11 EMERG [io 8186] sd_backtrace(847) /lib/x86_64-linux-gnu/libc.so.6(abort+0x17f) [0x7fbdaac3c6ef]
Nov 04 09:53:11 EMERG [io 8186] sd_backtrace(833) sheep.h:80: oid_to_vnodes
Nov 04 09:53:12 EMERG [io 8186] sd_backtrace(833) ops.c:1923: do_process_work
Nov 04 09:53:12 EMERG [io 8186] sd_backtrace(833) work.c:340: worker_routine
Nov 04 09:53:12 ERROR [gway 8213] wait_forward_request(416) remote node might have gone away
Nov 04 09:53:12 EMERG [io 8186] sd_backtrace(847) /lib/x86_64-linux-gnu/libpthread.so.0(+0x6b4f) [0x7fbdab63bb4f]
Nov 04 09:53:12 EMERG [io 8186] sd_backtrace(847) /lib/x86_64-linux-gnu/libc.so.6(clone+0x6c) [0x7fbdaace313c]

sirio81 (sirio81) wrote:

This bug doesn't affect 0.8.3. I guess it's specific to the new GC algorithm.
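
If the cause is indeed the new reclaim path asking for more placements than the cluster can currently satisfy, a guard of roughly the following shape would turn the abort into a client-visible error. This is a hypothetical illustration only, not necessarily the committed fix:

    #include <stdio.h>

    /* Hypothetical guard (not necessarily the committed fix): refuse to
     * start object reclaim when the cluster cannot currently place
     * nr_copies replicas, instead of letting an io worker panic. */
    static int check_copies_placeable(int nr_alive_zones, int nr_copies)
    {
            if (nr_copies > nr_alive_zones)
                    return -1;  /* report an error to the client */
            return 0;
    }

    int main(void)
    {
            /* the reported setup: 2 copies requested, 1 node alive */
            if (check_copies_placeable(1, 2) < 0)
                    fprintf(stderr, "not enough alive nodes for 2 copies\n");
            return 0;
    }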

Changed in sheepdog-project:
status: New → Fix Committed