Crash when removing a snapshot while alive nodes < number of copies

Bug #1389125 reported by sirio81
Affects: sheepdog
Importance: Undecided
Assigned to: Unassigned

Bug Description

Reproduced with:
- -c 2 and 1 node, cluster manager local
- -c 3 and 2 nodes, cluster manager zookeeper

Sheepdog daemon version 0.9.0_1_g9d67dec

How to reproduce:

dog cluster format -c 2
Number of copies (2) is larger than number of nodes (1).
Are you sure you want to continue? [yes/no]: yes
using backend plain store

dog vdi create -P test 1G
dog vdi snapshot test

dd if=/dev/urandom bs=1M count=100 | dog vdi write test
100+0 records in
100+0 records out
104857600 bytes (105 MB) copied, 10.5425 s, 9.9 MB/s

dog vdi list
  Name  Id  Size    Used    Shared  Creation time     VDI id  Copies  Tag
s test   1  1.0 GB  1.0 GB  0.0 MB  2014-11-04 09:18  7c2b25  2
  test   0  1.0 GB  104 MB  920 MB  2014-11-04 09:18  7c2b26  2

dog vdi delete -s 1 test^C

dog vdi delete -s 1 test
failed to read a response
Failed to write object 807c2b2500000000
failed to update inode for discarding objects: 807c2b2500000000
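
The failing oid is the snapshot's own inode object: sheepdog builds a vdi's inode-object id from the "VDI id" shown by dog vdi list by setting a high marker bit. A minimal sketch of that layout (VDI_BIT and VDI_SPACE_SHIFT are assumed from the sheepdog headers, not verified against this tree):

#include <stdio.h>
#include <stdint.h>
#include <inttypes.h>

#define VDI_BIT         (UINT64_C(1) << 63)  /* marks an inode (vdi) object */
#define VDI_SPACE_SHIFT 32                   /* vdi id sits above bit 32 */

/* assumed layout: inode oid = marker bit | vdi id << 32 */
static uint64_t vid_to_vdi_oid(uint32_t vid)
{
    return VDI_BIT | ((uint64_t)vid << VDI_SPACE_SHIFT);
}

int main(void)
{
    /* 0x7c2b25 is the snapshot's "VDI id" in the dog vdi list output above */
    printf("%016" PRIx64 "\n", vid_to_vdi_oid(0x7c2b25));
    /* prints 807c2b2500000000, the object the delete failed to write */
    return 0;
}

So the delete was in the middle of rewriting the snapshot's inode to discard its data objects when the daemon went down, consistent with the panic in sheep.log below.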

sheep.log
Nov 04 09:18:03 INFO [main] md_add_disk(343) /mnt/sheep/0/obj, vdisk nr 50, total disk 1
Nov 04 09:18:03 INFO [main] send_join_request(1006) IPv4 ip:127.0.0.1 port:7000 going to join the cluster
Nov 04 09:18:03 NOTICE [main] nfs_init(607) nfs server service is not compiled
Nov 04 09:18:03 WARN [main] check_host_env(497) Allowed open files 1024 too small, suggested 6144000
Nov 04 09:18:03 INFO [main] main(951) sheepdog daemon (version 0.9.0_1_g9d67dec) started
Nov 04 09:18:06 INFO [main] rx_main(830) req=0x21a74d0, fd=19, client=127.0.0.1:59692, op=MAKE_FS, data=(not string)
Nov 04 09:18:06 INFO [main] tx_main(882) req=0x21a74d0, fd=19, client=127.0.0.1:59692, op=MAKE_FS, result=00
Nov 04 09:18:31 INFO [main] rx_main(830) req=0x21a74d0, fd=19, client=127.0.0.1:59700, op=MAKE_FS, data=(not string)
Nov 04 09:18:31 INFO [main] tx_main(882) req=0x21a74d0, fd=19, client=127.0.0.1:59700, op=MAKE_FS, result=00
Nov 04 09:18:38 INFO [main] rx_main(830) req=0x21a74d0, fd=15, client=127.0.0.1:59703, op=NEW_VDI, data=(not string)
Nov 04 09:18:38 INFO [main] tx_main(882) req=0x21a74d0, fd=15, client=127.0.0.1:59703, op=NEW_VDI, result=00
Nov 04 09:18:45 INFO [main] rx_main(830) req=0x21ac470, fd=15, client=127.0.0.1:59707, op=NEW_VDI, data=(not string)
Nov 04 09:18:46 INFO [main] tx_main(882) req=0x21ac470, fd=15, client=127.0.0.1:59707, op=NEW_VDI, result=00
Nov 04 09:20:08 EMERG [io 12419] oid_to_vnodes(80) PANIC: can't find a valid vnode
Nov 04 09:20:08 EMERG [io 12419] crash_handler(268) sheep exits unexpectedly (Aborted).
Nov 04 09:20:08 EMERG [io 12419] sd_backtrace(833) sheep.c:270: crash_handler
Nov 04 09:20:08 EMERG [io 12419] sd_backtrace(847) /lib/x86_64-linux-gnu/libpthread.so.0(+0xf09f) [0x7f17751ed09f]
Nov 04 09:20:08 EMERG [io 12419] sd_backtrace(847) /lib/x86_64-linux-gnu/libc.so.6(gsignal+0x34) [0x7f17747e1164]
Nov 04 09:20:08 EMERG [io 12419] sd_backtrace(847) /lib/x86_64-linux-gnu/libc.so.6(abort+0x17f) [0x7f17747e43df]
Nov 04 09:20:08 EMERG [io 12419] sd_backtrace(833) sheep.h:80: oid_to_vnodes
Nov 04 09:20:08 EMERG [io 12419] sd_backtrace(833) ops.c:1923: do_process_work
Nov 04 09:20:08 EMERG [io 12419] sd_backtrace(833) work.c:340: worker_routine
Nov 04 09:20:08 EMERG [io 12419] sd_backtrace(847) /lib/x86_64-linux-gnu/libpthread.so.0(+0x6b4f) [0x7f17751e4b4f]
Nov 04 09:20:08 EMERG [io 12419] sd_backtrace(847) /lib/x86_64-linux-gnu/libc.so.6(clone+0x6c) [0x7f177488b7bc]
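
The panic points at the vnode lookup in sheep.h:80 (oid_to_vnodes). A plausible reading, as a self-contained sketch (pick_vnodes(), struct vnode and node_id below are invented for illustration, this is not the sheepdog source): replica placement walks the consistent-hash ring until it has collected nr_copies virtual nodes on distinct physical nodes, and aborts once it exhausts the ring, which is exactly the situation when fewer nodes are alive than the copy count of the vdi being deleted.

#include <stdio.h>
#include <stdlib.h>
#include <stdint.h>

struct vnode {
    uint64_t hash;  /* position on the consistent-hash ring */
    int node_id;    /* owning physical node */
};

static void pick_vnodes(const struct vnode *ring, int nr_vnodes,
                        uint64_t oid_hash, int nr_copies,
                        const struct vnode **out)
{
    int start = 0, found = 0;

    /* first vnode clockwise from the object's hash */
    while (start < nr_vnodes && ring[start].hash < oid_hash)
        start++;
    start %= nr_vnodes;

    for (int step = 0; step < nr_vnodes && found < nr_copies; step++) {
        const struct vnode *v = &ring[(start + step) % nr_vnodes];
        int dup = 0;

        for (int i = 0; i < found; i++)  /* replicas must land on distinct nodes */
            if (out[i]->node_id == v->node_id)
                dup = 1;
        if (!dup)
            out[found++] = v;
    }
    if (found < nr_copies) {
        /* ring exhausted: fewer live nodes than requested copies */
        fprintf(stderr, "PANIC: can't find a valid vnode\n");
        abort();  /* matches "sheep exits unexpectedly (Aborted)" */
    }
}

int main(void)
{
    /* one live physical node (id 0), as in "-c 2 and 1 node" */
    const struct vnode ring[] = { {100, 0}, {200, 0}, {300, 0} };
    const struct vnode *out[2];

    pick_vnodes(ring, 3, 150, 2, out);  /* wants 2 copies from 1 node */
    return 0;
}

Note that dog cluster format accepted the undersized cluster with only a warning (see above), so this state is reachable in normal operation; the walk would have to degrade, or the request be refused, instead of aborting.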

sirio81 (sirio81) wrote:

The same happens using erasure coding:
-c 2:1 and 2 nodes (a 2:1 scheme stores 2 data strips plus 1 parity strip, so it likewise needs more live nodes than the cluster has)

Nov 04 09:53:11 EMERG [io 8186] oid_to_vnodes(80) PANIC: can't find a valid vnode
Nov 04 09:53:11 EMERG [io 8186] crash_handler(268) sheep exits unexpectedly (Aborted).
Nov 04 09:53:11 EMERG [io 8186] sd_backtrace(833) sheep.c:270: crash_handler
Nov 04 09:53:11 EMERG [io 8186] sd_backtrace(847) /lib/x86_64-linux-gnu/libpthread.so.0(+0xf02f) [0x7fbdab64402f]
Nov 04 09:53:11 EMERG [io 8186] sd_backtrace(847) /lib/x86_64-linux-gnu/libc.so.6(gsignal+0x34) [0x7fbdaac39474]
Nov 04 09:53:11 EMERG [io 8186] sd_backtrace(847) /lib/x86_64-linux-gnu/libc.so.6(abort+0x17f) [0x7fbdaac3c6ef]
Nov 04 09:53:11 EMERG [io 8186] sd_backtrace(833) sheep.h:80: oid_to_vnodes
Nov 04 09:53:12 EMERG [io 8186] sd_backtrace(833) ops.c:1923: do_process_work
Nov 04 09:53:12 EMERG [io 8186] sd_backtrace(833) work.c:340: worker_routine
Nov 04 09:53:12 ERROR [gway 8213] wait_forward_request(416) remote node might have gone away
Nov 04 09:53:12 EMERG [io 8186] sd_backtrace(847) /lib/x86_64-linux-gnu/libpthread.so.0(+0x6b4f) [0x7fbdab63bb4f]
Nov 04 09:53:12 EMERG [io 8186] sd_backtrace(847) /lib/x86_64-linux-gnu/libc.so.6(clone+0x6c) [0x7fbdaace313c]

sirio81 (sirio81) wrote:

This bug doesn't affect 0.8.3. I guess it's specific to the new GC algorithm.

Changed in sheepdog-project:
status: New → Fix Committed