sheep crashes after OS crash with 0 sized files in the cache
Affects | Status | Importance | Assigned to | Milestone | |
---|---|---|---|---|---|
sheepdog | New | Undecided | Unassigned | | |
Bug Description
After an OS crash, some of the files in the cache were truncated to zero size. sheep crashed whenever one of these files was accessed, and the log messages were uninformative.
I'd like to know:
* How serious is this?
* What was likely to be lost? That is, do these files refer to blocks that were being written and not yet confirmed to the qemu/guest OS as written, or were some blocks that were supposed to be on disk lost?
* Does the cache code use O_DIRECT, and could there be a problem related to the metadata not being written while the data was?
* Can you add code to check for this and repair it if possible?
* Can you add a utility to check for consistency (something like fsck)?
* Would using a journal prevent this?
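As a stopgap for the consistency check asked about above, a zero-length scan can be done from outside sheep. The following is only a minimal sketch, not part of sheepdog: the function name `find_zero_length_files` is hypothetical, the directory to scan is supplied by the caller, and it assumes nothing about the cache layout beyond it being a tree of regular files. It reports empty files and does not attempt any repair.

```python
import os
import stat


def find_zero_length_files(root):
    """Yield paths of regular files under `root` whose size is 0 bytes.

    Hypothetical helper: `root` is whatever directory the caller wants
    checked; nothing here knows where sheep keeps its object cache.
    """
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                st = os.lstat(path)  # lstat: do not follow symlinks
            except OSError:
                continue  # file vanished or cannot be inspected; skip it
            if stat.S_ISREG(st.st_mode) and st.st_size == 0:
                yield path
```

Calling `sorted(find_zero_length_files(cache_dir))` with sheep stopped gives a list of suspect files to inspect before restarting the daemon. Whether an empty file is actually invalid for a given cache object is an assumption based on the crash described here.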
In my specific case I doubt I lost any actual data: the last few hundred megabytes of modified data were copied and all seems OK. But I worry that this may happen again with more damage.
Versions:
Sheepdog daemon version 0.7.5
Linux compucom 3.16.0-53-generic #72~14.04.1-Ubuntu SMP Fri Nov 6 18:17:23 UTC 2015 x86_64 x86_64 x86_64 GNU/Linux
QEMU emulator version 2.0.0 (Debian 2.0.0+dfsg-
sheep started as:
/usr/sbin/sheep -c local /mnt/sheep/0/ -z 0 -p 7000 -r 127.0.0.1:4000 -w size=4096 dir=/mnt/
the last part of the log:
Nov 12 12:32:26 INFO [main] md_add_disk(141) /mnt/sheep/0/obj, nr 1
Nov 12 12:32:27 INFO [main] send_join_
Nov 12 12:32:27 ERROR [main] for_each_
Nov 12 12:32:28 NOTICE [main] http_init(451) http service is not complied
Nov 12 12:32:28 ERROR [main] check_host_env(461) WARN: Allowed open files 1024 too small, suggested 1024000
Nov 12 12:32:28 INFO [main] main(853) sheepdog daemon (version 0.7.5) started
Nov 12 12:32:28 INFO [main] recover_
Nov 12 12:32:28 INFO [main] recover_
Nov 12 12:32:28 INFO [main] recover_
...... more of the same .....
Nov 12 12:32:29 INFO [main] recover_
Nov 12 12:32:29 INFO [main] recover_
Nov 12 12:32:29 INFO [main] recover_
Nov 12 12:33:02 ERROR [oc_push 2661] read_cache_
Nov 12 12:33:02 EMERG [oc_push 2661] do_push_object(901) PANIC: push failed but should never fail
Nov 12 12:33:02 EMERG [oc_push 2661] crash_handler(250) sheep exits unexpectedly (Aborted).
Nov 12 12:33:02 EMERG [oc_push 2661] sd_backtrace(857) /usr/sbin/sheep() [0x405bd7]
Nov 12 12:33:02 EMERG [oc_push 2661] sd_backtrace(857) /lib/x86_
Nov 12 12:33:02 EMERG [oc_push 2661] sd_backtrace(857) /lib/x86_
Nov 12 12:33:02 EMERG [oc_push 2661] sd_backtrace(857) /lib/x86_
Nov 12 12:33:02 EMERG [oc_push 2661] sd_backtrace(857) /usr/sbin/sheep() [0x415c5e]
Nov 12 12:33:02 EMERG [oc_push 2661] sd_backtrace(857) /usr/sbin/sheep() [0x428f49]
Nov 12 12:33:02 EMERG [oc_push 2661] sd_backtrace(857) /lib/x86_
Nov 12 12:33:02 EMERG [oc_push 2661] sd_backtrace(857) /lib/x86_
Nov 12 12:33:04 INFO [oc_push 2661] dump_stack_
Nov 12 12:33:04 INFO [oc_push 2661] dump_stack_
Nov 12 12:33:04 EMERG [oc_push 2661] __sd_dump_
Nov 12 12:33:07 ERROR [main] crash_handler(490) sheep pid 2648 exited unexpectedly.
Thanks for your report, mhex.
We are not using Launchpad for bug tracking. We are using GitHub issues: https://github.com/sheepdog/sheepdog/issues
Could you report bugs on GitHub?
In addition, the object cache is a really unstable feature. Please do not use it.