Core dump on multipathd shutdown - trusty 14.04.4
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
multipath-tools (Ubuntu) | Invalid | Medium | Dragan S. |
Trusty | Fix Released | Medium | Dragan S. |
Bug Description
[Impact]
* During "service multipath-tools stop", the multipath daemon
cleans up and shuts down several concurrent threads. Depending
on a race between two threads, one thread may free resources
that another thread is still using.
This causes multipathd to crash and dump core on stop events.
* Fix should be backported to trusty to avoid more support
issues being filed.
* This change delays freeing resources that another thread is
still using.
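The idea of delaying the free can be sketched as follows. This is a minimal illustration, not the actual multipath-tools patch: `struct demo_vecs`, `demo_checker` and `demo_shutdown` are hypothetical names, and the sketch assumes a single worker thread that only touches shared data while holding the lock.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical stand-in for the daemon's shared state (not the real struct). */
struct demo_vecs {
    pthread_mutex_t lock;
    unsigned *paths;   /* plays the role of vecs->pathvec */
    size_t npaths;
    int exiting;       /* protected by lock; set by the shutdown path */
};

/* Worker loop: touches shared data only while holding the lock. */
static void *demo_checker(void *arg)
{
    struct demo_vecs *v = arg;

    for (;;) {
        pthread_mutex_lock(&v->lock);
        if (v->exiting) {
            pthread_mutex_unlock(&v->lock);
            break;
        }
        for (size_t i = 0; i < v->npaths; i++)
            v->paths[i]++;
        pthread_mutex_unlock(&v->lock);
    }
    return NULL;
}

/* Returns 0 on a clean shutdown, -1 on setup failure. */
int demo_shutdown(void)
{
    struct demo_vecs v = { .paths = NULL, .npaths = 4, .exiting = 0 };
    pthread_t tid;

    v.paths = calloc(v.npaths, sizeof *v.paths);
    if (!v.paths)
        return -1;
    if (pthread_mutex_init(&v.lock, NULL) != 0) {
        free(v.paths);
        return -1;
    }
    if (pthread_create(&tid, NULL, demo_checker, &v) != 0) {
        pthread_mutex_destroy(&v.lock);
        free(v.paths);
        return -1;
    }

    /* Ask the worker to stop. */
    pthread_mutex_lock(&v.lock);
    v.exiting = 1;
    pthread_mutex_unlock(&v.lock);

    /* Wait for the worker BEFORE freeing anything it may still touch. */
    int rc = pthread_join(tid, NULL);
    assert(rc == 0);
    (void)rc;

    free(v.paths);     /* safe: no other thread can reach it now */
    v.paths = NULL;
    pthread_mutex_destroy(&v.lock);
    return 0;
}
```

Calling `demo_shutdown()` from a `main` and building with `cc -pthread` should be enough to try it. Moving the `free` above the `pthread_join` would reintroduce the kind of race described above.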
[Test Case]
* Install multipath-tools and create a basic multipath.conf with
devices under management. Run "service multipath-tools start",
run I/O on the devices while keeping the system CPU busy, then run
"service multipath-tools stop".
[Regression Potential]
* There should be no regression potential with this change:
the problem happens on the exit path and we are only delaying
a free call.
[Original Description]
On Ubuntu trusty 14.04.4, multipath-tools version 0.4.9-3ubuntu7.14 has a bug in multipathd on shutdown.
The code accesses the pathvec pointer, which holds a valid address:
Reading symbols from /sbin/multipath
done.
[New LWP 41631]
[New LWP 41584]
[New LWP 41633]
[New LWP 41632]
[New LWP 41582]
[New LWP 41583]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_
Core was generated by `/sbin/multipathd'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0 0x00000000004075db in checkerloop (ap=0x1b81040) at main.c:1150
1150 vector_foreach_slot (vecs->pathvec, pp, i) {
(gdb) list
1145 pthread_
1146 lock(vecs->lock);
1147 condlog(4, "tick");
1148
1149 if (vecs->pathvec) {
1150 vector_foreach_slot (vecs->pathvec, pp, i) {
1151 check_path(vecs, pp);
1152 }
1153 }
1154 if (vecs->mpvec) {
Pathvec is a valid pointer:
(gdb) p vecs->pathvec
$1 = (vector) 0x1b81280
But the contents of the structure are just garbage:
(gdb) p *vecs->pathvec
$2 = {allocated = 1651076143, slot = 0x756e696c2d34365f}
(gdb)
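The `if (vecs->pathvec)` guard at line 1149 only tests for NULL, so a pointer to memory that has already been freed still passes it. A minimal sketch of a release helper that clears the caller's pointer is shown below. `struct mini_vector`, `mini_vector_alloc` and `mini_vector_release` are hypothetical names for illustration, not multipath-tools API. Clearing the pointer helps only if it is done under the same lock the reader takes and the structure holding the pointer is itself still alive.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical miniature vector; field names mirror the gdb output above. */
struct mini_vector {
    int allocated;
    void **slot;
};

/* Allocate a vector with n empty slots; returns NULL on failure or n <= 0. */
struct mini_vector *mini_vector_alloc(int n)
{
    if (n <= 0)
        return NULL;

    struct mini_vector *v = calloc(1, sizeof *v);
    if (!v)
        return NULL;

    v->slot = calloc((size_t)n, sizeof *v->slot);
    if (!v->slot) {
        free(v);
        return NULL;
    }
    v->allocated = n;
    return v;
}

/* Free the vector and reset the caller's pointer, so that a later
 * `if (ptr)` check fails instead of passing on a dangling address. */
void mini_vector_release(struct mini_vector **vp)
{
    if (!vp || !*vp)
        return;
    free((*vp)->slot);
    free(*vp);
    *vp = NULL;
}
```

Calling `mini_vector_release(&v)` twice is harmless in this sketch, because the second call sees a NULL pointer and returns early.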
Changed in multipath-tools (Ubuntu):
assignee: nobody → Dragan S. (dragan-s)
status: New → In Progress
Changed in multipath-tools (Ubuntu Trusty):
status: New → Incomplete
status: Incomplete → In Progress
assignee: nobody → Dragan S. (dragan-s)
Changed in multipath-tools (Ubuntu):
importance: Undecided → Medium
Changed in multipath-tools (Ubuntu Trusty):
importance: Undecided → Medium
tags: added: patch
Other threads on the way out:
(gdb) info threads
Id Target Id Frame
6 Thread 0x7f21ca11e700 (LWP 41583) pthread_cond_wait@@GLIBC_2.3.2 ()
at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185
5 Thread 0x7f21ca11f840 (LWP 41582) 0x00007f21c9111dfd in nanosleep () at ../sysdeps/unix/syscall-template.S:81
4 Thread 0x7f21ca0b3700 (LWP 41632) (Exiting) 0x00007f21c9145967 in madvise ()
at ../sysdeps/unix/syscall-template.S:81
3 Thread 0x7f21ca0a2700 (LWP 41633) (Exiting) 0x00007f21c732a1d1 in ?? () from /lib/x86_64-linux-gnu/libgcc_s.so.1
2 Thread 0x7f21ca10d700 (LWP 41584) (Exiting) __lll_lock_wait ()
at ../nptl/sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:135
* 1 Thread 0x7f21ca0c4700 (LWP 41631) 0x00000000004075db in checkerloop (ap=0x1b81040) at main.c:1150
(gdb)