Core dump on multipathd shutdown - trusty 14.04.4

Bug #1616213 reported by Dragan S.
This bug affects 1 person
Affects                    Status         Importance   Assigned to   Milestone
multipath-tools (Ubuntu)   Invalid        Medium       Dragan S.
  Trusty                   Fix Released   Medium       Dragan S.

Bug Description

[Impact]

 * During "service multipath-tools stop", the multipath daemon
   cleans up and shuts down several concurrent threads. Because
   of a race condition between two threads, one thread can
   free resources that another thread is still using.

   This causes multipathd to crash and dump core on stop
   events.

 * The fix should be backported to trusty to avoid further
   support issues being filed.

 * This change delays freeing resources that another thread is
   still using.

[Test Case]

 * Install multipath-tools and create a basic multipath.conf with
   devices under management. Run "service multipath-tools start",
   run I/O on the devices while keeping the system CPU busy, then
   run "service multipath-tools stop".

[Regression Potential]

 * There should be no regression potential with this change:
   the problem happens on the exit path, and we are only delaying
   a free call.

[Original Description]

On Ubuntu trusty 14.04.4, multipath-tools version 0.4.9-3ubuntu7.14 has a bug in multipathd on shutdown.

The code accesses the pathvec pointer, which holds a valid (non-NULL) address:

Reading symbols from /sbin/multipathd...Reading symbols from /usr/lib/debug//sbin/multipathd...done.
done.
[New LWP 41631]
[New LWP 41584]
[New LWP 41633]
[New LWP 41632]
[New LWP 41582]
[New LWP 41583]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
Core was generated by `/sbin/multipathd'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0 0x00000000004075db in checkerloop (ap=0x1b81040) at main.c:1150

1150 vector_foreach_slot (vecs->pathvec, pp, i) {
(gdb) list
1145 pthread_cleanup_push(cleanup_lock, &vecs->lock);
1146 lock(vecs->lock);
1147 condlog(4, "tick");
1148
1149 if (vecs->pathvec) {
1150 vector_foreach_slot (vecs->pathvec, pp, i) {
1151 check_path(vecs, pp);
1152 }
1153 }
1154 if (vecs->mpvec) {

Pathvec is a valid pointer:
(gdb) p vecs->pathvec
$1 = (vector) 0x1b81280

But the contents of the structure are just garbage:

(gdb) p *vecs->pathvec
$2 = {allocated = 1651076143, slot = 0x756e696c2d34365f}
(gdb)

Dragan S. (dragan-s)
Changed in multipath-tools (Ubuntu):
assignee: nobody → Dragan S. (dragan-s)
status: New → In Progress
Dragan S. (dragan-s) wrote :

Other threads on the way out:

(gdb) info threads
  Id Target Id Frame
  6 Thread 0x7f21ca11e700 (LWP 41583) pthread_cond_wait@@GLIBC_2.3.2 ()
    at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185
  5 Thread 0x7f21ca11f840 (LWP 41582) 0x00007f21c9111dfd in nanosleep () at ../sysdeps/unix/syscall-template.S:81
  4 Thread 0x7f21ca0b3700 (LWP 41632) (Exiting) 0x00007f21c9145967 in madvise ()
    at ../sysdeps/unix/syscall-template.S:81
  3 Thread 0x7f21ca0a2700 (LWP 41633) (Exiting) 0x00007f21c732a1d1 in ?? () from /lib/x86_64-linux-gnu/libgcc_s.so.1
  2 Thread 0x7f21ca10d700 (LWP 41584) (Exiting) __lll_lock_wait ()
    at ../nptl/sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:135
* 1 Thread 0x7f21ca0c4700 (LWP 41631) 0x00000000004075db in checkerloop (ap=0x1b81040) at main.c:1150
(gdb)

Dragan S. (dragan-s) wrote :

Thread 5 is on the way out:

(gdb) thread 5
[Switching to thread 5 (Thread 0x7f21ca11f840 (LWP 41582))]
#0 0x00007f21c9111dfd in nanosleep () at ../sysdeps/unix/syscall-template.S:81
81 ../sysdeps/unix/syscall-template.S: No such file or directory.
(gdb) up
#1 0x00007f21c9111c94 in __sleep (seconds=0) at ../sysdeps/unix/sysv/linux/sleep.c:137
137 ../sysdeps/unix/sysv/linux/sleep.c: No such file or directory.
(gdb)
#2 0x00000000004084e6 in child (param=0x0) at main.c:1539
1539 sleep (1); /* This is weak. */
(gdb) list
1534 free_polls();
1535
1536 unlock(vecs->lock);
1537 /* Now all the waitevent threads will start rushing in. */
1538 while (vecs->lock.depth > 0) {
1539 sleep (1); /* This is weak. */
1540 condlog(3,"Have %d wait event checkers threads to de-alloc, waiting..\n", vecs->lock.depth);
1541 }
1542 pthread_mutex_destroy(vecs->lock.mutex);
1543 FREE(vecs->lock.mutex);

The code in thread 5 before the sleep(1) is:
(gdb) list -
1524
1525 pthread_cancel(check_thr);
1526 pthread_cancel(uevent_thr);
1527 pthread_cancel(uxlsnr_thr);
1528 pthread_cancel(uevq_thr);
1529
1530 free_keys(keys);
1531 keys = NULL;
1532 free_handlers(handlers);
1533 handlers = NULL;
(gdb) list -
1514 /*
1515 * exit path
1516 */
1517 block_signal(SIGHUP, NULL);
1518 lock(vecs->lock);
1519 if (conf->queue_without_daemon == QUE_NO_DAEMON_OFF)
1520 vector_foreach_slot(vecs->mpvec, mpp, i)
1521 dm_queue_if_no_path(mpp->alias, 0);
1522 remove_maps_and_stop_waiters(vecs);
1523 free_pathvec(vecs->pathvec, FREE_PATHS);
(gdb)

As line 1523 shows, thread 5 freed vecs->pathvec, which thread 1 is still accessing:
1150 vector_foreach_slot (vecs->pathvec, pp, i) {

Louis Bouchard (louis)
Changed in multipath-tools (Ubuntu Trusty):
status: New → Incomplete
status: Incomplete → In Progress
assignee: nobody → Dragan S. (dragan-s)
Changed in multipath-tools (Ubuntu):
importance: Undecided → Medium
Changed in multipath-tools (Ubuntu Trusty):
importance: Undecided → Medium
Dragan S. (dragan-s) wrote :
Mathew Hodson (mhodson)
tags: added: patch
Dragan S. (dragan-s) wrote :
description: updated
Louis Bouchard (louis) wrote :

Marking dev-release as invalid: that code section was refactored, and the code of the patch is in the new code path.

Changed in multipath-tools (Ubuntu):
status: In Progress → Invalid
Brian Murray (brian-murray) wrote : Please test proposed package

Hello Dragan, or anyone else affected,

Accepted multipath-tools into trusty-proposed. The package will build now and be available at https://launchpad.net/ubuntu/+source/multipath-tools/0.4.9-3ubuntu7.15 in a few hours, and then in the -proposed repository.

Please help us by testing this new package. See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Your feedback will aid us in getting this update out to other Ubuntu users.

If this package fixes the bug for you, please add a comment to this bug, mentioning the version of the package you tested, and change the tag from verification-needed to verification-done. If it does not fix the bug for you, please add a comment stating that, and change the tag to verification-failed. In either case, details of your testing will help us make a better decision.

Further information regarding the verification process can be found at https://wiki.ubuntu.com/QATeam/PerformingSRUVerification . Thank you in advance!

Changed in multipath-tools (Ubuntu Trusty):
status: In Progress → Fix Committed
tags: added: verification-needed
Dragan S. (dragan-s) wrote :

I pulled multipath-tools 0.4.9-3ubuntu7.15 and verified that the fix is there.

tags: added: verification-done
removed: verification-needed
Dragan S. (dragan-s) wrote :

The user also verified the issue as fixed with 0.4.9-3ubuntu7.15.

Launchpad Janitor (janitor) wrote :

This bug was fixed in the package multipath-tools - 0.4.9-3ubuntu7.15

---------------
multipath-tools (0.4.9-3ubuntu7.15) trusty; urgency=medium

  * d/p/0047-Add-existing-multipath-devices-to-wwids-file-on.patch:
    Fix multipathd which does not update /etc/multipath/wwids file
    when reconfigure is invoked. (LP: #1621835)

  [ Dragan Stancevic ]
  * d/p/0048-multipathd-delay-free-pathvec.patch :
    Fix SEGV on multipathd shutdown (LP: #1616213)

  [ Nishanth Aravamudan ]
  * debian/patches/fix_use_after_free.patch: Fix use-after-free bugs.
    Thanks to Christof Schmitt <email address hidden> and
    Benjamin Marzinski <email address hidden>. Closes LP: #1628723.

 -- Louis Bouchard <email address hidden> Mon, 12 Sep 2016 10:43:19 +0200

Changed in multipath-tools (Ubuntu Trusty):
status: Fix Committed → Fix Released
Chris J Arges (arges) wrote : Update Released

The verification of the Stable Release Update for multipath-tools has completed successfully and the package has now been released to -updates. Subsequently, the Ubuntu Stable Release Updates Team is being unsubscribed and will not receive messages about this bug report. In the event that you encounter a regression using the package from -updates please report a new bug using ubuntu-bug and tag the bug report regression-update so we can easily find any regressions.
