Activity log for bug #1354114

Date Who What changed Old value New value Message
2014-08-07 18:01:28 Rafael David Tinoco bug added bug
2014-08-07 19:13:36 Rafael David Tinoco summary Precise multipath segmentation Fault multipath segmentation Fault (libmultipath: update waiter handling)
2014-08-07 19:25:30 Rafael David Tinoco attachment added precise_multipath-tools_0.4.9-3ubuntu5.2.debdiff https://bugs.launchpad.net/ubuntu/+source/multipath-tools/+bug/1354114/+attachment/4172183/+files/precise_multipath-tools_0.4.9-3ubuntu5.2.debdiff
2014-08-07 19:25:49 Rafael David Tinoco attachment added trusty_multipath-tools_0.4.9-3ubuntu8.debdiff https://bugs.launchpad.net/ubuntu/+source/multipath-tools/+bug/1354114/+attachment/4172184/+files/trusty_multipath-tools_0.4.9-3ubuntu8.debdiff
2014-08-07 19:26:07 Rafael David Tinoco attachment added utopic_multipath-tools_0.4.9-3ubuntu9.debdiff https://bugs.launchpad.net/ubuntu/+source/multipath-tools/+bug/1354114/+attachment/4172185/+files/utopic_multipath-tools_0.4.9-3ubuntu9.debdiff
2014-08-07 19:29:01 Rafael David Tinoco multipath-tools (Ubuntu): assignee Rafael David Tinoco (inaddy)
2014-08-07 19:29:03 Rafael David Tinoco multipath-tools (Ubuntu): status New Confirmed
2014-08-07 19:37:55 Rafael David Tinoco description
Old value:
It was brought to me (~inaddy) the following situation with multipathd:

#####

Program terminated with signal 6, Aborted.
#0 0x00007fbc6ae09445 in raise () from /lib/x86_64-linux-gnu/libc.so.6
(gdb) bt
#0 0x00007fbc6ae09445 in raise () from /lib/x86_64-linux-gnu/libc.so.6
#1 0x00007fbc6ae0cbab in abort () from /lib/x86_64-linux-gnu/libc.so.6
#2 0x00007fbc6ae0210e in ?? () from /lib/x86_64-linux-gnu/libc.so.6
#3 0x00007fbc6ae021b2 in __assert_fail () from /lib/x86_64-linux-gnu/libc.so.6
#4 0x00007fbc6b849efb in pthread_mutex_lock () from /lib/x86_64-linux-gnu/libpthread.so.0
#5 0x00007fbc6b1cba5f in free_waiter (data=0x1691de0) at waiter.c:44
#6 0x00007fbc6b1cc25a in waitevent (et=0x1691de0) at waiter.c:204
#7 0x00007fbc6b847e9a in start_thread () from /lib/x86_64-linux-gnu/libpthread.so.0
#8 0x00007fbc6aec54bd in clone () from /lib/x86_64-linux-gnu/libc.so.6
#9 0x0000000000000000 in ?? ()
--------------------------------------------------------------------------------------------
#5 0x00007fbc6b1cba5f in free_waiter (data=0x1691de0) at waiter.c:44
44 lock(wp->vecs->lock);
(gdb) print wp->vecs->lock
$1 = {mutex = 0x168c280, depth = 1}

In pthread_mutex_lock.c:62 there's an assert that fails:

#4 0x00007fbc6b849efb in __pthread_mutex_lock (mutex=0xfefefefefefefeff) at pthread_mutex_lock.c:62
62 assert (mutex->__data.__owner == 0);

In this run:

(gdb) p *wp->vecs->lock->mutex
$3 = {__data = {__lock = 1, __count = 0, __owner = 49, __nusers = 0, __kind = 0, __spins = 0, __list = {__prev = 0x0, __next = 0xffffffff}}, __size = "\001\000\000\000\000\000\000\000\061", '\000' <repeats 23 times>, "\377\377\377\377\000\000\000", __align = 1}

so __owner is 49 and not 0. Note that 49 is somewhat strange; it's expected to be a pid_t obtained via

pid_t id = THREAD_GETMEM (THREAD_SELF, tid);

According to https://bugzilla.redhat.com/show_bug.cgi?id=570278 , this assert failure could be an expected behaviour if, for some reason, the multipath code was trying to release a mutex that has already been freed.

The multipath-tools package is up to date (0.4.9-3ubuntu5).

I do not find anything obviously related in http://git.opensvc.com/gitweb.cgi?p=multipath-tools%2F.git except maybe http://git.opensvc.com/gitweb.cgi?p=multipath-tools/.git;a=commitdiff;h=5ee9f716549d913aeb9800041f78ee9c6a50d860

#####

In between Precise's version and Upstream there are the following patches touching waiter.c:

d887f4a = signal waiter thread to stop waiting on dm events
5ee9f71 = simplify multipath signal handlers
af4fd6d = Fix race condition in stop_waiter_thread()
e1fcc59 = multipath: clean up code for stopping the waiter threads
03ec4ef = multipath: fix shutdown crashes
4dfdaf2 = multipath: Update multipath device on show topology
c301a3f = Race condition when calling stop_waiter_thread()
96f8146 = libmultipath: update waiter handling

This specific one: 96f8146 (libmultipath: update waiter handling)

"""
The current 'waiter' structure accesses fields which belong to the main 'mpp' structure, which has a totally different lifetime.
"""

shows that, because the structures have different lifetimes, use-after-free segfaults are possible (which seems to be what is happening).

waiter.c:44 = lock(wp->vecs->lock);

New value:
[Impact]
 * Multipath can cause a segmentation fault due to wrong code and can possibly cause the user to lose access to multipath devices.
[Test Case]
 * Working on it.
[Regression Potential]
 * Fix based on upstream code (96f8146); tag 0.5.0 already functioning.
 * Introduces a mutex, logic to deal with an already-dead pthread, and another way to access the same data (instead of reaching into a structure with a different lifetime).
[Other Info]
 * Original bug description:
----------------
[same text as the old value above]
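The lifetime mismatch described in the quoted upstream commit message can be illustrated with a minimal C sketch. This is not the upstream patch and not multipath-tools code: `struct map`, `struct waiter`, `waiter_new` and `waiter_free` are hypothetical names chosen for illustration. The sketch assumes the general idea that a waiter should own a private copy of whatever it needs, rather than holding a pointer into a structure that may be freed before the waiter thread finishes.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define NAME_LEN 64

/* Hypothetical stand-in for the long-lived map structure ('mpp'). */
struct map {
    char alias[NAME_LEN];
};

/* Hypothetical stand-in for the per-thread waiter.
 * It stores its own copy of the map name, not a pointer into 'struct map'. */
struct waiter {
    char mapname[NAME_LEN];
};

/* Create a waiter that stays valid even after 'm' has been freed. */
struct waiter *waiter_new(const struct map *m)
{
    assert(m != NULL);

    struct waiter *w = calloc(1, sizeof(*w));
    if (w == NULL)
        return NULL;

    /* Copy, do not alias: the map may be destroyed before the waiter exits. */
    snprintf(w->mapname, sizeof(w->mapname), "%s", m->alias);
    return w;
}

/* Release a waiter; it touches only memory that the waiter itself owns. */
void waiter_free(struct waiter *w)
{
    free(w);
}
```

With this layout, freeing the map first and the waiter afterwards is safe, because the waiter's cleanup path never dereferences the map. The crash in the backtrace is the opposite case: `free_waiter` follows `wp->vecs->lock` into memory whose lifetime it does not control.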
2014-08-07 20:03:34 Rafael David Tinoco attachment removed utopic_multipath-tools_0.4.9-3ubuntu9.debdiff https://bugs.launchpad.net/ubuntu/+source/multipath-tools/+bug/1354114/+attachment/4172185/+files/utopic_multipath-tools_0.4.9-3ubuntu9.debdiff
2014-08-07 20:03:46 Rafael David Tinoco attachment removed trusty_multipath-tools_0.4.9-3ubuntu8.debdiff https://bugs.launchpad.net/ubuntu/+source/multipath-tools/+bug/1354114/+attachment/4172184/+files/trusty_multipath-tools_0.4.9-3ubuntu8.debdiff
2014-08-07 20:03:55 Rafael David Tinoco attachment removed precise_multipath-tools_0.4.9-3ubuntu5.2.debdiff https://bugs.launchpad.net/ubuntu/+source/multipath-tools/+bug/1354114/+attachment/4172183/+files/precise_multipath-tools_0.4.9-3ubuntu5.2.debdiff
2014-08-07 20:32:26 Rafael David Tinoco attachment added precise_multipath-tools_0.4.9-3ubuntu5.2.debdiff https://bugs.launchpad.net/ubuntu/+source/multipath-tools/+bug/1354114/+attachment/4172257/+files/precise_multipath-tools_0.4.9-3ubuntu5.2.debdiff
2014-08-07 20:32:51 Rafael David Tinoco attachment added trusty_multipath-tools_0.4.9-3ubuntu8.debdiff https://bugs.launchpad.net/ubuntu/+source/multipath-tools/+bug/1354114/+attachment/4172258/+files/trusty_multipath-tools_0.4.9-3ubuntu8.debdiff
2014-08-07 20:33:18 Rafael David Tinoco attachment added utopic_multipath-tools_0.4.9-3ubuntu9.debdiff https://bugs.launchpad.net/ubuntu/+source/multipath-tools/+bug/1354114/+attachment/4172259/+files/utopic_multipath-tools_0.4.9-3ubuntu9.debdiff
2014-08-07 20:36:28 Rafael David Tinoco description
Old value:
[same text as the new value of the 2014-08-07 19:37:55 description entry]
New value:
[Impact]
 * Multipath can cause a segmentation fault due to wrong code and can possibly cause the user to lose access to multipath devices.
[Test Case]
 * Working on it.
[Regression Potential]
 * Patch 1/4 tries to fix the issue. Patch 2/4 fixes the 1/4.
 * Patch 3/4 discovers 1/4 was no good. Patch 4/4 fixes 3/4.
 * Fix based on upstream code (96f8146) + subsequent patches.
 * Followed this code development until the issue was addressed.
[Other Info]
 * Original bug description:
----------------
[same text as the original description, recorded as the old value of the 2014-08-07 19:37:55 description entry]
2014-08-08 15:41:29 Ubuntu Foundations Team Bug Bot tags patch
2014-08-08 15:41:36 Ubuntu Foundations Team Bug Bot bug added subscriber Ubuntu Sponsors Team
2014-08-08 16:04:35 Brian Murray nominated for series Ubuntu Trusty
2014-08-08 16:04:35 Brian Murray bug task added multipath-tools (Ubuntu Trusty)
2014-08-08 16:04:35 Brian Murray nominated for series Ubuntu Precise
2014-08-08 16:04:35 Brian Murray bug task added multipath-tools (Ubuntu Precise)
2014-08-08 16:14:39 Rafael David Tinoco description
Old value:
[same text as the new value of the 2014-08-07 20:36:28 description entry]
New value:
[Impact]
 * Multipath can cause a segmentation fault due to wrong code and can possibly cause the user to lose access to multipath devices.
[Test Case]
 * Use multipath and wait for the problem to occur at some point (inevitable).
[Regression Potential]
 * Patch 1/4 tries to fix the issue. Patch 2/4 fixes the 1/4.
 * Patch 3/4 discovers 1/4 was no good. Patch 4/4 fixes 3/4.
 * Fix based on upstream code (96f8146) + subsequent patches.
 * Followed this code development until the issue was addressed.
[Other Info]
 * Original bug description:
----------------
[same text as the original description, recorded as the old value of the 2014-08-07 19:37:55 description entry]
2014-08-08 19:56:14 Rafael David Tinoco bug watch added http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=757508
2014-08-08 20:30:13 Rafael David Tinoco attachment removed utopic_multipath-tools_0.4.9-3ubuntu9.debdiff https://bugs.launchpad.net/ubuntu/+source/multipath-tools/+bug/1354114/+attachment/4172259/+files/utopic_multipath-tools_0.4.9-3ubuntu9.debdiff
2014-08-08 20:30:40 Rafael David Tinoco attachment removed precise_multipath-tools_0.4.9-3ubuntu5.2.debdiff https://bugs.launchpad.net/ubuntu/+source/multipath-tools/+bug/1354114/+attachment/4172257/+files/precise_multipath-tools_0.4.9-3ubuntu5.2.debdiff
2014-08-08 20:30:49 Rafael David Tinoco attachment removed trusty_multipath-tools_0.4.9-3ubuntu8.debdiff https://bugs.launchpad.net/ubuntu/+source/multipath-tools/+bug/1354114/+attachment/4172258/+files/trusty_multipath-tools_0.4.9-3ubuntu8.debdiff
2014-08-08 20:31:19 Rafael David Tinoco attachment added precise_multipath-tools_0.4.9-3ubuntu5.2.debdiff https://bugs.launchpad.net/ubuntu/+source/multipath-tools/+bug/1354114/+attachment/4173051/+files/precise_multipath-tools_0.4.9-3ubuntu5.2.debdiff
2014-08-08 20:32:13 Rafael David Tinoco attachment added trusty_multipath-tools_0.4.9-3ubuntu8.debdiff https://bugs.launchpad.net/ubuntu/+source/multipath-tools/+bug/1354114/+attachment/4173052/+files/trusty_multipath-tools_0.4.9-3ubuntu8.debdiff
2014-08-08 20:32:38 Rafael David Tinoco attachment added utopic_multipath-tools_0.4.9-3ubuntu9.debdiff https://bugs.launchpad.net/ubuntu/+source/multipath-tools/+bug/1354114/+attachment/4173053/+files/utopic_multipath-tools_0.4.9-3ubuntu9.debdiff
2014-08-29 12:08:35 Rafael David Tinoco bug added subscriber Ubuntu Stable Release Updates Team
2014-09-02 23:06:11 Launchpad Janitor branch linked lp:ubuntu/utopic-proposed/multipath-tools
2014-09-02 23:43:09 Launchpad Janitor multipath-tools (Ubuntu): status Confirmed Fix Released
2014-09-15 12:22:45 Rafael David Tinoco multipath-tools (Ubuntu Trusty): status New Confirmed
2014-09-15 12:22:48 Rafael David Tinoco multipath-tools (Ubuntu Precise): assignee Rafael David Tinoco (inaddy)
2014-09-15 12:22:49 Rafael David Tinoco multipath-tools (Ubuntu Precise): status New Confirmed
2014-09-15 12:22:53 Rafael David Tinoco multipath-tools (Ubuntu Trusty): assignee Rafael David Tinoco (inaddy)
2014-09-23 17:56:36 Marc Deslauriers multipath-tools (Ubuntu Precise): status Confirmed In Progress
2014-09-23 17:56:39 Marc Deslauriers multipath-tools (Ubuntu Trusty): status Confirmed In Progress
2014-09-23 19:00:47 Chris J Arges multipath-tools (Ubuntu Trusty): status In Progress Fix Committed
2014-09-23 19:00:50 Chris J Arges bug added subscriber SRU Verification
2014-09-23 19:00:57 Chris J Arges tags patch patch verification-needed
2014-09-23 19:26:33 Launchpad Janitor branch linked lp:~ubuntu-branches/ubuntu/trusty/multipath-tools/trusty-proposed
2014-09-23 20:00:31 Chris J Arges multipath-tools (Ubuntu Precise): status In Progress Fix Committed
2014-09-23 20:24:18 Launchpad Janitor branch linked lp:ubuntu/precise-proposed/multipath-tools
2014-09-25 15:34:23 Sebastien Bacher removed subscriber Ubuntu Sponsors Team
2014-09-26 20:13:28 Mathew Hodson bug task added multipath-tools (Debian)
2014-09-26 20:13:40 Mathew Hodson multipath-tools (Debian): importance Undecided Unknown
2014-09-26 20:13:40 Mathew Hodson multipath-tools (Debian): status New Unknown
2014-09-26 20:13:40 Mathew Hodson multipath-tools (Debian): remote watch Debian Bug tracker #757508
2014-09-26 23:44:46 Bug Watch Updater multipath-tools (Debian): status Unknown Fix Released
2014-10-06 18:31:16 Mathew Hodson branch linked lp:~inaddy/ubuntu/utopic/multipath-tools/bug-1354114
2014-10-10 23:26:20 Jorge Niedbalski tags patch verification-needed cts patch verification-needed
2014-10-14 19:55:40 Rafael David Tinoco tags cts patch verification-needed cts patch verification-done
2014-10-27 05:03:58 Launchpad Janitor multipath-tools (Ubuntu Precise): status Fix Committed Fix Released
2014-10-27 06:00:38 Mathew Hodson multipath-tools (Ubuntu Trusty): status Fix Committed Fix Released