2014-08-07 18:01:28 |
Rafael David Tinoco |
bug |
|
|
added bug |
2014-08-07 19:13:36 |
Rafael David Tinoco |
summary |
Precise multipath segmentation Fault |
multipath segmentation Fault (libmultipath: update waiter handling) |
|
2014-08-07 19:25:30 |
Rafael David Tinoco |
attachment added |
|
precise_multipath-tools_0.4.9-3ubuntu5.2.debdiff https://bugs.launchpad.net/ubuntu/+source/multipath-tools/+bug/1354114/+attachment/4172183/+files/precise_multipath-tools_0.4.9-3ubuntu5.2.debdiff |
|
2014-08-07 19:25:49 |
Rafael David Tinoco |
attachment added |
|
trusty_multipath-tools_0.4.9-3ubuntu8.debdiff https://bugs.launchpad.net/ubuntu/+source/multipath-tools/+bug/1354114/+attachment/4172184/+files/trusty_multipath-tools_0.4.9-3ubuntu8.debdiff |
|
2014-08-07 19:26:07 |
Rafael David Tinoco |
attachment added |
|
utopic_multipath-tools_0.4.9-3ubuntu9.debdiff https://bugs.launchpad.net/ubuntu/+source/multipath-tools/+bug/1354114/+attachment/4172185/+files/utopic_multipath-tools_0.4.9-3ubuntu9.debdiff |
|
2014-08-07 19:29:01 |
Rafael David Tinoco |
multipath-tools (Ubuntu): assignee |
|
Rafael David Tinoco (inaddy) |
|
2014-08-07 19:29:03 |
Rafael David Tinoco |
multipath-tools (Ubuntu): status |
New |
Confirmed |
|
2014-08-07 19:37:55 |
Rafael David Tinoco |
description |
The following situation with multipathd was brought to my attention (~inaddy):
#####
Program terminated with signal 6, Aborted.
#0 0x00007fbc6ae09445 in raise () from /lib/x86_64-linux-gnu/libc.so.6
(gdb) bt
#0 0x00007fbc6ae09445 in raise () from /lib/x86_64-linux-gnu/libc.so.6
#1 0x00007fbc6ae0cbab in abort () from /lib/x86_64-linux-gnu/libc.so.6
#2 0x00007fbc6ae0210e in ?? () from /lib/x86_64-linux-gnu/libc.so.6
#3 0x00007fbc6ae021b2 in __assert_fail () from /lib/x86_64-linux-gnu/libc.so.6
#4 0x00007fbc6b849efb in pthread_mutex_lock () from /lib/x86_64-linux-gnu/libpthread.so.0
#5 0x00007fbc6b1cba5f in free_waiter (data=0x1691de0) at waiter.c:44
#6 0x00007fbc6b1cc25a in waitevent (et=0x1691de0) at waiter.c:204
#7 0x00007fbc6b847e9a in start_thread () from /lib/x86_64-linux-gnu/libpthread.so.0
#8 0x00007fbc6aec54bd in clone () from /lib/x86_64-linux-gnu/libc.so.6
#9 0x0000000000000000 in ?? ()
--------------------------------------------------------------------------------------------
#5 0x00007fbc6b1cba5f in free_waiter (data=0x1691de0) at waiter.c:44
44 lock(wp->vecs->lock);
(gdb) print wp->vecs->lock
$1 = {mutex = 0x168c280, depth = 1}
In pthread_mutex_lock.c:62 there's an assert that fails:
#4 0x00007fbc6b849efb in __pthread_mutex_lock (mutex=0xfefefefefefefeff) at pthread_mutex_lock.c:62
62 assert (mutex->__data.__owner == 0);
In this run:
(gdb) p *wp->vecs->lock->mutex
$3 = {__data = {__lock = 1, __count = 0, __owner = 49, __nusers = 0, __kind = 0, __spins = 0, __list = {__prev = 0x0, __next = 0xffffffff}},
__size = "\001\000\000\000\000\000\000\000\061", '\000' <repeats 23 times>, "\377\377\377\377\000\000\000", __align = 1}
so __owner is 49 and not 0.
Note that 49 is somewhat strange; it's expected to be a pid_t obtained via
pid_t id = THREAD_GETMEM (THREAD_SELF, tid);
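The raw __size bytes in the dump above hint at why 49 looks odd. What follows is a minimal sketch, assuming the x86_64 glibc layout in which __lock and __count occupy the first eight bytes, so that __owner starts at byte offset 8 (little-endian). The helper owner_from_raw and the array dump_prefix are hypothetical names used only for illustration; they are not part of glibc or multipath-tools.

```c
#include <assert.h>
#include <string.h>

/* Assumed layout (x86_64 glibc): int __lock at offset 0,
 * unsigned int __count at offset 4, int __owner at offset 8. */
#define OWNER_OFFSET 8

/* Hypothetical helper: read the owner field out of a raw mutex image. */
static int owner_from_raw(const unsigned char *raw)
{
    int owner;
    memcpy(&owner, raw + OWNER_OFFSET, sizeof(owner));
    return owner;
}

/* First 12 bytes of the __size dump:
 * "\001\000\000\000\000\000\000\000\061" followed by zero bytes. */
static const unsigned char dump_prefix[12] = {
    0x01, 0, 0, 0, 0, 0, 0, 0, 061, 0, 0, 0
};
```

On a little-endian machine owner_from_raw(dump_prefix) evaluates to 49, which is also the ASCII code of the character '1' (octal \061). Such a value is unlikely to be the thread ID of a multipathd thread, and it would be consistent with the mutex memory having been freed and reused for other data.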
According to https://bugzilla.redhat.com/show_bug.cgi?id=570278 , this assertion failure could be expected behaviour if, for some reason, the multipath code was trying to release a mutex that had already been freed.
The multipath-tools package is up to date (0.4.9-3ubuntu5).
I did not find anything obviously related in http://git.opensvc.com/gitweb.cgi?p=multipath-tools%2F.git , except maybe
http://git.opensvc.com/gitweb.cgi?p=multipath-tools/.git;a=commitdiff;h=5ee9f716549d913aeb9800041f78ee9c6a50d860
#####
Between Precise's version and upstream, the following patches touch waiter.c:
d887f4a = signal waiter thread to stop waiting on dm events
5ee9f71 = simplify multipath signal handlers
af4fd6d = Fix race condition in stop_waiter_thread()
e1fcc59 = multipath: clean up code for stopping the waiter threads
03ec4ef = multipath: fix shutdown crashes
4dfdaf2 = multipath: Update multipath device on show topology
c301a3f = Race condition when calling stop_waiter_thread()
96f8146 = libmultipath: update waiter handling
One of them in particular, 96f8146 (libmultipath: update waiter handling), says:
"""
The current 'waiter' structure accesses fields which belong
to the main 'mpp' structure, which has a totally different
lifetime.
"""
This shows that, because the two structures have different lifetimes, use-after-free segfaults are possible, which seems to be what is happening here.
waiter.c:44 = lock(wp->vecs->lock); |
[Impact]
* Multipath can segfault because of a bug in its code, which can
possibly cause the user to lose access to multipath devices.
[Test Case]
* Working on it.
[Regression Potential]
* Fix based on upstream code (96f8146), which is already functioning in tag 0.5.0.
* Introduces a mutex, logic to deal with an already dead pthread, and another
way to access the same data (instead of accessing a structure with a
different lifetime).
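The last point can be sketched in isolation. This is not the upstream patch: struct map, struct waiter, waiter_new, waiter_free and demo_waiter_outlives_map are hypothetical names, and the sketch only assumes that the waiter needs the map name after the map itself may have been freed. The idea is that the waiter keeps a private copy of that data instead of a pointer into a structure with a different lifetime.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define NAME_SIZE 128

/* Hypothetical stand-in for the main per-map structure. */
struct map {
    char name[NAME_SIZE];
};

/* Hypothetical waiter: owns a private copy of what it needs. */
struct waiter {
    char mapname[NAME_SIZE];
};

/* Create a waiter by copying the name out of the map. */
static struct waiter *waiter_new(const struct map *m)
{
    struct waiter *w = calloc(1, sizeof(*w));
    if (!w)
        return NULL;
    snprintf(w->mapname, sizeof(w->mapname), "%s", m->name);
    return w;
}

static void waiter_free(struct waiter *w)
{
    free(w);
}

/* Returns 1 if the waiter is still usable after the map is freed. */
static int demo_waiter_outlives_map(void)
{
    struct map *m = calloc(1, sizeof(*m));
    if (!m)
        return -1;
    strcpy(m->name, "mpath0");

    struct waiter *w = waiter_new(m);
    free(m);                     /* the map goes away first */
    if (!w)
        return -1;

    /* Only the waiter's own copy is read; nothing dangles. */
    int ok = strcmp(w->mapname, "mpath0") == 0;
    waiter_free(w);
    return ok;
}
```

Because the waiter never dereferences the freed map, the order in which the two objects are destroyed no longer matters. The real fix also has to synchronise with a waiter thread that may already have exited, which this sketch does not cover.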
[Other Info]
* Original bug description: identical to the first description recorded in this log (2014-08-07 19:37:55). |
|
2014-08-07 20:03:34 |
Rafael David Tinoco |
attachment removed |
utopic_multipath-tools_0.4.9-3ubuntu9.debdiff https://bugs.launchpad.net/ubuntu/+source/multipath-tools/+bug/1354114/+attachment/4172185/+files/utopic_multipath-tools_0.4.9-3ubuntu9.debdiff |
|
|
2014-08-07 20:03:46 |
Rafael David Tinoco |
attachment removed |
trusty_multipath-tools_0.4.9-3ubuntu8.debdiff https://bugs.launchpad.net/ubuntu/+source/multipath-tools/+bug/1354114/+attachment/4172184/+files/trusty_multipath-tools_0.4.9-3ubuntu8.debdiff |
|
|
2014-08-07 20:03:55 |
Rafael David Tinoco |
attachment removed |
precise_multipath-tools_0.4.9-3ubuntu5.2.debdiff https://bugs.launchpad.net/ubuntu/+source/multipath-tools/+bug/1354114/+attachment/4172183/+files/precise_multipath-tools_0.4.9-3ubuntu5.2.debdiff |
|
|
2014-08-07 20:32:26 |
Rafael David Tinoco |
attachment added |
|
precise_multipath-tools_0.4.9-3ubuntu5.2.debdiff https://bugs.launchpad.net/ubuntu/+source/multipath-tools/+bug/1354114/+attachment/4172257/+files/precise_multipath-tools_0.4.9-3ubuntu5.2.debdiff |
|
2014-08-07 20:32:51 |
Rafael David Tinoco |
attachment added |
|
trusty_multipath-tools_0.4.9-3ubuntu8.debdiff https://bugs.launchpad.net/ubuntu/+source/multipath-tools/+bug/1354114/+attachment/4172258/+files/trusty_multipath-tools_0.4.9-3ubuntu8.debdiff |
|
2014-08-07 20:33:18 |
Rafael David Tinoco |
attachment added |
|
utopic_multipath-tools_0.4.9-3ubuntu9.debdiff https://bugs.launchpad.net/ubuntu/+source/multipath-tools/+bug/1354114/+attachment/4172259/+files/utopic_multipath-tools_0.4.9-3ubuntu9.debdiff |
|
2014-08-07 20:36:28 |
Rafael David Tinoco |
description |
[Impact]
* Multipath can segfault because of a bug in its code, which can
possibly cause the user to lose access to multipath devices.
[Test Case]
* Working on it.
[Regression Potential]
* Fix based on upstream code (96f8146), which is already functioning in tag 0.5.0.
* Introduces a mutex, logic to deal with an already dead pthread, and another
way to access the same data (instead of accessing a structure with a
different lifetime).
[Other Info]
* Original bug description: identical to the first description recorded in this log (2014-08-07 19:37:55). |
[Impact]
* Multipath can segfault because of a bug in its code, which can
possibly cause the user to lose access to multipath devices.
[Test Case]
* Working on it.
[Regression Potential]
* Patch 1/4 tries to fix the issue. Patch 2/4 fixes 1/4.
* Patch 3/4 follows the discovery that 1/4 was no good. Patch 4/4 fixes 3/4.
* Fix based on upstream code (96f8146) + subsequent patches.
* Followed this code development until the issue was addressed.
[Other Info]
* Original bug description: identical to the first description recorded in this log (2014-08-07 19:37:55). |
|
2014-08-08 15:41:29 |
Ubuntu Foundations Team Bug Bot |
tags |
|
patch |
|
2014-08-08 15:41:36 |
Ubuntu Foundations Team Bug Bot |
bug |
|
|
added subscriber Ubuntu Sponsors Team |
2014-08-08 16:04:35 |
Brian Murray |
nominated for series |
|
Ubuntu Trusty |
|
2014-08-08 16:04:35 |
Brian Murray |
bug task added |
|
multipath-tools (Ubuntu Trusty) |
|
2014-08-08 16:04:35 |
Brian Murray |
nominated for series |
|
Ubuntu Precise |
|
2014-08-08 16:04:35 |
Brian Murray |
bug task added |
|
multipath-tools (Ubuntu Precise) |
|
2014-08-08 16:14:39 |
Rafael David Tinoco |
description |
[Impact]
* Multipath can segfault because of a bug in its code, which can
possibly cause the user to lose access to multipath devices.
[Test Case]
* Working on it.
[Regression Potential]
* Patch 1/4 tries to fix the issue. Patch 2/4 fixes 1/4.
* Patch 3/4 follows the discovery that 1/4 was no good. Patch 4/4 fixes 3/4.
* Fix based on upstream code (96f8146) + subsequent patches.
* Followed this code development until the issue was addressed.
[Other Info]
* Original bug description: identical to the first description recorded in this log (2014-08-07 19:37:55). |
[Impact]
* Multipath can segfault because of a bug in its code, which can
possibly cause the user to lose access to multipath devices.
[Test Case]
* Use multipath and wait for the problem to occur at some point (inevitable).
[Regression Potential]
* Patch 1/4 tries to fix the issue. Patch 2/4 fixes 1/4.
* Patch 3/4 follows the discovery that 1/4 was no good. Patch 4/4 fixes 3/4.
* Fix based on upstream code (96f8146) + subsequent patches.
* Followed this code development until the issue was addressed.
[Other Info]
* Original bug description: identical to the first description recorded in this log (2014-08-07 19:37:55). |
|
2014-08-08 19:56:14 |
Rafael David Tinoco |
bug watch added |
|
http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=757508 |
|
2014-08-08 20:30:13 |
Rafael David Tinoco |
attachment removed |
utopic_multipath-tools_0.4.9-3ubuntu9.debdiff https://bugs.launchpad.net/ubuntu/+source/multipath-tools/+bug/1354114/+attachment/4172259/+files/utopic_multipath-tools_0.4.9-3ubuntu9.debdiff |
|
|
2014-08-08 20:30:40 |
Rafael David Tinoco |
attachment removed |
precise_multipath-tools_0.4.9-3ubuntu5.2.debdiff https://bugs.launchpad.net/ubuntu/+source/multipath-tools/+bug/1354114/+attachment/4172257/+files/precise_multipath-tools_0.4.9-3ubuntu5.2.debdiff |
|
|
2014-08-08 20:30:49 |
Rafael David Tinoco |
attachment removed |
trusty_multipath-tools_0.4.9-3ubuntu8.debdiff https://bugs.launchpad.net/ubuntu/+source/multipath-tools/+bug/1354114/+attachment/4172258/+files/trusty_multipath-tools_0.4.9-3ubuntu8.debdiff |
|
|
2014-08-08 20:31:19 |
Rafael David Tinoco |
attachment added |
|
precise_multipath-tools_0.4.9-3ubuntu5.2.debdiff https://bugs.launchpad.net/ubuntu/+source/multipath-tools/+bug/1354114/+attachment/4173051/+files/precise_multipath-tools_0.4.9-3ubuntu5.2.debdiff |
|
2014-08-08 20:32:13 |
Rafael David Tinoco |
attachment added |
|
trusty_multipath-tools_0.4.9-3ubuntu8.debdiff https://bugs.launchpad.net/ubuntu/+source/multipath-tools/+bug/1354114/+attachment/4173052/+files/trusty_multipath-tools_0.4.9-3ubuntu8.debdiff |
|
2014-08-08 20:32:38 |
Rafael David Tinoco |
attachment added |
|
utopic_multipath-tools_0.4.9-3ubuntu9.debdiff https://bugs.launchpad.net/ubuntu/+source/multipath-tools/+bug/1354114/+attachment/4173053/+files/utopic_multipath-tools_0.4.9-3ubuntu9.debdiff |
|
2014-08-29 12:08:35 |
Rafael David Tinoco |
bug |
|
|
added subscriber Ubuntu Stable Release Updates Team |
2014-09-02 23:06:11 |
Launchpad Janitor |
branch linked |
|
lp:ubuntu/utopic-proposed/multipath-tools |
|
2014-09-02 23:43:09 |
Launchpad Janitor |
multipath-tools (Ubuntu): status |
Confirmed |
Fix Released |
|
2014-09-15 12:22:45 |
Rafael David Tinoco |
multipath-tools (Ubuntu Trusty): status |
New |
Confirmed |
|
2014-09-15 12:22:48 |
Rafael David Tinoco |
multipath-tools (Ubuntu Precise): assignee |
|
Rafael David Tinoco (inaddy) |
|
2014-09-15 12:22:49 |
Rafael David Tinoco |
multipath-tools (Ubuntu Precise): status |
New |
Confirmed |
|
2014-09-15 12:22:53 |
Rafael David Tinoco |
multipath-tools (Ubuntu Trusty): assignee |
|
Rafael David Tinoco (inaddy) |
|
2014-09-23 17:56:36 |
Marc Deslauriers |
multipath-tools (Ubuntu Precise): status |
Confirmed |
In Progress |
|
2014-09-23 17:56:39 |
Marc Deslauriers |
multipath-tools (Ubuntu Trusty): status |
Confirmed |
In Progress |
|
2014-09-23 19:00:47 |
Chris J Arges |
multipath-tools (Ubuntu Trusty): status |
In Progress |
Fix Committed |
|
2014-09-23 19:00:50 |
Chris J Arges |
bug |
|
|
added subscriber SRU Verification |
2014-09-23 19:00:57 |
Chris J Arges |
tags |
patch |
patch verification-needed |
|
2014-09-23 19:26:33 |
Launchpad Janitor |
branch linked |
|
lp:~ubuntu-branches/ubuntu/trusty/multipath-tools/trusty-proposed |
|
2014-09-23 20:00:31 |
Chris J Arges |
multipath-tools (Ubuntu Precise): status |
In Progress |
Fix Committed |
|
2014-09-23 20:24:18 |
Launchpad Janitor |
branch linked |
|
lp:ubuntu/precise-proposed/multipath-tools |
|
2014-09-25 15:34:23 |
Sebastien Bacher |
removed subscriber Ubuntu Sponsors Team |
|
|
|
2014-09-26 20:13:28 |
Mathew Hodson |
bug task added |
|
multipath-tools (Debian) |
|
2014-09-26 20:13:40 |
Mathew Hodson |
multipath-tools (Debian): importance |
Undecided |
Unknown |
|
2014-09-26 20:13:40 |
Mathew Hodson |
multipath-tools (Debian): status |
New |
Unknown |
|
2014-09-26 20:13:40 |
Mathew Hodson |
multipath-tools (Debian): remote watch |
|
Debian Bug tracker #757508 |
|
2014-09-26 23:44:46 |
Bug Watch Updater |
multipath-tools (Debian): status |
Unknown |
Fix Released |
|
2014-10-06 18:31:16 |
Mathew Hodson |
branch linked |
|
lp:~inaddy/ubuntu/utopic/multipath-tools/bug-1354114 |
|
2014-10-10 23:26:20 |
Jorge Niedbalski |
tags |
patch verification-needed |
cts patch verification-needed |
|
2014-10-14 19:55:40 |
Rafael David Tinoco |
tags |
cts patch verification-needed |
cts patch verification-done |
|
2014-10-27 05:03:58 |
Launchpad Janitor |
multipath-tools (Ubuntu Precise): status |
Fix Committed |
Fix Released |
|
2014-10-27 06:00:38 |
Mathew Hodson |
multipath-tools (Ubuntu Trusty): status |
Fix Committed |
Fix Released |
|