Activity log for bug #1743637

Date Who What changed Old value New value Message
2018-01-16 19:42:50 Rafael David Tinoco bug added bug
2018-01-16 19:48:57 Rafael David Tinoco qemu (Ubuntu): status New In Progress
2018-01-16 19:49:00 Rafael David Tinoco qemu (Ubuntu): importance Undecided Medium
2018-01-16 19:49:02 Rafael David Tinoco qemu (Ubuntu): assignee Rafael David Tinoco (inaddy)
2018-01-16 19:50:10 Rafael David Tinoco nominated for series Ubuntu Xenial
2018-01-16 19:51:05 Eric Desrochers bug task added qemu (Ubuntu Xenial)
2018-01-16 19:51:21 Rafael David Tinoco qemu (Ubuntu Xenial): status New In Progress
2018-01-16 19:51:25 Rafael David Tinoco qemu (Ubuntu): status In Progress Fix Released
2018-01-16 19:51:28 Rafael David Tinoco qemu (Ubuntu): assignee Rafael David Tinoco (inaddy)
2018-01-16 19:51:30 Rafael David Tinoco qemu (Ubuntu Xenial): assignee Rafael David Tinoco (inaddy)
2018-01-16 19:51:32 Rafael David Tinoco qemu (Ubuntu Xenial): importance Undecided Medium
2018-01-17 23:15:01 Rafael David Tinoco description

Old value:

# BUG Description after dump analysis

- The net_cleanup logic calls vhost_net_stop.
- vhost_net_stop iterates over all vhost networks and stops them one by one.
- The idea is to stop the virtqueue cleanly, releasing resources.
- To stop the virtqueue, vhost has to get the vring base address (by sending a VHOST_USER_GET_VRING_BASE message).
- The char device then reads the base address from the socket.
- If it reads nothing, the qemu tcp channel driver disconnects the socket.
- When the socket is disconnected, vhost_user stops all the queues attached to that vhost_user socket.

From the dump:

We reach the error by disconnecting the charnet2 device. Since the char device has already been disconnected, vhost_user_stop tries to stop all queues, but it accidentally treats all of them the same (and charnet4 is a TAP device, not a VHOST USER one).

#### Logic Error:

Here is the charnet2 data at the time of the error:

  Name : filename (from CharDriverState)
  Details:0x556a934b0a90 "disconnected:unix:/run/openvswitch/vhostuser-vcic"
  Default:0x556a934b0a90 "disconnected:unix:/run/openvswitch/vhostuser-vcic"
  Decimal:93916226062992
  Hex:0x556a934b0a90
  Binary:10101010110101010010011010010110000101010010000
  Octal:02526522322605220

When it realizes the connection is gone, it creates an event:

  qemu_chr_be_event(chr, CHR_EVENT_CLOSED);

Which will call:

  net_vhost_user_event

This last function finds all NetClientState using a pointer called "name". The event originated from device charnet2, but the event callback runs using charnet4, which explains why the bad decision (assert) was made: it asserts that a TAP device is a VHOST_USER one.

#### Possible Fix

There is already a commit upstream that might address this:

  commit c1bf3531aecf4a0ba25bb150dd5fe21edf406c88
  Author: Marc-André Lureau <marcandre.lureau@redhat.com> 2016-02-23 18:10:49
  Committer: Michael S. Tsirkin <mst@redhat.com> 2016-03-11 14:59:12
  Branches: master, origin/HEAD, origin/master, origin/stable-2.10, origin/stable-2.6, origin/stable-2.7, origin/stable-2.8, origin/stable-2.9

  vhost-user: fix use after free

  "name" is freed after visiting options, instead use the first NetClientState name. Adds a few assert() for clarifying and checking some impossible states.

New value:

[Impact]

* vhost-user resources aren't cleaned up on QEMU shutdown
* this can lead to a memory leak (especially bad with hugepages)
* a bad QEMU assertion blocks the cleanup logic

[Test Case]

* run QEMU with vhost-user and stress-test the shutdown
* eventually the faulty logic (race on the variable "name") will be hit

[Regression Potential]

* based on upstream code
* based on core dump analysis
* could make qemu vhost-user virtio nic shutdown even worse

[Other Info]

* Check initial case description:

# BUG Description after dump analysis

- The net_cleanup logic calls vhost_net_stop.
- vhost_net_stop iterates over all vhost networks and stops them one by one.
- The idea is to stop the virtqueue cleanly, releasing resources.
- To stop the virtqueue, vhost has to get the vring base address (by sending a VHOST_USER_GET_VRING_BASE message).
- The char device then reads the base address from the socket.
- If it reads nothing, the qemu tcp channel driver disconnects the socket.
- When the socket is disconnected, vhost_user stops all the queues attached to that vhost_user socket.

From the dump:

We reach the error by disconnecting the charnet2 device. Since the char device has already been disconnected, vhost_user_stop tries to stop all queues, but it accidentally treats all of them the same (and charnet4 is a TAP device, not a VHOST USER one).

#### Logic Error:

Here is the charnet2 data at the time of the error:

  Name : filename (from CharDriverState)
  Details:0x556a934b0a90 "disconnected:unix:/run/openvswitch/vhostuser-vcic"
  Default:0x556a934b0a90 "disconnected:unix:/run/openvswitch/vhostuser-vcic"
  Decimal:93916226062992
  Hex:0x556a934b0a90
  Binary:10101010110101010010011010010110000101010010000
  Octal:02526522322605220

When it realizes the connection is gone, it creates an event:

  qemu_chr_be_event(chr, CHR_EVENT_CLOSED);

Which will call:

  net_vhost_user_event

This last function finds all NetClientState using a pointer called "name". The event originated from device charnet2, but the event callback runs using charnet4, which explains why the bad decision (assert) was made: it asserts that a TAP device is a VHOST_USER one.

#### Possible Fix

There is already a commit upstream that might address this:

  commit c1bf3531aecf4a0ba25bb150dd5fe21edf406c88
  Author: Marc-André Lureau <marcandre.lureau@redhat.com> 2016-02-23 18:10:49
  Committer: Michael S. Tsirkin <mst@redhat.com> 2016-03-11 14:59:12
  Branches: master, origin/HEAD, origin/master, origin/stable-2.10, origin/stable-2.6, origin/stable-2.7, origin/stable-2.8, origin/stable-2.9

  vhost-user: fix use after free

  "name" is freed after visiting options, instead use the first NetClientState name. Adds a few assert() for clarifying and checking some impossible states.
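
The failure mode described above can be illustrated with a small simulation. This is only a sketch under stated assumptions, not QEMU code: `NetClient`, `find_clients_by_name` and `on_chardev_closed` are hypothetical stand-ins for NetClientState, the name lookup, and net_vhost_user_event, and the freed "name" pointer is modelled by a buffer whose contents are overwritten after setup.

```python
from dataclasses import dataclass


@dataclass
class NetClient:
    """Hypothetical stand-in for a NetClientState entry."""
    name: str
    kind: str          # "vhost-user" or "tap"
    running: bool = True


def find_clients_by_name(clients, name):
    """Return every client registered under the given name."""
    return [c for c in clients if c.name == name]


def on_chardev_closed(clients, name):
    """Stop the queues of every client found under `name`.

    Mirrors the assertion from the report: each client found
    is expected to be a vhost-user one.
    """
    stopped = []
    for client in find_clients_by_name(clients, name):
        assert client.kind == "vhost-user", f"{client.name} is a {client.kind} device"
        client.running = False
        stopped.append(client.name)
    return stopped


clients = [NetClient("charnet2", "vhost-user"), NetClient("charnet4", "tap")]

# Simulated stale pointer: the buffer held "charnet2" when the callback
# was registered, then its memory was reused for another string.
name_buf = bytearray(b"charnet2")
name_buf[:] = b"charnet4"  # stands in for free() followed by reuse

try:
    on_chardev_closed(clients, name_buf.decode())
    buggy_outcome = "no assertion"
except AssertionError as exc:
    buggy_outcome = f"assertion failed: {exc}"

# Pattern named in the upstream commit message: use the name stored in
# the first NetClientState instead of the borrowed string.
fixed_outcome = on_chardev_closed(clients, clients[0].name)
```

In the simulated buggy path the lookup lands on the TAP client and the assertion fires. With the name taken from the client itself, only charnet2 is stopped and charnet4 is left alone.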
2018-01-17 23:18:44 Rafael David Tinoco attachment added xenial_qemu_2.5+dfsg-5ubuntu10.17.debdiff https://bugs.launchpad.net/ubuntu/+source/qemu/+bug/1743637/+attachment/5038526/+files/xenial_qemu_2.5+dfsg-5ubuntu10.17.debdiff
2018-01-18 12:22:56 Rafael David Tinoco attachment removed xenial_qemu_2.5+dfsg-5ubuntu10.17.debdiff https://bugs.launchpad.net/ubuntu/+source/qemu/+bug/1743637/+attachment/5038526/+files/xenial_qemu_2.5+dfsg-5ubuntu10.17.debdiff
2018-01-18 12:23:54 Rafael David Tinoco attachment added xenial_qemu_2.5+dfsg-5ubuntu10.17.debdiff https://bugs.launchpad.net/ubuntu/+source/qemu/+bug/1743637/+attachment/5038905/+files/xenial_qemu_2.5+dfsg-5ubuntu10.17.debdiff
2018-01-18 15:32:57 Rafael David Tinoco tags sts
2018-01-18 15:33:12 Eric Desrochers bug added subscriber Eric Desrochers
2018-02-28 14:20:27 Chris J Arges qemu (Ubuntu Xenial): status In Progress Fix Committed
2018-02-28 14:20:29 Chris J Arges bug added subscriber Ubuntu Stable Release Updates Team
2018-02-28 14:20:31 Chris J Arges bug added subscriber SRU Verification
2018-02-28 14:20:34 Chris J Arges tags sts sts verification-needed verification-needed-xenial
2018-03-27 14:47:13 Rafael David Tinoco tags sts verification-needed verification-needed-xenial sts
2018-03-27 15:24:12 Rafael David Tinoco attachment removed xenial_qemu_2.5+dfsg-5ubuntu10.17.debdiff https://bugs.launchpad.net/ubuntu/+source/qemu/+bug/1743637/+attachment/5038905/+files/xenial_qemu_2.5+dfsg-5ubuntu10.17.debdiff
2018-03-27 15:28:34 Rafael David Tinoco attachment added xenial_qemu_2.5+dfsg-5ubuntu10.25.debdiff https://bugs.launchpad.net/ubuntu/+source/qemu/+bug/1743637/+attachment/5092432/+files/xenial_qemu_2.5+dfsg-5ubuntu10.25.debdiff
2018-03-27 15:29:41 Rafael David Tinoco bug added subscriber STS Sponsors
2018-03-27 15:29:47 Rafael David Tinoco bug added subscriber Ubuntu Sponsors Team
2018-03-27 15:29:54 Rafael David Tinoco tags sts sts sts-sponsor
2018-03-28 11:00:49 Rafael David Tinoco qemu (Ubuntu Xenial): status Fix Committed In Progress
2018-03-28 11:00:52 Rafael David Tinoco removed subscriber Ubuntu Sponsors Team
2018-03-28 11:00:55 Rafael David Tinoco removed subscriber STS Sponsors
2018-04-12 20:39:48 Rafael David Tinoco attachment removed xenial_qemu_2.5+dfsg-5ubuntu10.25.debdiff https://bugs.launchpad.net/ubuntu/+source/qemu/+bug/1743637/+attachment/5092432/+files/xenial_qemu_2.5+dfsg-5ubuntu10.25.debdiff
2018-04-12 20:55:02 Rafael David Tinoco attachment added xenial_qemu_2.5+dfsg-5ubuntu10.26.debdiff https://bugs.launchpad.net/ubuntu/+source/qemu/+bug/1743637/+attachment/5112885/+files/xenial_qemu_2.5+dfsg-5ubuntu10.26.debdiff
2018-04-26 11:03:06 Rafael David Tinoco attachment removed xenial_qemu_2.5+dfsg-5ubuntu10.26.debdiff https://bugs.launchpad.net/ubuntu/+source/qemu/+bug/1743637/+attachment/5112885/+files/xenial_qemu_2.5+dfsg-5ubuntu10.26.debdiff
2018-04-26 11:20:17 Rafael David Tinoco attachment added xenial_qemu_2.5+dfsg-5ubuntu10.26.debdiff https://bugs.launchpad.net/ubuntu/+source/qemu/+bug/1743637/+attachment/5127606/+files/xenial_qemu_2.5+dfsg-5ubuntu10.26.debdiff
2018-04-26 11:20:47 Rafael David Tinoco attachment removed xenial_qemu_2.5+dfsg-5ubuntu10.26.debdiff https://bugs.launchpad.net/ubuntu/+source/qemu/+bug/1743637/+attachment/5127606/+files/xenial_qemu_2.5+dfsg-5ubuntu10.26.debdiff
2018-04-26 11:25:52 Rafael David Tinoco attachment added xenial_qemu_2.5+dfsg-5ubuntu10.26.debdiff https://bugs.launchpad.net/ubuntu/+source/qemu/+bug/1743637/+attachment/5127607/+files/xenial_qemu_2.5+dfsg-5ubuntu10.26.debdiff
2018-04-26 11:27:10 Rafael David Tinoco attachment removed xenial_qemu_2.5+dfsg-5ubuntu10.26.debdiff https://bugs.launchpad.net/ubuntu/+source/qemu/+bug/1743637/+attachment/5127607/+files/xenial_qemu_2.5+dfsg-5ubuntu10.26.debdiff
2018-04-26 11:31:19 Rafael David Tinoco attachment added xenial_qemu_2.5+dfsg-5ubuntu10.26.debdiff https://bugs.launchpad.net/ubuntu/+source/qemu/+bug/1743637/+attachment/5127609/+files/xenial_qemu_2.5+dfsg-5ubuntu10.26.debdiff
2018-05-09 12:14:54 Robie Basak qemu (Ubuntu Xenial): status In Progress Fix Committed
2018-05-09 12:15:00 Robie Basak tags sts sts-sponsor sts sts-sponsor verification-needed verification-needed-xenial
2018-05-15 02:19:28 Rafael David Tinoco tags sts sts-sponsor verification-needed verification-needed-xenial sts sts-sponsor verification-done verification-done-xenial
2018-05-16 11:58:04 Launchpad Janitor qemu (Ubuntu Xenial): status Fix Committed Fix Released
2018-05-16 11:58:04 Launchpad Janitor cve linked 2018-7550
2019-03-20 13:13:14 Dan Streetman tags sts sts-sponsor verification-done verification-done-xenial sts verification-done verification-done-xenial