Crash(assert) during reading image from http url through qemu-nbd

Bug #1708442 reported by Andrey Smetanin
6
This bug affects 1 person
Affects Status Importance Assigned to Milestone
QEMU
Expired
Undecided
Unassigned

Bug Description

Description:
During reading image from nbd device mounted by qemu-nbd server with url backend I/O error happens
"blk_update_request: I/O error, dev nbd0, sector 42117" dmesg. After some investigation I found that qemu-nbd server aborts in aio_co_enter() assert in util/async.c:468.

Steps to reproduce:

1) sudo go run qemu-nbd-bug-report/qemu-nbd-bug.go (see qemu-nbd-bug-report.tar.gz)

or try directly

1) qemu-nbd -c /dev/nbd0 -r -v --aio=native -f qcow2 json:{"file.driver":"http","file.url":"http://localhost:9666/image","file.readahead":3276800
2) try read whole nbd device while error "blk_update_request: I/O error, dev nbd0, sector 42117" appears in dmesg

Versions:

1) qemu built from sources(/configure --target-list=x86_64-softmmu --disable-user --enable-curl --enable-linux-aio --enable-virtfs --enable-debug --disable-pie
, top commit 5619c179057e24195ff19c8fe6d6a6cbcb16ed28):

qemu-nbd -v
qemu-nbd 2.9.90 (v2.10.0-rc0-67-g5619c17)

2) libcurl(built from sources, top commit 1767adf4399bb3be29121435e1bb1cc2bc05f7bf):

curl -V
curl 7.55.0-DEV (Linux) libcurl/7.55.0-DEV OpenSSL/1.0.2g zlib/1.2.8

Backtrace:
(gdb) bt
#0 0x00007f7131426428 in __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:54
#1 0x00007f713142802a in __GI_abort () at abort.c:89
#2 0x00007f713141ebd7 in __assert_fail_base (fmt=<optimized out>, assertion=assertion@entry=0x54c924 "self != co",
    file=file@entry=0x54c871 "util/async.c", line=line@entry=468,
    function=function@entry=0x54c980 <__PRETTY_FUNCTION__.24766> "aio_co_enter") at assert.c:92
#3 0x00007f713141ec82 in __GI___assert_fail (assertion=0x54c924 "self != co", file=0x54c871 "util/async.c", line=468,
    function=0x54c980 <__PRETTY_FUNCTION__.24766> "aio_co_enter") at assert.c:101
#4 0x00000000004fe6a2 in aio_co_enter (ctx=0xf0ddb0, co=0xf14650) at util/async.c:468
#5 0x00000000004fe637 in aio_co_wake (co=0xf14650) at util/async.c:456
#6 0x0000000000495c8a in curl_read_cb (ptr=0xf566d9, size=1, nmemb=16135, opaque=0xf1cb90) at block/curl.c:275
#7 0x00007f713242ac24 in Curl_client_chop_write () from /usr/lib/x86_64-linux-gnu/libcurl.so
#8 0x00007f713242ae03 in Curl_client_write () from /usr/lib/x86_64-linux-gnu/libcurl.so
#9 0x00007f713244e1cf in readwrite_data () from /usr/lib/x86_64-linux-gnu/libcurl.so
#10 0x00007f713244eb6f in Curl_readwrite () from /usr/lib/x86_64-linux-gnu/libcurl.so
#11 0x00007f713245c1bb in multi_runsingle () from /usr/lib/x86_64-linux-gnu/libcurl.so
#12 0x00007f713245d819 in multi_socket () from /usr/lib/x86_64-linux-gnu/libcurl.so
#13 0x00007f713245e067 in curl_multi_socket_action () from /usr/lib/x86_64-linux-gnu/libcurl.so
#14 0x0000000000497555 in curl_setup_preadv (bs=0xf16820, acb=0x7f712d379860) at block/curl.c:918
#15 0x00000000004975fb in curl_co_preadv (bs=0xf16820, offset=6556160, bytes=512, qiov=0x7f712d379b40, flags=0) at block/curl.c:935
#16 0x000000000047730f in bdrv_driver_preadv (bs=0xf16820, offset=6556160, bytes=512, qiov=0x7f712d379b40, flags=0) at block/io.c:836
#17 0x0000000000477c1f in bdrv_aligned_preadv (child=0xf1be20, req=0x7f712d379a60, offset=6556160, bytes=512, align=1,
    qiov=0x7f712d379b40, flags=0) at block/io.c:1086
#18 0x0000000000478109 in bdrv_co_preadv (child=0xf1be20, offset=6556160, bytes=512, qiov=0x7f712d379b40, flags=0) at block/io.c:1180
#19 0x0000000000437498 in qcow2_co_preadv (bs=0xf0fdc0, offset=21563904, bytes=512, qiov=0x7f712d379e80, flags=0)
    at block/qcow2.c:1812
#20 0x000000000047730f in bdrv_driver_preadv (bs=0xf0fdc0, offset=21563904, bytes=512, qiov=0x7f712d379e80, flags=0)
    at block/io.c:836
#21 0x0000000000477c1f in bdrv_aligned_preadv (child=0xf1c0d0, req=0x7f712d379d30, offset=21563904, bytes=512, align=1,
    qiov=0x7f712d379e80, flags=0) at block/io.c:1086
#22 0x0000000000478109 in bdrv_co_preadv (child=0xf1c0d0, offset=21563904, bytes=512, qiov=0x7f712d379e80, flags=0)
    at block/io.c:1180
#23 0x00000000004645ad in blk_co_preadv (blk=0xf1be90, offset=21563904, bytes=512, qiov=0x7f712d379e80, flags=0)
    at block/block-backend.c:991
#24 0x00000000004646fa in blk_read_entry (opaque=0x7f712d379ea0) at block/block-backend.c:1038
#25 0x000000000046481c in blk_prw (blk=0xf1be90, offset=21563904,
---Type <return> to continue, or q <return> to quit---
    buf=0xf7f000 "2,NV\241t!\ti\312\vp\364\017Kl*\354\021\a\177\021\260\b\027\212\347\027\004\322\nG\340b\\\306pG\332\313\060\341;\002\360\063L\240\027T \211\341\305\022АE\230\356DǮ}\211\bx\016\a\b\313\350\316\064.\017\372\032-R\376z\261\263\350|cQ<\016S_L\340A\221\366~L#\001+\271\204\065~\327\023\027I\211\343\361\276zT$4\336\273ˏ\353ʪ\234\016_Z|TMk\"\370\002\363~\334\332.\a\375\265mӌ{/%\304֎\374sF<E\371\031o&\202\217\226\276>I\356\302\375F\340\332\324\021\202\232>\026\261\233\303tv\023\304\006\243\037\062BϏ\b\324rs\360'"..., bytes=512, co_entry=0x4646aa <blk_read_entry>, flags=0) at block/block-backend.c:1074
#26 0x0000000000464f81 in blk_pread (blk=0xf1be90, offset=21563904, buf=0xf7f000, count=512) at block/block-backend.c:1227
#27 0x00000000004906cb in nbd_trip (opaque=0xf5a940) at nbd/server.c:1380
#28 0x000000000051c0a5 in coroutine_trampoline (i0=15812176, i1=0) at util/coroutine-ucontext.c:79
#29 0x00007f713143b5d0 in ?? () from /lib/x86_64-linux-gnu/libc.so.6
#30 0x00007f712d47a770 in ?? ()
#31 0x0000000000000000 in ?? ()
Backtrace stopped: Cannot access memory at address 0x7f712d37a000

Revision history for this message
Andrey Smetanin (asmetanin) wrote :
description: updated
Revision history for this message
Eric Blake (eblake) wrote : Re: [Qemu-devel] [Bug 1708442] [NEW] Crash(assert) during reading image from http url through qemu-nbd
Download full text (3.7 KiB)

On 08/03/2017 07:12 AM, Andrey Smetanin wrote:
> Public bug reported:
>
> Description:
> During reading image from nbd device mounted by qemu-nbd server with url backend I/O error happens
> "blk_update_request: I/O error, dev nbd0, sector 42117" dmesg. After some investigation I found that qemu-nbd server aborts in aio_co_enter() assert in util/async.c:468.
>

Based on the backtrace, this looks to be a bug in the block/curl.c
driver, rather than the nbd/ or block/nbd.c code. If I'm right, it
should be possible to reproduce the crash using qemu-io directly on the
curl path, rather than adding the extra layer of an nbd client reading
through qemu-nbd (then again, having the qemu-nbd layer may be what is
allowing multiple parallel requests to hit the curl driver at once,
while qemu-io is not quite as easy to provoke into performing
complicated access patterns).

>
> Steps to reproduce:
>
> 1) sudo go run qemu-nbd-bug-report/qemu-nbd-bug.go (see qemu-nbd-bug-
> report.tar.gz)
>
> or try directly
>
> 1) qemu-nbd -c /dev/nbd0 -r -v --aio=native -f qcow2 json:{"file.driver":"http","file.url":"http://localhost:9666/image","file.readahead":3276800

Presumably, you've got something serving the file at port 9666?

> 2) try read whole nbd device while error in dmesg appears x
>
> Versions:
>
> 1) qemu built from sources(/configure --target-list=x86_64-softmmu --disable-user --enable-curl --enable-linux-aio --enable-virtfs --enable-debug --disable-pie
> , top commit 5619c179057e24195ff19c8fe6d6a6cbcb16ed28):
>
> qemu-nbd -v
> qemu-nbd 2.9.90 (v2.10.0-rc0-67-g5619c17)
>
> 2) libcurl(built from sources, top commit
> 1767adf4399bb3be29121435e1bb1cc2bc05f7bf):
>
> curl -V
> curl 7.55.0-DEV (Linux) libcurl/7.55.0-DEV OpenSSL/1.0.2g zlib/1.2.8
>
>
> Backtrace:
> (gdb) bt
> #0 0x00007f7131426428 in __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:54
> #1 0x00007f713142802a in __GI_abort () at abort.c:89
> #2 0x00007f713141ebd7 in __assert_fail_base (fmt=<optimized out>, assertion=assertion@entry=0x54c924 "self != co",
> file=file@entry=0x54c871 "util/async.c", line=line@entry=468,
> function=function@entry=0x54c980 <__PRETTY_FUNCTION__.24766> "aio_co_enter") at assert.c:92
> #3 0x00007f713141ec82 in __GI___assert_fail (assertion=0x54c924 "self != co", file=0x54c871 "util/async.c", line=468,
> function=0x54c980 <__PRETTY_FUNCTION__.24766> "aio_co_enter") at assert.c:101
> #4 0x00000000004fe6a2 in aio_co_enter (ctx=0xf0ddb0, co=0xf14650) at util/async.c:468
> #5 0x00000000004fe637 in aio_co_wake (co=0xf14650) at util/async.c:456
> #6 0x0000000000495c8a in curl_read_cb (ptr=0xf566d9, size=1, nmemb=16135, opaque=0xf1cb90) at block/curl.c:275
> #7 0x00007f713242ac24 in Curl_client_chop_write () from /usr/lib/x86_64-linux-gnu/libcurl.so
> #8 0x00007f713242ae03 in Curl_client_write () from /usr/lib/x86_64-linux-gnu/libcurl.so
> #9 0x00007f713244e1cf in readwrite_data () from /usr/lib/x86_64-linux-gnu/libcurl.so
> #10 0x00007f713244eb6f in Curl_readwrite () from /usr/lib/x86_64-linux-gnu/libcurl.so
> #11 0x00007f713245c1bb in multi_runsingle () from /usr/lib/x86_64-linux-gnu/libcurl.so
> #12 0x00007f713245d819 in ...

Read more...

Revision history for this message
Andrey Smetanin (asmetanin) wrote :
Download full text (9.8 KiB)

03.08.2017, 17:01, "Eric Blake" <email address hidden>:
> On 08/03/2017 07:12 AM, Andrey Smetanin wrote:
>>  Public bug reported:
>>
>>  Description:
>>  During reading image from nbd device mounted by qemu-nbd server with url backend I/O error happens
>>  "blk_update_request: I/O error, dev nbd0, sector 42117" dmesg. After some investigation I found that qemu-nbd server aborts in aio_co_enter() assert in util/async.c:468.
>
> Based on the backtrace, this looks to be a bug in the block/curl.c
> driver, rather than the nbd/ or block/nbd.c code. If I'm right, it
> should be possible to reproduce the crash using qemu-io directly on the
> curl path, rather than adding the extra layer of an nbd client reading
> through qemu-nbd (then again, having the qemu-nbd layer may be what is
> allowing multiple parallel requests to hit the curl driver at once,
> while qemu-io is not quite as easy to provoke into performing
> complicated access patterns).
>
>>  Steps to reproduce:
>>
>>  1) sudo go run qemu-nbd-bug-report/qemu-nbd-bug.go (see qemu-nbd-bug-
>>  report.tar.gz)
>>
>>  or try directly
>>
>>  1) qemu-nbd -c /dev/nbd0 -r -v --aio=native -f qcow2 json:{"file.driver":"http","file.url":"http://localhost:9666/image","file.readahead":3276800
>
> Presumably, you've got something serving the file at port 9666?
Yes, you are right. I'm using qemu-nbd-bug.go(see qemu-nbd-bug-report.tar.gz) script which does it.
>
>>  2) try read whole nbd device while error in dmesg appears x
>>
>>  Versions:
>>
>>  1) qemu built from sources(/configure --target-list=x86_64-softmmu --disable-user --enable-curl --enable-linux-aio --enable-virtfs --enable-debug --disable-pie
>>  , top commit 5619c179057e24195ff19c8fe6d6a6cbcb16ed28):
>>
>>  qemu-nbd -v
>>  qemu-nbd 2.9.90 (v2.10.0-rc0-67-g5619c17)
>>
>>  2) libcurl(built from sources, top commit
>>  1767adf4399bb3be29121435e1bb1cc2bc05f7bf):
>>
>>  curl -V
>>  curl 7.55.0-DEV (Linux) libcurl/7.55.0-DEV OpenSSL/1.0.2g zlib/1.2.8
>>
>>  Backtrace:
>>  (gdb) bt
>>  #0 0x00007f7131426428 in __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:54
>>  #1 0x00007f713142802a in __GI_abort () at abort.c:89
>>  #2 0x00007f713141ebd7 in __assert_fail_base (fmt=<optimized out>, assertion=assertion@entry=0x54c924 "self != co",
>>      file=file@entry=0x54c871 "util/async.c", line=line@entry=468,
>>      function=function@entry=0x54c980 <__PRETTY_FUNCTION__.24766> "aio_co_enter") at assert.c:92
>>  #3 0x00007f713141ec82 in __GI___assert_fail (assertion=0x54c924 "self != co", file=0x54c871 "util/async.c", line=468,
>>      function=0x54c980 <__PRETTY_FUNCTION__.24766> "aio_co_enter") at assert.c:101
>>  #4 0x00000000004fe6a2 in aio_co_enter (ctx=0xf0ddb0, co=0xf14650) at util/async.c:468
>>  #5 0x00000000004fe637 in aio_co_wake (co=0xf14650) at util/async.c:456
>>  #6 0x0000000000495c8a in curl_read_cb (ptr=0xf566d9, size=1, nmemb=16135, opaque=0xf1cb90) at block/curl.c:275
>>  #7 0x00007f713242ac24 in Curl_client_chop_write () from /usr/lib/x86_64-linux-gnu/libcurl.so
>>  #8 0x00007f713242ae03 in Curl_client_write () from /usr/lib/x86_64-linux-gnu/libcurl.so
>>  #9 0x00007f713244e1cf in readwrite_data () from /usr/lib/x8...

Revision history for this message
Andrey Smetanin (asmetanin) wrote :

I've build qemu-nbd-bug.go into binary qemu-nbd-bug (go build qemu-nbd-bug.go), so to reproduce bug anyone may try:
0) ensure image.img exists in current folder. image.img may be copied from "Qemu-nbd core dump and script to reproduce " attachment.
1) sudo ./qemu-nbd-bug
2
) wait while binary aborting(panicking)
3) check dmesg for error like "blk_update_request: I/O error, dev nbd0, sector 42117"
4) check qemu-nbd core dump (since it's aborting due to assert in aio_co_enter(), util/async.c:468).

Revision history for this message
Andrey Smetanin (asmetanin) wrote :

I've attached patch which resolves bug, but actually it's just workaround and may be not unsafe, because it's just ignore assert.

Revision history for this message
Andrey Smetanin (asmetanin) wrote :

ping

Revision history for this message
Thomas Huth (th-huth) wrote :

Looking through old bug tickets... is this still an issue with the latest version of QEMU? Or could we close this ticket nowadays?

Changed in qemu:
status: New → Incomplete
Revision history for this message
Launchpad Janitor (janitor) wrote :

[Expired for QEMU because there has been no activity for 60 days.]

Changed in qemu:
status: Incomplete → Expired
To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.