On Mon, Oct 07, 2019 at 01:06:20PM +0200, Paolo Bonzini wrote:
> On 02/10/19 11:23, Jan Glauber wrote:
> > I've looked into this on ThunderX2. The arm64 code generated for the
> > atomic_[add|sub] accesses of ctx->notify_me doesn't contain any
> > memory barriers. It is just plain ldaxr/stlxr.
> >
> > From my understanding this is not sufficient for SMP sync.
> >
> > If I read this comment correctly:
> >
> > void aio_notify(AioContext *ctx)
> > {
> >     /* Write e.g. bh->scheduled before reading ctx->notify_me.  Pairs
> >      * with atomic_or in aio_ctx_prepare or atomic_add in aio_poll.
> >      */
> >     smp_mb();
> >     if (ctx->notify_me) {
> >
> > it points out that the smp_mb() should be paired. But as I said,
> > the atomics used don't generate any barriers at all.
>
> Based on the rest of the thread, this patch should also fix the bug:
>
> diff --git a/util/async.c b/util/async.c
> index 47dcbfa..721ea53 100644
> --- a/util/async.c
> +++ b/util/async.c
> @@ -249,7 +249,7 @@ aio_ctx_check(GSource *source)
>      aio_notify_accept(ctx);
>
>      for (bh = ctx->first_bh; bh; bh = bh->next) {
> -        if (bh->scheduled) {
> +        if (atomic_mb_read(&bh->scheduled)) {
>              return true;
>          }
>      }
>
> Also, the memory barrier in aio_notify can actually be replaced
> with a SEQ_CST load:
>
> diff --git a/util/async.c b/util/async.c
> index 47dcbfa..721ea53 100644
> --- a/util/async.c
> +++ b/util/async.c
> @@ -349,11 +349,11 @@ LinuxAioState *aio_get_linux_aio(AioContext *ctx)
>
>  void aio_notify(AioContext *ctx)
>  {
> -    /* Write e.g. bh->scheduled before reading ctx->notify_me.  Pairs
> -     * with atomic_or in aio_ctx_prepare or atomic_add in aio_poll.
> +    /* Using atomic_mb_read ensures that e.g. bh->scheduled is written before
> +     * ctx->notify_me is read.  Pairs with atomic_or in aio_ctx_prepare or
> +     * atomic_add in aio_poll.
>       */
> -    smp_mb();
> -    if (ctx->notify_me) {
> +    if (atomic_mb_read(&ctx->notify_me)) {
>          event_notifier_set(&ctx->notifier);
>          atomic_mb_set(&ctx->notified, true);
>      }
>
> Would you be able to test these (one by one, possibly)?

Paolo, I tried them both, separately and together, on a Hi1620 system; each
time it hung in the first iteration.
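(For anyone skimming the thread: the pairing that the quoted comment describes
is the usual store-buffering pattern. Each side stores its own flag and then
needs full-barrier ordering before it loads the other side's flag; otherwise
both sides can miss each other's store and the wakeup is lost. Below is a
rough stand-alone C11 model of that pattern, with made-up names rather than
the real QEMU code, just to illustrate what the smp_mb()/atomic_mb_read
comments are getting at:)

    /* Illustrative model only: simplified, hypothetical names, not QEMU code. */
    #include <stdatomic.h>

    atomic_int scheduled;   /* stands in for bh->scheduled  */
    atomic_int notify_me;   /* stands in for ctx->notify_me */

    void notify_side(void)  /* models aio_notify() */
    {
        atomic_store_explicit(&scheduled, 1, memory_order_relaxed);
        atomic_thread_fence(memory_order_seq_cst);   /* the smp_mb() */
        if (atomic_load_explicit(&notify_me, memory_order_relaxed)) {
            /* event_notifier_set(): wake the poller */
        }
    }

    void poll_side(void)    /* models aio_poll()/aio_ctx_prepare() */
    {
        atomic_fetch_add_explicit(&notify_me, 1, memory_order_seq_cst);
        atomic_thread_fence(memory_order_seq_cst);   /* must pair with the fence above */
        if (!atomic_load_explicit(&scheduled, memory_order_relaxed)) {
            /* block in ppoll() with no timeout: hangs if the notification was missed */
        }
    }

With both fences in place, at least one side is guaranteed to see the other's
store. If either side's ordering is weaker than a full barrier, which is what
Jan suspects happens with the arm64 ldaxr/stlxr sequences, the poller can park
in ppoll() forever, which matches the hang below.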
Here's a backtrace of a run with both patches applied:

(gdb) thread apply all bt

Thread 3 (Thread 0xffff8154b820 (LWP 63900)):
#0  0x0000ffff8b9402cc in __GI___sigtimedwait (set=, set@entry=0xaaaaf1e08070,
    info=info@entry=0xffff8154ad98, timeout=timeout@entry=0x0)
    at ../sysdeps/unix/sysv/linux/sigtimedwait.c:42
#1  0x0000ffff8ba77fac in __sigwait (set=set@entry=0xaaaaf1e08070,
    sig=sig@entry=0xffff8154ae74) at ../sysdeps/unix/sysv/linux/sigwait.c:28
#2  0x0000aaaab7dc1610 in sigwait_compat (opaque=0xaaaaf1e08070) at util/compatfd.c:35
#3  0x0000aaaab7dc3e80 in qemu_thread_start (args=) at util/qemu-thread-posix.c:519
#4  0x0000ffff8ba6d088 in start_thread (arg=0xffffceefbf4f) at pthread_create.c:463
#5  0x0000ffff8b9dd4ec in thread_start () at ../sysdeps/unix/sysv/linux/aarch64/clone.S:78

Thread 2 (Thread 0xffff81d4c820 (LWP 63899)):
#0  syscall () at ../sysdeps/unix/sysv/linux/aarch64/syscall.S:38
#1  0x0000aaaab7dc4cd8 in qemu_futex_wait (val=, f=)
    at /home/ubuntu/qemu/include/qemu/futex.h:29
#2  qemu_event_wait (ev=ev@entry=0xaaaab7e48708 ) at util/qemu-thread-posix.c:459
#3  0x0000aaaab7ddf44c in call_rcu_thread (opaque=) at util/rcu.c:260
#4  0x0000aaaab7dc3e80 in qemu_thread_start (args=) at util/qemu-thread-posix.c:519
#5  0x0000ffff8ba6d088 in start_thread (arg=0xffffceefc05f) at pthread_create.c:463
#6  0x0000ffff8b9dd4ec in thread_start () at ../sysdeps/unix/sysv/linux/aarch64/clone.S:78

Thread 1 (Thread 0xffff81e83010 (LWP 63898)):
#0  0x0000ffff8b9d4154 in __GI_ppoll (fds=0xaaaaf1e0dbc0, nfds=187650205809964,
    timeout=, timeout@entry=0x0, sigmask=0xffffceefbef0)
    at ../sysdeps/unix/sysv/linux/ppoll.c:39
#1  0x0000aaaab7dbedb0 in ppoll (__ss=0x0, __timeout=0x0, __nfds=, __fds=)
    at /usr/include/aarch64-linux-gnu/bits/poll2.h:77
#2  qemu_poll_ns (fds=, nfds=, timeout=timeout@entry=-1) at util/qemu-timer.c:340
#3  0x0000aaaab7dbfd2c in os_host_main_loop_wait (timeout=-1) at util/main-loop.c:236
#4  main_loop_wait (nonblocking=) at util/main-loop.c:517
#5  0x0000aaaab7ce86e8 in convert_do_copy (s=0xffffceefc068) at qemu-img.c:2028
#6  img_convert (argc=, argv=) at qemu-img.c:2520
#7  0x0000aaaab7ce1e54 in main (argc=8, argv=) at qemu-img.c:5097

> > I've tried to verify my theory with this patch and didn't run into the
> > issue for ~500 iterations (usually I would trigger the issue within ~20
> > iterations).
>
> Sorry for asking the obvious---500 iterations of what?

$ for i in $(seq 1 500); do echo "==$i=="; ./qemu/qemu-img convert -p -f qcow2 -O qcow2 bionic-server-cloudimg-arm64.img out.img; done
==1==
    (37.19/100%)

  -dann