Comment 15 for bug 1805256

QEMU BUG: #1

Alright, one of the issues is (according to comment #14):

"""
Meaning that code is waiting for a futex inside kernel.

(gdb) print rcu_call_ready_event
$4 = {value = 4294967295, initialized = true}

The QemuEvent "rcu_call_ready_event->value" is set to INT_MAX and I don't know why yet.

rcu_call_ready_event->value is only touched by:

qemu_event_init() -> bool init ? EV_SET : EV_FREE
qemu_event_reset() -> atomic_or(&ev->value, EV_FREE)
qemu_event_set() -> atomic_xchg(&ev->value, EV_SET)
qemu_event_wait() -> atomic_cmpxchg(&ev->value, EV_FREE, EV_BUSY)'
"""

Now I know why rcu_call_ready_event->value is set to INT_MAX. That is because in the following declaration:

struct QemuEvent {
#ifndef __linux__
    pthread_mutex_t lock;
    pthread_cond_t cond;
#endif
    unsigned value;
    bool initialized;
};

#define EV_SET 0
#define EV_FREE 1
#define EV_BUSY -1

"value" is declared as unsigned, but EV_BUSY sets it to -1, and, according to the Two's Complement Operation (https://en.wikipedia.org/wiki/Two%27s_complement), it will be INT_MAX (4294967295).

So this is the "first bug" found AND it is definitely funny that this hasn't been seen in other architectures at all... I can reproduce it at will.

With that said, it seems that there is still another issue causing (less frequently):

(gdb) thread 2
[Switching to thread 2 (Thread 0xffffbec5ad90 (LWP 17459))]
#0 syscall () at ../sysdeps/unix/sysv/linux/aarch64/syscall.S:38
38 ../sysdeps/unix/sysv/linux/aarch64/syscall.S: No such file or directory.
(gdb) bt
#0 syscall () at ../sysdeps/unix/sysv/linux/aarch64/syscall.S:38
#1 0x0000aaaaaabd41cc in qemu_futex_wait (val=<optimized out>, f=<optimized out>) at ./util/qemu-thread-posix.c:438
#2 qemu_event_wait (ev=ev@entry=0xaaaaaac86ce8 <rcu_call_ready_event>) at ./util/qemu-thread-posix.c:442
#3 0x0000aaaaaabed05c in call_rcu_thread (opaque=opaque@entry=0x0) at ./util/rcu.c:261
#4 0x0000aaaaaabd34c8 in qemu_thread_start (args=<optimized out>) at ./util/qemu-thread-posix.c:498
#5 0x0000ffffbf25c880 in start_thread (arg=0xfffffffff5bf) at pthread_create.c:486
#6 0x0000ffffbf1b6b9c in thread_start () at ../sysdeps/unix/sysv/linux/aarch64/clone.S:78

Thread 2 to be stuck at "futex()" kernel syscall (like the FUTEX_WAKE never happened and/or wasn't atomic for this arch/binary). Need to investigate this also.