qemu-system-arm and qemu-system-aarch64 QMP hangs after kernel boots
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
QEMU | Fix Released | Undecided | Unassigned |
Bug Description
After booting a Linux kernel on both arm and aarch64, the QMP socket becomes unresponsive. Initially, this was thought to be limited to "quit" commands, but it reproduced with others (such as in this reproducer). This is a partial log output:
>>> {'execute': 'qmp_capabilities'}
<<< {'return': {}}
Booting Linux on physical CPU 0x0000000000 [0x410fd034]
Linux version 4.18.16-
...
Policy zone: DMA32
Kernel command line: printk.time=0 console=ttyAMA0
>>> {'execute': 'stop'}
<<< {'timestamp': {'seconds': 1558370331, 'microseconds': 470173}, 'event': 'STOP'}
<<< {'return': {}}
>>> {'execute': 'cont'}
<<< {'timestamp': {'seconds': 1558370331, 'microseconds': 470849}, 'event': 'RESUME'}
<<< {'return': {}}
>>> {'execute': 'stop'}
Sometimes it takes just the first "stop" command. Overall, I was able to reproduce the hang 100% of the time with the reproducer applied on top of 6d8e75d41c58892.
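To tell a hang apart from a slow reply outside of the test infrastructure, a small probe with a socket timeout can help. The following is only a sketch, not the reproducer itself: the `QMPProbe` class, the `connect_qmp` helper and the 5 second timeout are made up for illustration, and it assumes QMP is exposed on a UNIX socket and that replies are newline-terminated JSON.

```python
import json
import socket


class QMPProbe:
    """Minimal QMP client that raises socket.timeout when a reply never comes."""

    def __init__(self, sock, timeout=5.0):
        self.sock = sock
        self.sock.settimeout(timeout)  # applies to every recv() below
        self.buf = b""

    def _read_message(self):
        # Accumulate bytes until one full newline-terminated JSON message is buffered.
        while b"\n" not in self.buf:
            chunk = self.sock.recv(4096)
            if not chunk:
                raise ConnectionError("QMP peer closed the connection")
            self.buf += chunk
        line, _, self.buf = self.buf.partition(b"\n")
        return json.loads(line)

    def command(self, name):
        """Send one command and return its reply, skipping the greeting and events."""
        self.sock.sendall(json.dumps({"execute": name}).encode() + b"\n")
        while True:
            msg = self._read_message()
            if "return" in msg or "error" in msg:
                return msg


def connect_qmp(path, timeout=5.0):
    # Hypothetical helper: `path` is whatever UNIX socket QEMU's QMP listens on.
    sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    sock.connect(path)
    return QMPProbe(sock, timeout)
```

With this, the sequence from the log would be `qmp_capabilities`, `stop`, `cont`, `stop`, and the hang would show up as a `socket.timeout` on the command that never gets its `return`.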
The reproducer test can be seen/fetched at:
- https:/
And test results from Travis CI can be seen at:
- https:/
For convenience purposes, here's qemu-system-aarch64 launching and hanging on the first "stop":
- https:/
- https:/
And here's qemu-system-arm hanging the very same way:
- https:/
- https:/
description: updated
Changed in qemu: status: New → Confirmed
Changed in qemu: status: Confirmed → Fix Released
I have an update on this. Eric and I attempted to zero in on the
exact cause. A few things we discovered:
1) It has nothing to do with having a kernel running
2) It has to do with having a chardev that is a server socket. This
test produces command line arguments such as:
-chardev socket,id=console,path=<path>.sock,server,nowait \
-serial chardev:console
3) It doesn't seem to have a connection to the test infrastructure code
(python/qemu/qmp/*), as I made a number of experiments which
yielded no differences in behavior.
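For anyone who wants to reproduce the setup from point 2 by hand, the arguments can be assembled like this. This is a sketch only: the `console_args` helper is hypothetical and not part of the test infrastructure, and it simply mirrors the option spelling shown above.

```python
def console_args(sock_path, chardev_id="console"):
    """Return QEMU arguments for a serial console backed by a UNIX server socket."""
    # Same option spelling as in the command line quoted in this comment.
    chardev = "socket,id=%s,path=%s,server,nowait" % (chardev_id, sock_path)
    return ["-chardev", chardev, "-serial", "chardev:%s" % chardev_id]
```

The returned list can be appended to the rest of the qemu-system-arm or qemu-system-aarch64 command line.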
So, the reproducer given at:
https://github.com/clebergnu/qemu/commit/c778e28c24030c4a36548b714293b319f4bf18df
continues to be valid (and continues to be limited to arm and aarch64).
Now, after a number of experiments, the following was found to be a 100%
reproducible *workaround*:
https://github.com/clebergnu/qemu/commit/e1713f3b91972ad57c089f276c54db3f3fa63423
That basically shuts down the *console* socket before proceeding with further QMP
interaction. The effectiveness of the workaround can be seen here:
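The idea of the workaround, in isolation, looks roughly like the sketch below. This is not the code from the commit above: the `shutdown_console` name is made up, and it assumes the test already holds a connected client socket for the console chardev.

```python
import socket


def shutdown_console(console_sock):
    """Shut down and close the client side of the console chardev socket."""
    try:
        # Stop both directions so QEMU sees EOF on the console.
        console_sock.shutdown(socket.SHUT_RDWR)
    except OSError:
        pass  # the peer may already be gone
    finally:
        console_sock.close()
```

It would be called once on the console socket, before sending any further QMP commands such as "stop" or "cont".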
aarch64 command line:
- https://travis-ci.org/clebergnu/qemu/jobs/535459499#L3633
aarch64 QMP interaction:
- https://travis-ci.org/clebergnu/qemu/jobs/535459499#L3663
arm command line:
- https://travis-ci.org/clebergnu/qemu/jobs/535459499#L3747
arm QMP interaction:
- https://travis-ci.org/clebergnu/qemu/jobs/535459499#L3767
I hope this provides a few more hints into the real issue.