Concurrency bug on keyboard events: capslock LED messing up keycode streams causes character misses at guest kernel

Bug #1809075 reported by Gao on 2018-12-19
This bug affects 2 people
Affects: QEMU
Importance: Undecided
Assigned to: Unassigned

Bug Description

Whenever capslock is pressed on the host, both the capslock keycode (0x3a 0xba) and the capslock LED keycode (0xfa 0xfa) are sent to the ps2 keycode stream.

However, the capslock LED is handled by another thread, which I confirmed by tracing `ps2_write_keyboard` with gdb. The capslock LED keycode might therefore split the key-down and key-up codes of another key.

For example, I sent AaBb but got ABa. I was using vncdotool, so this is equivalent to sending `capslock a capslock a capslock b capslock b`. In ps2_queue, I was expecting `3a fa fa ba 1e 9e 3a fa fa ba 1e 9e 3a fa fa ba 30 b0 3a fa fa ba 30 b0`. But once in a while the queue does not receive that stream. In one case I got `3a fa fa ba 1e 9e 3a ba 1e fa fa 9e 3a ba 30 b0 3a ba 30 b0 fa fa`
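One way to check that the two streams differ only in where the `0xfa` bytes land is to strip those bytes and compare the rest. A minimal Python sketch using the two streams quoted above; the helper names `parse` and `strip_acks` are made up for illustration:

```python
# The two streams quoted in the report, as space-separated hex bytes.
EXPECTED = "3a fa fa ba 1e 9e 3a fa fa ba 1e 9e 3a fa fa ba 30 b0 3a fa fa ba 30 b0"
OBSERVED = "3a fa fa ba 1e 9e 3a ba 1e fa fa 9e 3a ba 30 b0 3a ba 30 b0 fa fa"


def parse(stream):
    """Turn 'aa bb cc' into a list of integers."""
    return [int(tok, 16) for tok in stream.split()]


def strip_acks(codes, ack=0xFA):
    """Drop every 0xfa byte and keep the key codes in order."""
    return [c for c in codes if c != ack]


expected_keys = strip_acks(parse(EXPECTED))
observed_keys = strip_acks(parse(OBSERVED))

# True means the key codes themselves arrived intact and in order;
# only the placement (and number) of 0xfa bytes differs.
print(expected_keys == observed_keys)
print(parse(EXPECTED).count(0xFA), parse(OBSERVED).count(0xFA))
```

For these two streams the key codes match once `0xfa` is removed, which suggests that nothing is dropped in ps2_queue itself and that the loss happens on the receiving side.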

In this specific example, `a` was lost because the LED keycode 'jumps in' between its key-down and key-up codes. The kernel event device receives the following stream:
```
# /dev/input/event receives what is sent from ps2_queue
# caps_1 means capslock key down, caps_0 means capslock key up
caps_1 led caps_0 # 0x3a 0xfa 0xfa 0xba
a_1 a_0 # 0x1e 0x9e
caps_1 caps_0 # 0x3a 0xba
led # 0x1e 0xfa 0xfa 0x9e (we lost `a` here)
caps_1 caps_0 # 0x3a 0xba
b_1 led b_0 # 0x30 0xfa 0xfa 0xb0
caps_1 caps_0 # 0x3a 0xba
led b_1 b_0 # 0xfa 0xfa 0x30 0xb0
```
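A small Python sketch that walks a byte stream and reports every `0xfa` that arrives while a key is held, that is, between its key-down and key-up codes. It assumes, as in the streams above, that a key-up code is the key-down code with bit 7 set. `find_split_keys` is a hypothetical helper, not part of QEMU or the kernel.

```python
ACK = 0xFA  # the byte the report calls the "capslock LED keycode"


def find_split_keys(codes):
    """Return (index, make_code) for each 0xfa seen while a key is down."""
    pressed = set()  # make codes of keys currently held down
    hits = []
    for i, c in enumerate(codes):
        if c == ACK:
            # Check ACK first: 0xfa also has bit 7 set.
            for make in sorted(pressed):
                hits.append((i, make))
        elif c & 0x80:
            pressed.discard(c & 0x7F)  # key up
        else:
            pressed.add(c)  # key down
    return hits


# The line marked "we lost `a` here": 0xfa lands inside the a pair.
print(find_split_keys([0x1E, 0xFA, 0xFA, 0x9E]))
# The last line: 0xfa arrives before b is pressed, so nothing is split.
print(find_split_keys([0xFA, 0xFA, 0x30, 0xB0]))
```

A reported hit is only a candidate for a lost key. In the listing above, `b_1 led b_0` is also split by `0xfa`, yet both `b` events arrived.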

Using /proc, ftrace and dynamic_debug, I made sure that the kernel receives exactly the key stream that the qemu ps2_queue sends. I explained the details in this [post](https://medium.com/@alapha23/quick-peek-into-kernel-land-keyboard-events-handling-with-ftrace-and-dynamic-debug-24a790056d5a)

So it seems to be a concurrency issue.

A hacky fix I have in mind is to skip all `0xfa` in ps2_queue. But I'm not sure whether the capslock LED is the only thing that disturbs our ps2 keycode queue, as I've seen other keycodes handled by `ps2_write_keyboard` sent to the ps2 queue.

Another solution might be a memory barrier or a lock. Making key down and key up atomic would prevent another thread from modifying the ps2 queue in between.
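The lock idea can be modelled outside QEMU. The sketch below is a toy in Python, not QEMU code: `ScancodeQueue`, `key_press_release` and `ack` are invented names, and it assumes key down and key up can be queued in one step, which is true for an injected keystroke but not for a key held by a human.

```python
import threading

ACK = 0xFA


class ScancodeQueue:
    """Toy stand-in for the PS/2 queue (hypothetical, for illustration)."""

    def __init__(self):
        self._lock = threading.Lock()
        self.codes = []

    def key_press_release(self, make):
        # Key down and key up are queued under one lock acquisition,
        # so no other writer can slip a byte between them.
        with self._lock:
            self.codes.append(make)
            self.codes.append(make | 0x80)

    def ack(self):
        # The second writer takes the same lock before queuing 0xfa.
        with self._lock:
            self.codes.append(ACK)


def run_demo(rounds=1000):
    """Run a 'typing' thread and an 'ack' thread against one queue."""
    q = ScancodeQueue()

    def typist():
        for _ in range(rounds):
            q.key_press_release(0x1E)

    def acker():
        for _ in range(rounds):
            q.ack()

    threads = [threading.Thread(target=typist), threading.Thread(target=acker)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return q.codes


def pairs_intact(codes):
    """True if every 0x1e is immediately followed by 0x9e."""
    return all(
        i + 1 < len(codes) and codes[i + 1] == 0x9E
        for i, c in enumerate(codes)
        if c == 0x1E
    )


print(pairs_intact(run_demo()))
```

With both writers going through the same lock, `0xfa` can only land before or after a complete pair. Whether that ordering is what the guest needs is a separate question.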

What do you think?

Gao (alapha23) wrote :

### Reproduce steps

Add `fprintf(stderr, "ps2_queue 0x%x\n", b);` to `hw/input/ps2.c` and re-build qemu.

- qemu-system-x86_64 -hda <your img> --enable-kvm -m <> -display vnc=:1
- vncviewer -Shared :5901

In the guest OS, find the keyboard device (very likely /dev/input/event0):
```
sudo evtest /dev/input/event0
```

On the host OS:
- vncdotool -s 127.0.0.1::5901 type AaBb

Finally:
- record what evtest has received and compare it with the expected key stream.

Around one run in five, a keycode is lost because of the capslock LED.
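The comparison in the last step can be scripted. The Python sketch below assumes that evtest prints key events as lines containing `type 1 (EV_KEY), code <n> (<NAME>), value <v>`; check this against your evtest version. The expected list encodes `capslock a capslock a capslock b capslock b` from the bug description, and `parse_evtest` and `first_mismatch` are made-up helper names.

```python
import re

# Assumed evtest line shape (verify against your evtest version):
#   Event: time ..., type 1 (EV_KEY), code 30 (KEY_A), value 1
EV_KEY_LINE = re.compile(r"type 1 \(EV_KEY\), code \d+ \((\w+)\), value (\d+)")


def parse_evtest(text):
    """Extract (key_name, value) pairs; value 1 is press, 0 is release."""
    events = []
    for line in text.splitlines():
        m = EV_KEY_LINE.search(line)
        if m:
            events.append((m.group(1), int(m.group(2))))
    return events


def tap(name):
    """Press followed by release of one key."""
    return [(name, 1), (name, 0)]


# capslock a capslock a capslock b capslock b
EXPECTED_AABB = (
    tap("KEY_CAPSLOCK") + tap("KEY_A")
    + tap("KEY_CAPSLOCK") + tap("KEY_A")
    + tap("KEY_CAPSLOCK") + tap("KEY_B")
    + tap("KEY_CAPSLOCK") + tap("KEY_B")
)


def first_mismatch(got, want):
    """Index of the first differing event, or None if the lists agree."""
    for i, (g, w) in enumerate(zip(got, want)):
        if g != w:
            return i
    if len(got) != len(want):
        return min(len(got), len(want))
    return None
```

Typical use would be to save the evtest output to a file, run `first_mismatch(parse_evtest(open(path).read()), EXPECTED_AABB)`, and look at the event at the returned index.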

Please do not rely on graphics-mode output, as there are also key-loss bugs when Wayland internals deal with kernel keyboard events.

A simple note on the mapping between keys and keycodes (key, key-down code, key-up code). Hopefully it comes in handy when debugging:
a 0x1e 0x9e
b 0x30 0xb0
c 0x2e 0xae
d 0x20 0xa0
capslock 0x3a 0xba
capslock LED 0xfa 0xfa
ret 0x1c 0x9c
leftshift 0x2a 0xaa
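Every key in this table follows the same pattern: the key-up code is the key-down code with bit 7 set (`down | 0x80`). A short Python sketch that encodes the table and labels bytes in the `a_1` / `a_0` notation used earlier; `describe` is a made-up helper, and `led` is kept only as the label this report uses for `0xfa`.

```python
# Key-down codes from the table above.
KEY_DOWN = {
    "a": 0x1E,
    "b": 0x30,
    "c": 0x2E,
    "d": 0x20,
    "capslock": 0x3A,
    "ret": 0x1C,
    "leftshift": 0x2A,
}
NAME_BY_DOWN = {code: name for name, code in KEY_DOWN.items()}


def key_up(down):
    """Key-up code for a given key-down code: same code with bit 7 set."""
    return down | 0x80


def describe(code):
    """Label one byte as name_1 (down), name_0 (up), led, or raw hex."""
    if code == 0xFA:
        return "led"
    name = NAME_BY_DOWN.get(code & 0x7F)
    if name is None:
        return hex(code)
    return f"{name}_{0 if code & 0x80 else 1}"


print(" ".join(describe(c) for c in [0x3A, 0xBA, 0x1E, 0xFA, 0xFA, 0x9E]))
```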

Gerd Hoffmann (kraxel-redhat) wrote :

There is no "capslock LED key event". 0xfa is KBD_REPLY_ACK, and the
device queues it in response to guest port writes. Yes, the ack can
race with actual key events. But IMO that isn't a bug in qemu.

Probably the linux kernel just throws away everything until it got the
ack for the port write, and that way the key event gets lost. On
physical hardware you will not notice because it is next to impossible
to type fast enough to hit the race window.

So, go fix the kernel.

Alternatively fix vncdotool to send uppercase letters properly with
shift key pressed. Then qemu wouldn't generate capslock key events
(that happens because qemu thinks guest and host capslock state is out
of sync) and the guest's capslock led update request wouldn't get in
the way.
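To illustrate the shift-based alternative, the Python sketch below builds the key codes for a string when each uppercase letter is wrapped in leftshift down/up instead of capslock toggles. It uses only the codes from the key table in the reproduce comment, covers only the letters listed there, and `shift_typed_codes` is a hypothetical helper. It says nothing about how vncdotool actually encodes keys.

```python
LEFTSHIFT_DOWN = 0x2A
LETTER_DOWN = {"a": 0x1E, "b": 0x30, "c": 0x2E, "d": 0x20}


def shift_typed_codes(text):
    """Key codes for `text`, holding leftshift around each uppercase letter."""
    codes = []
    for ch in text:
        down = LETTER_DOWN[ch.lower()]
        tap = [down, down | 0x80]  # key down, key up
        if ch.isupper():
            codes += [LEFTSHIFT_DOWN] + tap + [LEFTSHIFT_DOWN | 0x80]
        else:
            codes += tap
    return codes


print(" ".join(f"{c:02x}" for c in shift_typed_codes("AaBb")))
```

No `3a`/`ba` pair appears in the result, so under this assumption there would be no capslock toggle for the guest to answer with an LED update.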

Changed in qemu:
status: New → Invalid