pull-lp-source linux 4.4.0-18.34
Build from source with oldconfig and such
Enable all kinds of debug for virtio
Add some checks where we expect it to fail
mkdir /home/ubuntu/4.4.0-debug
# not needed make INSTALL_MOD_PATH=/home/ubuntu/4.4.0-debug modules_install
make INSTALL_PATH=/home/ubuntu/4.4.0-debug install
Attach debugger as before and retrigger the bug
Ensure /home/ubuntu/linux-4.4.0/scripts/gdb/vmlinux-gdb.py gets loaded properly for helpers
On boot my debug starts to work on the one device that gets initialized on boot:
[ 3.557697] __virtqueue_get_buf: Entry checks passed - vq ffff8800bbae6400 from _vq ffff8800bbae6400
[ 3.559320] __virtqueue_get_buf: Exit checks passed - ffff8801b74b2840 vq->data[i]
[ 3.560515] __virtqueue_get_buf: Returning ret ffff8801b74b2840
<kernel>/home/ubuntu/linux-4.4.0/vmlinuz-4.4.6</kernel>
<cmdline>root=/dev/vda1 console=tty1 console=ttyS0 net.ifnames=0</cmdline>
Prep issue:
sudo /usr/bin/testpmd --pci-blacklist 0000:00:03.0 --socket-mem 2048 -- --interactive --total-num-mbufs=2048
* It might be worth mentioning that running testpmd produced nothing regarding the queues, neither on the console nor in gdb
Trigger hang:
sudo ethtool -L eth1 combined 3
__virtqueue_is_broken: - vq ffff8800bbae7000 from _vq ffff8800bbae7000 -> broken 0
__virtqueue_is_broken: - vq ffff8800bbae7000 from _vq ffff8800bbae7000 -> broken 0
[...]
With the debug we have we can check the vvq's status
BTW - the offset of that container_of is 0 - so we can just cast it :-/
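To illustrate why a plain cast works here, below is a minimal userspace sketch. The struct names and the `container_of_model` macro are stand-ins invented for this example (not the kernel definitions); the only property carried over is that the embedded `vq` is the first member, so its offset is 0.

```c
#include <assert.h>
#include <stddef.h>

/* Userspace re-creation of the container_of idea: step back from a
 * member pointer to the start of the enclosing struct. */
#define container_of_model(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

/* Stand-ins: only the layout matters, not the real field set. */
struct virtqueue_model {
    int index;
};

struct vring_virtqueue_model {
    struct virtqueue_model vq;   /* first member, so its offset is 0 */
    int broken;
};

/* Returns 1 when the macro and a plain cast yield the same pointer. */
static int cast_matches_container_of(void)
{
    struct vring_virtqueue_model vvq = { .vq = { .index = 8 }, .broken = 0 };
    struct virtqueue_model *_vq = &vvq.vq;

    struct vring_virtqueue_model *via_macro =
        container_of_model(_vq, struct vring_virtqueue_model, vq);
    struct vring_virtqueue_model *via_cast =
        (struct vring_virtqueue_model *)_vq;

    return offsetof(struct vring_virtqueue_model, vq) == 0 &&
           via_macro == via_cast && via_macro == &vvq;
}
```

With a zero offset the subtraction is a no-op, which is why casting the `_vq` address straight to the outer type in gdb gives the same object.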
$4 = {vq = {list = {next = 0xffff8800bb892b00, prev = 0xffff8801b7518000}, callback = 0x0 <irq_stack_union>, name = 0xffffffff81d0f164 "control",
vdev = 0xffff8800bb892800, index = 8, num_free = 63, priv = 0x1c010}, vring = {num = 64, desc = 0xffff8801b7514000, avail = 0xffff8801b7514400,
used = 0xffff8801b7515000}, weak_barriers = true, broken = false, indirect = true, event = true, free_head = 1, num_added = 0, last_used_idx = 0,
avail_flags_shadow = 1, avail_idx_shadow = 1, notify = 0xffffffff814bca40 <vp_notify>, data = 0xffff8800bbae7078}
So it considers itself not broken.
But I've seen it hit the following pr_debug, which is usually disabled (so we don't see it by default):
pr_debug("No more buffers in queue\n");
That depends on !more_used(vq)
Which is:
return vq->last_used_idx != virtio16_to_cpu(vq->vq.vdev, vq->vring.used->idx);
0 != 0
(gdb) p ((struct vring_virtqueue *)0xffff8800bbae7000)->vring
$19 = {num = 64, desc = 0xffff8801b7514000, avail = 0xffff8801b7514400, used = 0xffff8801b7515000}
(gdb) p *((struct vring_virtqueue *)0xffff8800bbae7000)->vring.used
$21 = {flags = 0, idx = 0, ring = 0xffff8801b7515004}
(gdb) p *((struct vring_virtqueue *)0xffff8800bbae7000)->vring.avail
$22 = {flags = 1, idx = 1, ring = 0xffff8801b7514404}
(gdb) p *((struct vring_virtqueue *)0xffff8800bbae7000)->vring.desc
$23 = {addr = 3140568064, len = 48, flags = 4, next = 1}
0!=0 => false -> so more_used returns false
But the call said !more_used, so virtqueue_get_buf returns NULL - and that is all it does "forever".
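The stuck state can be modelled in a few lines of C. This is only a sketch under assumptions: the `*_model` types and functions are hypothetical reductions written for this note, the endianness conversion is left out, and the real virtqueue_get_buf does considerably more work than shown.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-in for the used ring header written by the device. */
struct used_ring_model {
    uint16_t flags;
    uint16_t idx;
};

/* Only the two values that take part in the comparison are modelled. */
struct vq_model {
    uint16_t last_used_idx;          /* driver side: next used entry to consume */
    struct used_ring_model *used;    /* device side: used ring header */
};

/* Same comparison as quoted above, minus virtio16_to_cpu. */
static bool more_used_model(const struct vq_model *vq)
{
    return vq->last_used_idx != vq->used->idx;
}

/* Hypothetical reduction of the get_buf path: return NULL while the
 * device has not published a new used entry. */
static void *get_buf_model(struct vq_model *vq, void *token)
{
    if (!more_used_model(vq))
        return NULL;                 /* the "No more buffers in queue" branch */
    vq->last_used_idx++;
    return token;
}
```

Fed with the values from the gdb session (last_used_idx = 0, used->idx = 0), `get_buf_model` returns NULL on every call; it only hands back a buffer once something advances used->idx, which in the hang never happens.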