Verified with jammy-proposed in a local VM with virtio-net multiqueue.
Host: lunar, 4 pCPUs
Guest: jammy, 4 vCPUs (1 socket/die, 2 cores, 2 threads/core) with 1 virtio-net interface with 4 queues
Setup:
---
$ uvt-simplestreams-libvirt sync release=jammy arch=amd64
$ uvt-kvm create --no-start irqjammy release=jammy
$ virsh edit irqjammy
...
4
...
...
...
...
Start VM:
$ virsh start irqjammy
$ uvt-kvm wait irqjammy
$ uvt-kvm ssh irqjammy
Configure irqbalance:
$ echo "IRQBALANCE_ARGS='--debug --interval 5'" | sudo tee -a /etc/default/irqbalance
$ sudo systemctl restart irqbalance.service
$ systemctl status irqbalance.service | grep debug
└─1561 /usr/sbin/irqbalance --foreground --debug --interval 5
Test with jammy-release:
---
Run in tmux (keep running across connection drops)
$ tmux
Start iperf server in the VM
$ sudo apt update && sudo apt install -y iperf
$ iperf -s >/dev/null &
(*) Start iperf client in the HOST
$ iperf -c $(uvt-kvm ip irqjammy) -t 0 -P4 -i 5 | grep SUM
...
Remove/add virtio-net modules:
$ date; sudo modprobe -r virtio-net && sleep 5 && sudo modprobe virtio-net; date
Sat Oct 7 06:40:39 PM UTC 2023
Sat Oct 7 06:40:45 PM UTC 2023
Notice the connection drop in the client.
Let the client run for another minute and stop it.
$ iperf -c $(uvt-kvm ip irqjammy) -t 0 -P4 -i 5 | grep SUM
WARNING: client will send traffic forever or until an external signal (e.g. SIGINT or SIGTERM) occurs to stop it
[SUM] 0.0000-5.0000 sec 8.27 GBytes 14.2 Gbits/sec
[SUM] 5.0000-10.0000 sec 8.78 GBytes 15.1 Gbits/sec
[SUM] 10.0000-15.0000 sec 9.19 GBytes 15.8 Gbits/sec
[SUM] 15.0000-20.0000 sec 9.13 GBytes 15.7 Gbits/sec
[SUM] 20.0000-25.0000 sec 8.05 GBytes 13.8 Gbits/sec
[SUM] 25.0000-30.0000 sec 8.60 GBytes 14.8 Gbits/sec
[SUM] 30.0000-35.0000 sec 1.53 GBytes 2.64 Gbits/sec <<< DROP
[SUM] 35.0000-40.0000 sec 4.79 GBytes 8.23 Gbits/sec <<< DROP
[SUM] 40.0000-45.0000 sec 9.00 GBytes 15.5 Gbits/sec
[SUM] 45.0000-50.0000 sec 7.93 GBytes 13.6 Gbits/sec
[SUM] 50.0000-55.0000 sec 9.15 GBytes 15.7 Gbits/sec
[SUM] 55.0000-60.0000 sec 9.00 GBytes 15.5 Gbits/sec
[SUM] 60.0000-65.0000 sec 9.05 GBytes 15.5 Gbits/sec
[SUM] 65.0000-70.0000 sec 8.89 GBytes 15.3 Gbits/sec
[SUM] 70.0000-75.0000 sec 9.30 GBytes 16.0 Gbits/sec
[SUM] 75.0000-80.0000 sec 9.14 GBytes 15.7 Gbits/sec
[SUM] 80.0000-85.0000 sec 9.17 GBytes 15.8 Gbits/sec
[SUM] 85.0000-90.0000 sec 8.48 GBytes 14.6 Gbits/sec
[SUM] 90.0000-95.0000 sec 9.11 GBytes 15.6 Gbits/sec
[SUM] 95.0000-100.0000 sec 9.12 GBytes 15.7 Gbits/sec
[SUM] 100.0000-105.0000 sec 8.34 GBytes 14.3 Gbits/sec
^C
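The drop intervals were marked by hand above. A minimal sketch of a filter that could flag them automatically, assuming the rate is the second-to-last field and the unit the last field of each [SUM] line (the `flag_drops` name and the 10 Gbits/sec default threshold are illustrative, not part of the original test):

```shell
# Print iperf [SUM] intervals whose rate is below a threshold in Gbits/sec.
# Assumption: rate is the second-to-last field and unit the last field.
# Lines reported in other units (e.g. Mbits/sec) are ignored by this sketch.
flag_drops() {
    awk -v min="${1:-10}" '
        $1 == "[SUM]" && $NF == "Gbits/sec" && ($(NF-1) + 0) < min {
            print $2, $(NF-1), $NF
        }'
}
```

It would be appended to the client pipeline, e.g. `... | grep SUM | flag_drops 10`; for the 30-35 sec interval above it prints `30.0000-35.0000 2.64 Gbits/sec`.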
Check the irqbalance logs for the network receive queues IRQs:
$ grep virtio.-input /proc/interrupts
50: 3401 39990 0 0 PCI-MSI 524289-edge virtio0-input.0
52: 0 142952 0 1 PCI-MSI 524291-edge virtio0-input.1
54: 0 40018 3404 0 PCI-MSI 524293-edge virtio0-input.2
56: 0 39921 0 3290 PCI-MSI 524295-edge virtio0-input.3
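The per-CPU columns above can be reduced to "which vCPU took most interrupts for each queue". A rough sketch, assuming the first N columns after the IRQ number are the per-CPU counters (the `busiest_cpu` helper is hypothetical; the counters are cumulative, so this is only a coarse view):

```shell
# Usage: busiest_cpu NCPUS < lines from /proc/interrupts
# For each line, print the IRQ, its name, the CPU with the highest count,
# and that CPU's share of the line's total interrupts.
busiest_cpu() {
    awk -v ncpus="$1" '
        {
            total = 0; max = -1; cpu = 0
            for (i = 1; i <= ncpus; i++) {
                count = $(i + 1) + 0
                total += count
                if (count > max) { max = count; cpu = i - 1 }
            }
            printf "%s %s cpu%d %d%%\n", $1, $NF, cpu, (total ? 100 * max / total : 0)
        }'
}
```

For example, `grep virtio.-input /proc/interrupts | busiest_cpu 4` on the guest; for the virtio0-input.0 line above it prints `50: virtio0-input.0 cpu1 92%`.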
$ journalctl -b -u irqbalance.service | grep -v 'Interrupt .* ([^e]' | grep -B1 -e 'Interrupt \(50\|52\|54\|56\) .* (ethernet' -e 'removed' -e 'Hotplug' | less
...
Initially spread over all vCPUs:
Oct 07 18:36:33 irqjammy irqbalance[1561]: CPU number 3 numa_node is 0 (load 70000000)
Oct 07 18:36:33 irqjammy irqbalance[1561]: Interrupt 54 node_num is -1 (ethernet/26704400:808)
--
Oct 07 18:36:33 irqjammy irqbalance[1561]: CPU number 2 numa_node is 0 (load 40000000)
Oct 07 18:36:33 irqjammy irqbalance[1561]: Interrupt 50 node_num is -1 (ethernet/20986410:670)
--
Oct 07 18:36:33 irqjammy irqbalance[1561]: CPU number 1 numa_node is 0 (load 80000000)
Oct 07 18:36:33 irqjammy irqbalance[1561]: Interrupt 56 node_num is -1 (ethernet/79999757:1091)
--
Oct 07 18:36:33 irqjammy irqbalance[1561]: CPU number 0 numa_node is 0 (load 80000000)
Oct 07 18:36:33 irqjammy irqbalance[1561]: Interrupt 52 node_num is -1 (ethernet/45327784:604)
And remain spread over all vCPUs while load is introduced:
Oct 07 18:40:38 irqjammy irqbalance[1561]: CPU number 3 numa_node is 0 (load 300000000)
Oct 07 18:40:38 irqjammy irqbalance[1561]: Interrupt 54 node_num is -1 (ethernet/105720768:7608)
--
Oct 07 18:40:43 irqjammy irqbalance[1561]: CPU number 2 numa_node is 0 (load 210000000)
Oct 07 18:40:43 irqjammy irqbalance[1561]: Interrupt 50 node_num is -1 (ethernet/111159048:7644)
--
Oct 07 18:40:43 irqjammy irqbalance[1561]: CPU number 1 numa_node is 0 (load 230000000)
Oct 07 18:40:43 irqjammy irqbalance[1561]: Interrupt 56 node_num is -1 (ethernet/229994976:7712)
--
Oct 07 18:40:43 irqjammy irqbalance[1561]: CPU number 0 numa_node is 0 (load 260000000)
Oct 07 18:40:43 irqjammy irqbalance[1561]: Interrupt 52 node_num is -1 (ethernet/138690006:7714)
After remove/add, the receive IRQs go to a single vCPU:
Oct 07 18:40:43 irqjammy irqbalance[1561]: IRQ 55 is removed from interrupts_db.
Oct 07 18:40:43 irqjammy irqbalance[1561]: IRQ 53 is removed from interrupts_db.
Oct 07 18:40:43 irqjammy irqbalance[1561]: IRQ 51 is removed from interrupts_db.
Oct 07 18:40:43 irqjammy irqbalance[1561]: IRQ 56 is removed from interrupts_db.
Oct 07 18:40:43 irqjammy irqbalance[1561]: IRQ 54 is removed from interrupts_db.
Oct 07 18:40:43 irqjammy irqbalance[1561]: IRQ 52 is removed from interrupts_db.
Oct 07 18:40:43 irqjammy irqbalance[1561]: IRQ 50 is removed from interrupts_db.
Oct 07 18:40:43 irqjammy irqbalance[1561]: IRQ 49 is removed from interrupts_db.
Oct 07 18:40:43 irqjammy irqbalance[1561]: IRQ 57 is removed from interrupts_db.
--
Oct 07 18:40:48 irqjammy irqbalance[1561]: Adding IRQ 49 to database
Oct 07 18:40:48 irqjammy irqbalance[1561]: Hotplug dev irq: 49 finished.
Oct 07 18:40:48 irqjammy irqbalance[1561]: Adding IRQ 50 to database
Oct 07 18:40:48 irqjammy irqbalance[1561]: Hotplug dev irq: 50 finished.
Oct 07 18:40:48 irqjammy irqbalance[1561]: Adding IRQ 51 to database
Oct 07 18:40:48 irqjammy irqbalance[1561]: Hotplug dev irq: 51 finished.
Oct 07 18:40:48 irqjammy irqbalance[1561]: Adding IRQ 52 to database
Oct 07 18:40:48 irqjammy irqbalance[1561]: Hotplug dev irq: 52 finished.
Oct 07 18:40:48 irqjammy irqbalance[1561]: Adding IRQ 53 to database
Oct 07 18:40:48 irqjammy irqbalance[1561]: Hotplug dev irq: 53 finished.
Oct 07 18:40:48 irqjammy irqbalance[1561]: Adding IRQ 54 to database
Oct 07 18:40:48 irqjammy irqbalance[1561]: Hotplug dev irq: 54 finished.
Oct 07 18:40:48 irqjammy irqbalance[1561]: Adding IRQ 55 to database
Oct 07 18:40:48 irqjammy irqbalance[1561]: Hotplug dev irq: 55 finished.
Oct 07 18:40:48 irqjammy irqbalance[1561]: Adding IRQ 56 to database
Oct 07 18:40:48 irqjammy irqbalance[1561]: Hotplug dev irq: 56 finished.
Oct 07 18:40:48 irqjammy irqbalance[1561]: Adding IRQ 57 to database
Oct 07 18:40:48 irqjammy irqbalance[1561]: Hotplug dev irq: 57 finished.
Oct 07 18:40:48 irqjammy irqbalance[1561]: CPU number 1 numa_node is 0 (load 60000000)
Oct 07 18:40:48 irqjammy irqbalance[1561]: Interrupt 56 node_num is -1 (ethernet/0:3274)
Oct 07 18:40:48 irqjammy irqbalance[1561]: Interrupt 54 node_num is -1 (ethernet/0:3391)
Oct 07 18:40:48 irqjammy irqbalance[1561]: Interrupt 52 node_num is -1 (ethernet/0:3167)
Oct 07 18:40:48 irqjammy irqbalance[1561]: Interrupt 50 node_num is -1 (ethernet/0:3389)
And remain there until the end of the test (load drops)
Oct 07 18:42:03 irqjammy irqbalance[1561]: CPU number 1 numa_node is 0 (load 740000000)
Oct 07 18:42:03 irqjammy irqbalance[1561]: Interrupt 52 node_num is -1 (ethernet/404439188:4703)
Oct 07 18:42:03 irqjammy irqbalance[1561]: Interrupt 54 node_num is -1 (ethernet/114460676:1331)
Oct 07 18:42:03 irqjammy irqbalance[1561]: Interrupt 56 node_num is -1 (ethernet/112310776:1306)
Oct 07 18:42:03 irqjammy irqbalance[1561]: Interrupt 50 node_num is -1 (ethernet/108784940:1265)
--
Oct 07 18:42:08 irqjammy irqbalance[1561]: CPU number 1 numa_node is 0 (load 0)
Oct 07 18:42:08 irqjammy irqbalance[1561]: Interrupt 52 node_num is -1 (ethernet/1:3)
Oct 07 18:42:08 irqjammy irqbalance[1561]: Interrupt 54 node_num is -1 (ethernet/1:0)
Oct 07 18:42:08 irqjammy irqbalance[1561]: Interrupt 56 node_num is -1 (ethernet/1:0)
Oct 07 18:42:08 irqjammy irqbalance[1561]: Interrupt 50 node_num is -1 (ethernet/1:0)
No rebalancing happened:
$ journalctl -b -u irqbalance.service | grep 'Selecting irq .* for rebalancing'
$
Test with jammy-proposed:
---
Add proposed, install, and reboot to start fresh (as in the previous test).
$ sudo add-apt-repository -y -p proposed
$ sudo apt install -y irqbalance
...
Get:1 http://archive.ubuntu.com/ubuntu jammy-proposed/main amd64 irqbalance amd64 1.8.0-1ubuntu0.1 [47.1 kB]
...
$ sudo reboot
Repeat the steps above:
$ uvt-kvm ssh irqjammy
$ tmux
$ iperf -s >/dev/null &
HOST $ iperf -c $(uvt-kvm ip irqjammy) -t 0 -P4 -i 5 | grep SUM
...
$ date; sudo modprobe -r virtio-net && sleep 5 && sudo modprobe virtio-net; date
Sat Oct 7 06:54:09 PM UTC 2023
Sat Oct 7 06:54:14 PM UTC 2023
HOST $ iperf -c $(uvt-kvm ip irqjammy) -t 0 -P4 -i 5 | grep SUM
WARNING: client will send traffic forever or until an external signal (e.g. SIGINT or SIGTERM) occurs to stop it
[SUM] 0.0000-5.0000 sec 8.26 GBytes 14.2 Gbits/sec
[SUM] 5.0000-10.0000 sec 8.29 GBytes 14.2 Gbits/sec
[SUM] 10.0000-15.0000 sec 8.48 GBytes 14.6 Gbits/sec
[SUM] 15.0000-20.0000 sec 8.11 GBytes 13.9 Gbits/sec
[SUM] 20.0000-25.0000 sec 8.52 GBytes 14.6 Gbits/sec
[SUM] 25.0000-30.0000 sec 8.71 GBytes 15.0 Gbits/sec
[SUM] 30.0000-35.0000 sec 1.89 GBytes 3.24 Gbits/sec <<< DROP
[SUM] 35.0000-40.0000 sec 3.86 GBytes 6.63 Gbits/sec <<< DROP
[SUM] 40.0000-45.0000 sec 8.15 GBytes 14.0 Gbits/sec
[SUM] 45.0000-50.0000 sec 8.48 GBytes 14.6 Gbits/sec
[SUM] 50.0000-55.0000 sec 7.76 GBytes 13.3 Gbits/sec
[SUM] 55.0000-60.0000 sec 7.98 GBytes 13.7 Gbits/sec
[SUM] 60.0000-65.0000 sec 8.18 GBytes 14.0 Gbits/sec
[SUM] 65.0000-70.0000 sec 8.51 GBytes 14.6 Gbits/sec
[SUM] 70.0000-75.0000 sec 9.24 GBytes 15.9 Gbits/sec
[SUM] 75.0000-80.0000 sec 9.34 GBytes 16.0 Gbits/sec
[SUM] 80.0000-85.0000 sec 9.24 GBytes 15.9 Gbits/sec
[SUM] 85.0000-90.0000 sec 9.37 GBytes 16.1 Gbits/sec
$ grep virtio.-input /proc/interrupts
40: 3509 62273 11199 1 PCI-MSI 524289-edge virtio0-input.0
42: 0 3345 67099 0 PCI-MSI 524291-edge virtio0-input.1
44: 72262 1 12327 0 PCI-MSI 524293-edge virtio0-input.2
46: 0 0 14453 54988 PCI-MSI 524295-edge virtio0-input.3
$ journalctl -b -u irqbalance.service | grep -v 'Interrupt .* ([^e]' | grep -B1 -e 'Interrupt \(40\|42\|44\|46\) .* (ethernet' -e 'removed' -e 'Hotplug' | less
Initially spread:
Oct 07 18:53:43 irqjammy irqbalance[600]: CPU number 3 numa_node is 0 (load 0)
Oct 07 18:53:43 irqjammy irqbalance[600]: Interrupt 44 node_num is -1 (ethernet/1:0)
--
Oct 07 18:53:43 irqjammy irqbalance[600]: CPU number 2 numa_node is 0 (load 0)
Oct 07 18:53:43 irqjammy irqbalance[600]: Interrupt 42 node_num is -1 (ethernet/1:2)
Oct 07 18:53:43 irqjammy irqbalance[600]: Interrupt 47 node_num is -1 (ethernet/1:0)
Oct 07 18:53:43 irqjammy irqbalance[600]: Interrupt 40 node_num is -1 (ethernet/1:0)
--
Oct 07 18:53:43 irqjammy irqbalance[600]: CPU number 1 numa_node is 0 (load 0)
Oct 07 18:53:43 irqjammy irqbalance[600]: Interrupt 46 node_num is -1 (ethernet/1:0)
--
Oct 07 18:53:43 irqjammy irqbalance[600]: Interrupt 41 node_num is -1 (ethernet/89958488:4276)
Remove/add:
Oct 07 18:54:18 irqjammy irqbalance[600]: IRQ 45 is removed from interrupts_db.
Oct 07 18:54:18 irqjammy irqbalance[600]: IRQ 43 is removed from interrupts_db.
Oct 07 18:54:18 irqjammy irqbalance[600]: IRQ 41 is removed from interrupts_db.
Oct 07 18:54:18 irqjammy irqbalance[600]: IRQ 46 is removed from interrupts_db.
Oct 07 18:54:18 irqjammy irqbalance[600]: IRQ 44 is removed from interrupts_db.
Oct 07 18:54:18 irqjammy irqbalance[600]: IRQ 42 is removed from interrupts_db.
Oct 07 18:54:18 irqjammy irqbalance[600]: IRQ 40 is removed from interrupts_db.
Oct 07 18:54:18 irqjammy irqbalance[600]: IRQ 39 is removed from interrupts_db.
Oct 07 18:54:18 irqjammy irqbalance[600]: IRQ 47 is removed from interrupts_db.
--
Oct 07 18:54:18 irqjammy irqbalance[600]: Adding IRQ 39 to database
Oct 07 18:54:18 irqjammy irqbalance[600]: Hotplug dev irq: 39 finished.
Oct 07 18:54:18 irqjammy irqbalance[600]: Adding IRQ 40 to database
Oct 07 18:54:18 irqjammy irqbalance[600]: Hotplug dev irq: 40 finished.
Oct 07 18:54:18 irqjammy irqbalance[600]: Adding IRQ 41 to database
Oct 07 18:54:18 irqjammy irqbalance[600]: Hotplug dev irq: 41 finished.
Oct 07 18:54:18 irqjammy irqbalance[600]: Adding IRQ 42 to database
Oct 07 18:54:18 irqjammy irqbalance[600]: Hotplug dev irq: 42 finished.
Oct 07 18:54:18 irqjammy irqbalance[600]: Adding IRQ 43 to database
Oct 07 18:54:18 irqjammy irqbalance[600]: Hotplug dev irq: 43 finished.
Oct 07 18:54:18 irqjammy irqbalance[600]: Adding IRQ 44 to database
Oct 07 18:54:18 irqjammy irqbalance[600]: Hotplug dev irq: 44 finished.
Oct 07 18:54:18 irqjammy irqbalance[600]: Adding IRQ 45 to database
Oct 07 18:54:18 irqjammy irqbalance[600]: Hotplug dev irq: 45 finished.
Oct 07 18:54:18 irqjammy irqbalance[600]: Adding IRQ 46 to database
Oct 07 18:54:18 irqjammy irqbalance[600]: Hotplug dev irq: 46 finished.
Oct 07 18:54:18 irqjammy irqbalance[600]: Adding IRQ 47 to database
Oct 07 18:54:18 irqjammy irqbalance[600]: Hotplug dev irq: 47 finished.
Go to the same vCPU:
Oct 07 18:54:28 irqjammy irqbalance[600]: CPU number 0 numa_node is 0 (load 418843512)
Oct 07 18:54:28 irqjammy irqbalance[600]: Interrupt 44 node_num is -1 (ethernet/418843512:9096)
--
Oct 07 18:54:28 irqjammy irqbalance[600]: Interrupt 47 node_num is -1 (ethernet/87769773:3627)
Oct 07 18:54:28 irqjammy irqbalance[600]: Interrupt 42 node_num is -1 (ethernet/67224822:2778)
Oct 07 18:54:28 irqjammy irqbalance[600]: Interrupt 46 node_num is -1 (ethernet/66498852:2748)
--
Oct 07 18:54:28 irqjammy irqbalance[600]: Interrupt 45 node_num is -1 (ethernet/19987696:7108)
Oct 07 18:54:28 irqjammy irqbalance[600]: Interrupt 40 node_num is -1 (ethernet/201359879:8321)
But *this time*, they spread:
Oct 07 18:54:53 irqjammy irqbalance[600]: CPU number 3 numa_node is 0 (load 340000000)
Oct 07 18:54:53 irqjammy irqbalance[600]: Interrupt 46 node_num is -1 (ethernet/93848319:9593)
--
Oct 07 18:54:53 irqjammy irqbalance[600]: CPU number 2 numa_node is 0 (load 200000000)
Oct 07 18:54:53 irqjammy irqbalance[600]: Interrupt 42 node_num is -1 (ethernet/199997802:6882)
--
Oct 07 18:54:53 irqjammy irqbalance[600]: CPU number 1 numa_node is 0 (load 260058698)
Oct 07 18:54:53 irqjammy irqbalance[600]: Interrupt 40 node_num is -1 (ethernet/259994880:6912)
Oct 07 18:54:53 irqjammy irqbalance[600]: CPU number 0 numa_node is 0 (load 276702268)
Oct 07 18:54:53 irqjammy irqbalance[600]: Interrupt 44 node_num is -1 (ethernet/219996952:6839)
And remain spread until end of test/load:
Oct 07 18:55:13 irqjammy irqbalance[600]: CPU number 3 numa_node is 0 (load 60000000)
Oct 07 18:55:13 irqjammy irqbalance[600]: Interrupt 46 node_num is -1 (ethernet/20134664:1892)
--
Oct 07 18:55:13 irqjammy irqbalance[600]: CPU number 2 numa_node is 0 (load 50000000)
Oct 07 18:55:13 irqjammy irqbalance[600]: Interrupt 42 node_num is -1 (ethernet/49852197:1359)
--
Oct 07 18:55:18 irqjammy irqbalance[600]: CPU number 1 numa_node is 0 (load 60000000)
Oct 07 18:55:18 irqjammy irqbalance[600]: Interrupt 40 node_num is -1 (ethernet/59999802:1354)
Oct 07 18:55:18 irqjammy irqbalance[600]: CPU number 0 numa_node is 0 (load 40146732)
Oct 07 18:55:18 irqjammy irqbalance[600]: Interrupt 44 node_num is -1 (ethernet/21885696:1288)
--
Oct 07 18:55:18 irqjammy irqbalance[600]: CPU number 3 numa_node is 0 (load 0)
Oct 07 18:55:18 irqjammy irqbalance[600]: Interrupt 46 node_num is -1 (ethernet/1:0)
--
Oct 07 18:55:18 irqjammy irqbalance[600]: CPU number 2 numa_node is 0 (load 0)
Oct 07 18:55:18 irqjammy irqbalance[600]: Interrupt 42 node_num is -1 (ethernet/1:2)
--
Oct 07 18:55:18 irqjammy irqbalance[600]: CPU number 1 numa_node is 0 (load 0)
Oct 07 18:55:18 irqjammy irqbalance[600]: Interrupt 40 node_num is -1 (ethernet/1:0)
Oct 07 18:55:18 irqjammy irqbalance[600]: CPU number 0 numa_node is 0 (load 0)
Oct 07 18:55:18 irqjammy irqbalance[600]: Interrupt 44 node_num is -1 (ethernet/1:0)
Now there _are_ rebalancing events in the log during the test:
$ journalctl -b -u irqbalance.service | grep 'Selecting irq \(40\|42\|44\|46\) for rebalancing'
...
Oct 07 18:54:23 irqjammy irqbalance[600]: Selecting irq 44 for rebalancing
Oct 07 18:54:28 irqjammy irqbalance[600]: Selecting irq 40 for rebalancing
Oct 07 18:54:43 irqjammy irqbalance[600]: Selecting irq 46 for rebalancing
Oct 07 18:55:43 irqjammy irqbalance[600]: Selecting irq 46 for rebalancing
...
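To compare runs, the rebalancing events can be tallied per IRQ. A small sketch that assumes the "Selecting irq N for rebalancing" message format shown above (the `count_rebalances` name is illustrative):

```shell
# Count "Selecting irq N for rebalancing" log lines per IRQ number.
# Reads journal lines on stdin and prints one "irq N: COUNT" line per IRQ.
count_rebalances() {
    sed -n 's/.*Selecting irq \([0-9][0-9]*\) for rebalancing.*/\1/p' |
        sort -n | uniq -c | awk '{ print "irq " $2 ": " $1 }'
}
```

Usage would be `journalctl -b -u irqbalance.service | count_rebalances`. With jammy-release this should print nothing, matching the empty grep result in the first test.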
Which confirms the source code is now entering the if, not the else:
static void move_candidate_irqs(struct irq_info *info, void *data)
...
	if ((lb_info->min_load + info->load) < delta_load + (lb_info->adjustment_load - info->load)) {
...
	} else
		return;

	log(TO_CONSOLE, LOG_INFO, "Selecting irq %d for rebalancing\n", info->irq);
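To make the condition concrete, it can be evaluated with made-up numbers. This is only a sketch of the inequality quoted above, not of the rest of the function: the `would_select` helper and all load values are hypothetical, and it uses signed shell arithmetic, so it does not model any wraparound the C code may have with unsigned types.

```shell
# Usage: would_select MIN_LOAD IRQ_LOAD DELTA_LOAD ADJUSTMENT_LOAD
# Succeeds when the quoted condition holds:
#   min_load + load < delta_load + (adjustment_load - load)
would_select() {
    min_load=$1 irq_load=$2 delta_load=$3 adjustment_load=$4
    [ $(( min_load + irq_load )) -lt $(( delta_load + adjustment_load - irq_load )) ]
}
```

For instance, `would_select 100 50 0 300` succeeds (150 < 250), while `would_select 100 150 0 300` fails (250 is not below 150), which corresponds to the else branch returning before the "Selecting irq" message is logged.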