Delay sending data after wireless roaming between APs

Bug #1944426 reported by Mjohnson459
Affects: linux-signed-hwe-5.11 (Ubuntu)
Status: New
Importance: Undecided
Assigned to: Unassigned

Bug Description

## Summary
Any time I roam from one access point to another (same SSID), I start receiving packets immediately after the roam succeeds, but there is a delay of roughly 1 second before I can send packets out. I have seen this with the multiple configurations listed below. I started seeing this delay after upgrading from Ubuntu 16.04 to 18.04. Using the Debian packages from https://kernel.ubuntu.com/~kernel-ppa/mainline/, I can see the bug was introduced between v4.19 and v4.20-rc1.

## Configurations
I have tested multiple laptops and machines with different software to narrow down where the bug might be. All of these configurations showed the bug.
- Distro (kernel): Ubuntu 20.04 (5.4, 5.8, 5.11, 5.14-rc7), Kali 2021.2 (5.10)
- Hardware (driver): Intel (iwlwifi), Qualcomm (ath10k), Realtek
- Supplicant: iwd and wpa_supplicant
- Manager: iwd, systemd-networkd, NetworkManager
- Data: ping, iperf3 (TCP and UDP), custom Python UDP script
- APs: Meraki MR46, TP-Link Deco

## Reproduction
The bug is relatively easy to reproduce if you have multiple APs on the same SSID available. The simplest way is to start a `ping -i 0.2 <any ip>` and then walk between the APs until the client roams. It is also possible to force a roam with `wpa_cli roam <BSSID>`. For more accurate testing I used a simple Python script that spams UDP packets, with `tcpdump` recording the packets for viewing in Wireshark.
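
The original script is not attached, so the following is only a minimal sketch of what such a UDP sender might look like. The function names, the payload layout (a sequence number plus a send timestamp), and the default interval are all assumptions made for illustration.

```python
import socket
import struct
import time

# Hypothetical payload layout: 4-byte sequence number + 8-byte send time,
# both in network byte order, so gaps are easy to spot in a capture.
PAYLOAD_FORMAT = "!Id"


def make_payload(seq: int, sent_at: float) -> bytes:
    """Pack a sequence number and a timestamp into a 12-byte datagram."""
    return struct.pack(PAYLOAD_FORMAT, seq, sent_at)


def parse_payload(data: bytes) -> tuple:
    """Unpack a datagram produced by make_payload into (seq, sent_at)."""
    return struct.unpack(PAYLOAD_FORMAT, data)


def send_udp_stream(host: str, port: int, interval: float = 0.01,
                    count: int = 1000) -> int:
    """Send `count` numbered datagrams to (host, port), one per `interval` s."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        for seq in range(count):
            sock.sendto(make_payload(seq, time.time()), (host, port))
            time.sleep(interval)
    return count
```

Calling something like `send_udp_stream("<any ip>", 9999)` while roaming, with `tcpdump` capturing on the wireless interface, should show the outgoing sequence numbers stall for the duration of the delay. The port number here is arbitrary.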

## Example output
Here is the output of the simplest test that still shows the issue
(ping + tcpdump + iwd + 5.11.0-27-generic):
[https://pastebin.com/92TKKktb](https://pastebin.com/92TKKktb)

My naive tl;dr of that data is:
- 30.322638 - we start to roam, which falls between icmp_seq=121 and icmp_seq=122
- 30.415411 - roam is complete
- 30.424277 - iwd is sending and receiving neighbor reports over the link
- 31.358491 - an ARP request is sent out (why is the ARP cache cleared?)
- 31.367930 - ARP response - packets start being sent again
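
To make the gaps in that timeline explicit, here is a small sketch that computes the time between consecutive events. The timestamps are copied from the list above; the event labels and the `gaps` helper are just illustrative names.

```python
from decimal import Decimal

# Timestamps (seconds) taken from the tl;dr list above; Decimal avoids
# floating-point noise in the differences.
events = [
    ("roam starts", Decimal("30.322638")),
    ("roam complete", Decimal("30.415411")),
    ("neighbor reports exchanged", Decimal("30.424277")),
    ("ARP request sent", Decimal("31.358491")),
    ("ARP response received", Decimal("31.367930")),
]


def gaps(timeline):
    """Return (label, delta) pairs for each pair of consecutive events."""
    return [
        (f"{name_a} -> {name_b}", t_b - t_a)
        for (name_a, t_a), (name_b, t_b) in zip(timeline, timeline[1:])
    ]


for label, delta in gaps(events):
    print(f"{label}: {delta} s")

# Time from the start of the roam until outgoing traffic resumes.
total = events[-1][1] - events[0][1]
print(f"roam starts -> ARP response received: {total} s")
```

The roam itself takes under 0.1 s; almost all of the roughly 1 second outage is the wait between the roam completing and the ARP request going out (about 0.94 s).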

ProblemType: Bug
DistroRelease: Ubuntu 20.04
Package: linux-image-5.11.0-34-generic 5.11.0-34.36~20.04.1
ProcVersionSignature: Ubuntu 5.11.0-34.36~20.04.1-generic 5.11.22
Uname: Linux 5.11.0-34-generic x86_64
NonfreeKernelModules: nvidia_modeset nvidia
ApportVersion: 2.20.11-0ubuntu27.20
Architecture: amd64
CasperMD5CheckResult: skip
CurrentDesktop: ubuntu:GNOME
Date: Tue Sep 21 10:12:40 2021
InstallationDate: Installed on 2019-08-09 (773 days ago)
InstallationMedia: Ubuntu 18.04.2 LTS "Bionic Beaver" - Release amd64 (20190210)
SourcePackage: linux-signed-hwe-5.11
UpgradeStatus: No upgrade log present (probably fresh install)

Mjohnson459 (mjohnson459) wrote :

The patch proposed and accepted here is a partial fix for this bug.
https://marc.info/?l=linux-netdev&m=163579523718765

It adds a kernel option that lets the Wi-Fi supplicant choose not to clear the ARP cache when roaming.

https://github.com/torvalds/linux/commit/fcdb44d08a95003c3d040aecdee286156ec6f34e
https://github.com/torvalds/linux/commit/18ac597af25e9760b76471524096f5b29eb820e6
https://github.com/torvalds/linux/commit/f86ca07eb5310a1bdc7032458c7f76862f5a1552

The iwd supplicant then added a patch that uses this option, which is the other half of the fix:

commit 873924a027ad2166436b8117a6bb84ce980ad7f3
Author: James Prestwood <email address hidden>
Date: Wed Nov 3 15:15:01 2021 -0700

    station: set evict_nocarrier sysfs option during roaming

    If the kernel supports evict_nocarrier set this during the roam
    to prevent packet delays post roam.
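
For anyone wanting to experiment outside iwd, the idea can be sketched roughly as below. This is not iwd's implementation (iwd is written in C). It assumes the kernel exposes the setting as `net.ipv4.conf.<iface>.arp_evict_nocarrier` under `/proc/sys`, which is my reading of the linked kernel commits, and the helper names are hypothetical.

```python
from contextlib import contextmanager
from pathlib import Path


def evict_nocarrier_path(iface: str, proc_root: str = "/proc/sys") -> Path:
    """Build the assumed sysctl path for a given interface."""
    return (Path(proc_root) / "net" / "ipv4" / "conf" / iface
            / "arp_evict_nocarrier")


@contextmanager
def keep_arp_cache_during_roam(iface: str, proc_root: str = "/proc/sys"):
    """Temporarily write 0 to the option, restoring the old value afterwards.

    Yields True if the option exists and was changed, or False if the
    kernel does not provide it (in which case nothing is touched).
    """
    path = evict_nocarrier_path(iface, proc_root)
    if not path.exists():
        yield False
        return
    previous = path.read_text().strip()
    path.write_text("0\n")
    try:
        yield True
    finally:
        # Restore whatever value was set before the roam.
        path.write_text(previous + "\n")
```

A roam would then be wrapped as `with keep_arp_cache_during_roam("wlan0"): ...`, where `wlan0` is a placeholder interface name. Writing under `/proc/sys` needs root, and the `proc_root` parameter exists only so the helper can be tried against a scratch directory.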
