aws: network performance regression due to initial TCP receive buffer size change
Affects | Status | Importance | Assigned to | Milestone |
---|---|---|---|---|
linux-aws (Ubuntu) | | Undecided | Unassigned | |
Bionic | | High | Unassigned | |
Focal | | High | Unassigned | |
Groovy | | High | Unassigned | |
Bug Description
[Impact]
AWS has seen some customers report degraded networking performance after upgrading their Ubuntu instances. The regression hits hardest for customers who are using MTU=9000 (which is the default in EC2).
[Test case]
The bug was reproduced internally at AWS (no test case was provided), but it is apparently very easy to reproduce simply by measuring networking performance.
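Since no test case was provided, the following is only a minimal sketch of what "measuring networking performance" could look like. It pushes data through a local TCP connection and reports the throughput. The function name `measure_loopback_throughput` and its parameters are hypothetical, and a loopback run will not reproduce the regression itself; a real comparison would presumably run the same measurement between two EC2 instances, before and after the kernel upgrade.

```python
import socket
import threading
import time


def measure_loopback_throughput(total_bytes=8 * 1024 * 1024, chunk_size=64 * 1024):
    """Send total_bytes over a local TCP connection.

    Returns (bytes_received, elapsed_seconds). Hypothetical helper, for
    illustration only.
    """
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    server.listen(1)
    port = server.getsockname()[1]
    received = [0]  # mutable cell shared with the receiver thread

    def receiver():
        conn, _ = server.accept()
        with conn:
            while True:
                data = conn.recv(chunk_size)
                if not data:  # sender closed the connection
                    break
                received[0] += len(data)

    thread = threading.Thread(target=receiver)
    thread.start()

    payload = b"\0" * chunk_size
    start = time.perf_counter()
    with socket.create_connection(("127.0.0.1", port)) as client:
        sent = 0
        while sent < total_bytes:
            n = min(chunk_size, total_bytes - sent)
            client.sendall(payload[:n])
            sent += n
    thread.join()  # wait until the receiver has drained everything
    elapsed = time.perf_counter() - start
    server.close()
    return received[0], elapsed


if __name__ == "__main__":
    nbytes, seconds = measure_loopback_throughput()
    print(f"{nbytes * 8 / seconds / 1e6:.1f} Mbit/s")
```

For a real reproduction, the sender and receiver halves would run on separate hosts, with the receiver on an affected kernel.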
[Fix]
AWS investigated internally and found that the regression was introduced by:
a337531b942b ("tcp: up initial rmem to 128KB and SYN rwin to around 64KB")
To solve the problem we need to apply the following upstream commit that explicitly fixes the problem introduced by the commit above:
33ae7b5bb841 ("tcp: select sane initial rcvq_space.space for big MSS")
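To make the effect of the fix more concrete, here is a small Python model of the initial `rcvq_space.space` selection. It assumes, based on a reading of the upstream change rather than on anything stated in this report, that the fix caps the initial value at the smallest of the receive window, `rcv_ssthresh`, and ten segments of the advertised MSS, where previously the receive window was used directly. The helper names and the sample numbers are hypothetical and only illustrate the arithmetic.

```python
# Initial congestion window in segments (assumed to match the kernel's
# TCP_INIT_CWND constant).
TCP_INIT_CWND = 10


def advertised_mss(mtu, ip_header=20, tcp_header=20):
    """MSS for a given MTU, assuming IPv4 and no TCP options."""
    return mtu - ip_header - tcp_header


def initial_rcvq_space(rcv_wnd, rcv_ssthresh, advmss, clamp=True):
    """Model of the initial rcvq_space.space value.

    clamp=False models the assumed pre-fix behavior (use rcv_wnd directly);
    clamp=True models the assumed post-fix behavior (take the minimum).
    """
    if not clamp:
        return rcv_wnd
    return min(rcv_ssthresh, rcv_wnd, TCP_INIT_CWND * advmss)


if __name__ == "__main__":
    mss = advertised_mss(9000)  # 8960 for a jumbo-frame interface
    # Illustrative values only: a receive window larger than rcv_ssthresh.
    print(initial_rcvq_space(62496, 56596, mss, clamp=False))  # 62496
    print(initial_rcvq_space(62496, 56596, mss))               # 56596
```

In this model, the unclamped value can exceed `rcv_ssthresh`, which would be consistent with the window being unable to grow on MTU=9000 hosts. With the clamp, the starting point never exceeds what the receive queue can hold.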
[Regression potential]
This is an upstream fix that affects only the initial TCP buffer space and allows the TCP window size to be increased dynamically again, restoring the previous (correct) behavior. The regression potential is therefore minimal.
Changed in linux-aws (Ubuntu Bionic): | |
importance: | Undecided → High |
Changed in linux-aws (Ubuntu Focal): | |
importance: | Undecided → High |
Changed in linux-aws (Ubuntu Groovy): | |
importance: | Undecided → High |
Changed in linux-aws (Ubuntu Bionic): | |
status: | New → Fix Committed |
Changed in linux-aws (Ubuntu Focal): | |
status: | New → Fix Committed |
Changed in linux-aws (Ubuntu Groovy): | |
status: | New → Fix Committed |
Launchpad Janitor (janitor) wrote : | #1 |
Changed in linux-aws (Ubuntu Bionic): | |
status: | Fix Committed → Fix Released |
Launchpad Janitor (janitor) wrote : | #2 |
This bug was fixed in the package linux-aws - 5.4.0-1037.39
---------------
linux-aws (5.4.0-1037.39) focal; urgency=medium
* focal/linux-aws: 5.4.0-1037.39 -proposed tracker (LP: #1911314)
* aws: network performance regression due to initial TCP receive buffer size
change (LP: #1910200)
- tcp: select sane initial rcvq_space.space for big MSS
* Disable Atari partition support for linux-aws (LP: #1908264)
- [Config] Disable Atari partition support
* aws: xen-netfront: prevent potential error on hibernate (LP: #1906850)
- SAUCE: xen-netfront: prevent unnecessary close on hibernate
[ Ubuntu: 5.4.0-63.71 ]
* focal/linux: 5.4.0-63.71 -proposed tracker (LP: #1911333)
* overlay: permission regression in 5.4.0-51.56 due to patches related to
CVE-2020-16120 (LP: #1900141)
- ovl: do not fail because of O_NOATIME
* Focal update: v5.4.79 upstream stable release (LP: #1907151)
- net/mlx5: Use async EQ setup cleanup helpers for multiple EQs
- net/mlx5: poll cmd EQ in case of command timeout
- net/mlx5: Fix a race when moving command interface to events mode
- net/mlx5: Add retry mechanism to the command entry index allocation
* Kernel 5.4.0-56 Wi-Fi does not connect (LP: #1906770)
- mt76: fix fix ampdu locking
* [Ubuntu 21.04 FEAT] mpt3sas: Request to include the patch set which supports
topology where zoning is enabled in expander (LP: #1899802)
- scsi: mpt3sas: Define hba_port structure
- scsi: mpt3sas: Allocate memory for hba_port objects
- scsi: mpt3sas: Rearrange _scsih_
- scsi: mpt3sas: Update hba_port's sas_address & phy_mask
- scsi: mpt3sas: Get device objects using sas_address & portID
- scsi: mpt3sas: Rename transport_
- scsi: mpt3sas: Get sas_device objects using device's rphy
- scsi: mpt3sas: Update hba_port objects after host reset
- scsi: mpt3sas: Set valid PhysicalPort in SMPPassThrough
- scsi: mpt3sas: Handling HBA vSES device
- scsi: mpt3sas: Add bypass_
- scsi: mpt3sas: Handle vSES vphy object during HBA reset
- scsi: mpt3sas: Add module parameter multipath_on_hba
- scsi: mpt3sas: Bump driver version to 35.101.00.00
[ Ubuntu: 5.4.0-62.70 ]
* focal/linux: 5.4.0-62.70 -proposed tracker (LP: #1911144)
* CVE-2020-28374
- SAUCE: target: fix XCOPY NAA identifier lookup
* Packaging resync (LP: #1786013)
- update dkms package versions
-- Kelsey Skunberg <email address hidden> Wed, 13 Jan 2021 19:01:10 -0700
Changed in linux-aws (Ubuntu Focal): | |
status: | Fix Committed → Fix Released |
This bug was fixed in the package linux-aws - 4.15.0-1093.99
---------------
linux-aws (4.15.0-1093.99) bionic; urgency=medium
* bionic/linux-aws: 4.15.0-1093.99 -proposed tracker (LP: #1911275)
* aws: network performance regression due to initial TCP receive buffer size
change (LP: #1910200)
- tcp: select sane initial rcvq_space.space for big MSS
* arm64: prevent losing page dirty state (LP: #1908503)
- arm64: pgtable: Ensure dirty bit is preserved across pte_wrprotect()
* Disable Atari partition support for cloud kernels (LP: #1908264)
- [Config] Disable Atari partition support
* aws: xen-netfront: prevent potential error on hibernate (LP: #1906850)
- SAUCE: xen-netfront: prevent unnecessary close on hibernate
[ Ubuntu: 4.15.0-133.137 ]
* bionic/linux: 4.15.0-133.137 -proposed tracker (LP: #1911295)
* [drm:qxl_enc_commit [qxl]] *ERROR* head number too large or missing monitors
config: (LP: #1908219)
- qxl: remove qxl_io_log()
- qxl: move qxl_send_monitors_config()
- qxl: hook monitors_config updates into crtc, not encoder.
* Touchpad not detected on ByteSpeed C15B laptop (LP: #1906128)
- Input: i8042 - add ByteSpeed touchpad to noloop table
* vmx_nm_test in ubuntu_kvm_unit_tests interrupted on X-oracle-4.15 /
B-oracle-4.15 / X-KVM / B-KVM (LP: #1872401)
- KVM: nVMX: Always reflect #NM VM-exits to L1
* stack trace in kernel (LP: #1903596)
- net: napi: remove useless stack trace
* CVE-2020-27777
- [Config]: Set CONFIG_PPC_RTAS_FILTER
* Bionic update: upstream stable patchset 2020-12-04 (LP: #1906875)
- regulator: defer probe when trying to get voltage from unresolved supply
- ring-buffer: Fix recursion protection transitions between interrupt context
- time: Prevent undefined behaviour in timespec64_to_ns()
- nbd: don't update block size after device is started
- btrfs: sysfs: init devices outside of the chunk_mutex
- btrfs: reschedule when cloning lots of extents
- genirq: Let GENERIC_IRQ_IPI select IRQ_DOMAIN_HIERARCHY
- hv_balloon: disable warning when floor reached
- net: xfrm: fix a race condition during allocing spi
- perf tools: Add missing swap for ino_generation
- ALSA: hda: prevent undefined shift in snd_hdac_ext_bus_get_link()
- can: rx-offload: don't call kfree_skb() from IRQ context
- can: dev: can_get_echo_skb(): prevent call to kfree_skb() in hard IRQ
context
- can: dev: __can_get_echo_skb(): fix real payload length return value for RTR
frames
- can: can_create_echo_skb(): fix echo skb generation: always use skb_clone()
- can: peak_usb: add range checking in decode operations
- can: peak_usb: peak_usb_get_ts_time(): fix timestamp wrapping
- can: peak_canfd: pucan_handle_can_rx(): fix echo management when loopback is
on
- xfs: flush new eof page on truncate to avoid post-eof corruption
- Btrfs: fix missing error return if writeback for extent buffer never started
- ath9k_htc: Use appropriate rs_datalen type
- usb: gadget: goku_udc: fix potential crashes in probe
- gfs2: Free rd_bits later in gfs2_clear_rgrpd to fix use-after-free
...