[cosmic] ipoib ping with large message size failed

Bug #1801574 reported by Talat Batheesh
This bug affects 1 person
Affects: linux (Ubuntu)
Status: Confirmed
Importance: High
Assigned to: Tyler Hicks

Bug Description

Ping over an IPoIB interface hangs with large packets. This is a new regression; the same test passes on Ubuntu 18.04.
After investigating the issue, we found that commit [1] introduced it. It is not an upstream commit; it is a Canonical commit.
Could you please check with the Canonical kernel team why they reverted that commit?
To reproduce the bug, use ConnectX-3 devices with an IPoIB connection at the default MTU of 2044 and run command [2].
Is there an open Launchpad bug for this?

[1]
commit 77a24c313d21e3765b04d90521e9228a9bb6e332
Author: Tyler Hicks <email address hidden>
Date: Fri Aug 3 21:53:15 2018 +0000

    Revert "net: increase fragment memory usage limits"

    This reverts commit c2a936600f78aea00d3312ea4b66a79a4619f9b4. It
    made denial of service attacks on the IP fragment handling easier to
    carry out.

    CVE-2018-5391

    Signed-off-by: Tyler Hicks <email address hidden>
    Signed-off-by: Stefan Bader <email address hidden>

[2]
ping 13.194.22.1 -I 13.194.23.1 -s 65507
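
For context, the size in command [2] is the largest ICMP payload that fits in a single IPv4 datagram, so every echo request has to be fragmented at the 2044-byte MTU. The following is only a rough sketch of that arithmetic, assuming a 20-byte IPv4 header without options and an 8-byte ICMP header; the function name `fragment_count` is made up for illustration and is not part of any tool.

```shell
#!/bin/sh
# Estimate how many IPv4 fragments one ICMP echo request is split into.
# Assumptions: 20-byte IPv4 header (no options), 8-byte ICMP header.
fragment_count() (
    icmp_data=$1                             # value passed to "ping -s"
    mtu=$2                                   # interface MTU in bytes
    ip_payload=$((icmp_data + 8))            # ICMP header plus data
    per_fragment=$(( (mtu - 20) / 8 * 8 ))   # fragment payload, a multiple of 8
    # Ceiling division: fragments needed to carry the whole IP payload.
    echo $(( (ip_payload + per_fragment - 1) / per_fragment ))
)

fragment_count 65507 2044   # values taken from this report
```

With the values from this report the sketch prints 33, so the receiver has to hold 33 fragments per echo request until the datagram can be reassembled.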

Tags: cosmic cscc
summary: - [bionic] ipoib ping with large message size failed
+ [cosmic] ipoib ping with large message size failed
Ubuntu Kernel Bot (ubuntu-kernel-bot) wrote : Missing required logs.

This bug is missing log files that will aid in diagnosing the problem. While running an Ubuntu kernel (not a mainline or third-party kernel) please enter the following command in a terminal window:

apport-collect 1801574

and then change the status of the bug to 'Confirmed'.

If, due to the nature of the issue you have encountered, you are unable to run this command, please add a comment stating that fact and change the bug status to 'Confirmed'.

This change has been made by an automated script, maintained by the Ubuntu Kernel Team.

Changed in linux (Ubuntu):
status: New → Incomplete
tags: added: cosmic
Luke Williams (wililupy)
Changed in linux (Ubuntu):
importance: Undecided → High
assignee: nobody → Tyler Hicks (tyhicks)
Luke Williams (wililupy)
Changed in linux (Ubuntu):
status: Incomplete → Confirmed
Jeff Lane  (bladernr) wrote :

Hi Talat,

Something else occurs to me, could you list which kernels are affected? Is this correct so far:

4.15: Not affected
4.18: Affected
How about 5.0 and perhaps 4.4? (I am not sure whether 4.4 includes support for this, so that may not be worth investigating.)

And for failing kernels, could you provide the full version (per uname)? That helps too.

Jeff Lane  (bladernr) wrote :

Err... never mind 4.4 anyway. Just 5.0 would be interesting, to see if this persisted into Disco.

Tyler Hicks (tyhicks) wrote :

Hello Talat - You said that this test passes on the 18.04 LTS kernel, so it is worth noting that the commit you identified is also present in that kernel. It was released in 4.15.0-32.34:

  https://git.launchpad.net/~ubuntu-kernel/ubuntu/+source/linux/+git/bionic/commit/?id=77a24c313d21e3765b04d90521e9228a9bb6e332

All that patch does is change the defaults of the following two sysctls:

  /proc/sys/net/ipv4/ipfrag_high_thresh
  /proc/sys/net/ipv4/ipfrag_low_thresh

To temporarily go back to the old defaults (you'll lose these changes on reboot), you can run the following commands (be sure to run them in this order):

$ echo 4194304 | sudo tee /proc/sys/net/ipv4/ipfrag_high_thresh
$ echo 3145728 | sudo tee /proc/sys/net/ipv4/ipfrag_low_thresh
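
To confirm that the values took effect, the two files can simply be read back. The order of the two writes presumably matters because the kernel keeps the low threshold at or below the high one; that reading is an assumption and is not stated in this thread. A minimal read-only sketch, in which the helper name `show_ipfrag_thresholds` and the `SYSCTL_ROOT` override are hypothetical and exist only to make the snippet easy to try against a copy of the files:

```shell
#!/bin/sh
# Print the current IP fragment reassembly thresholds in sysctl.conf syntax.
# SYSCTL_ROOT is a hypothetical override; by default the live /proc/sys is read.
show_ipfrag_thresholds() (
    root=${SYSCTL_ROOT:-/proc/sys}
    for name in ipfrag_high_thresh ipfrag_low_thresh; do
        # Map /proc/sys/net/ipv4/<name> to the sysctl key net.ipv4.<name>.
        printf 'net.ipv4.%s = %s\n' "$name" "$(cat "$root/net/ipv4/$name")"
    done
)

show_ipfrag_thresholds
```

Because the output uses sysctl.conf syntax, saving it to a file under /etc/sysctl.d/ after applying the workaround would be one way to keep the values across reboots. That step is a suggestion only and has not been tested in this report.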

Talat Batheesh (talat-b87) wrote :

Hi Tyler,

We bisected the kernel and found that the mentioned commit is the root cause of this bug. If bionic contains this commit, then the bug is also present in the bionic release.

yours,
Talat

Jeff Lane  (bladernr) wrote :

Hi Tyler, do you have any update for this bug?

Tyler Hicks (tyhicks) wrote :

Hello - Talat, could you please try the workaround that I mentioned in comment #4? It should be all you need to get back to the old threshold values.

Talat Batheesh (talat-b87) wrote :

Hi,
Sure, I will try it and update.

Talat Batheesh (talat-b87) wrote :

Hi,
After trying the steps mentioned in #4 on 18.10, the issue no longer reproduces with the suggested settings.

yours,
Talat

Brad Figg (brad-figg)
tags: added: cscc