[2.1, 2.2] MTU of parent device is sometimes set incorrectly

Bug #1662948 reported by Alexandre Ceriani on 2017-02-08
This bug affects 4 people
Affects   Importance  Assigned to
MAAS      High        Mike Pontillo
2.1       Medium      Unassigned
curtin    Undecided   Unassigned

Bug Description

With bonding, when setting an MTU of 1500 on bond1.10 and an MTU of 9000 on bond1.20, the results vary. Sometimes both are set correctly, and sometimes both end up at 1500. N.B. bond1 itself is not configured; there are no networks on bond1.

It seems possibly related to the following bug: https://bugs.launchpad.net/debian/+source/vlan/+bug/1224007
which seems to imply that the raw device MTU should be set first.

I tried setting the lowest VLAN (10) to 9000 and the other (20) to 1500, but the issue is still seen.

This was seen in an OpenStack deployment with Juju as well as a manual deployment from MAAS GUI.

It is always possible to fix manually :
"ifconfig bond1 mtu 9000"
"ifconfig bond1.20 mtu 9000"
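
The manual fix above raises the parent before the VLAN. A minimal sketch of that ordering, assuming the iproute2 equivalent of the ifconfig calls (`ip link set dev <name> mtu <n>`); the helper name `mtu_fix_commands` is hypothetical and it only builds the command strings, it does not run them:

```python
def mtu_fix_commands(parent, vlan_mtus):
    """Build `ip link` commands that set the parent MTU first, then each VLAN.

    parent: name of the raw device, e.g. "bond1".
    vlan_mtus: mapping of VLAN id -> desired MTU.
    The parent gets the largest MTU requested by any VLAN, since a VLAN
    cannot carry frames larger than its parent allows.
    """
    parent_mtu = max(vlan_mtus.values())
    commands = [f"ip link set dev {parent} mtu {parent_mtu}"]
    for vlan_id, mtu in sorted(vlan_mtus.items()):
        commands.append(f"ip link set dev {parent}.{vlan_id} mtu {mtu}")
    return commands


# Values from this report: bond1.10 at 1500 and bond1.20 at 9000.
commands = mtu_fix_commands("bond1", {10: 1500, 20: 9000})
```

The commands would still need to be run as root on the deployed node, and they do not persist across reboots.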

The deployed image is Ubuntu 16.04 with kernel 4.4.0-62-generic
The version of the package "vlan" is 1.9-3.2

Tags: sts

Alexandre Ceriani (aceriani) wrote :
Blake Rouse (blake-rouse) wrote :

Thanks for attaching all your logs. In this case we need more information about the deploying node.

Can you please attach the output of:

maas machine get-curtin-config {system_id}

Along with the output of the deployed network configuration:

cat /etc/network/interfaces

Changed in maas:
status: New → Incomplete
Alexandre Ceriani (aceriani) wrote :

Here are the files you requested. I think you are onto something with this: I checked quickly, and it seems MAAS reports the correct MTU config.
I attached the files for a working and a non-working node.

tags: added: sts
Alexandre Ceriani (aceriani) wrote :

Attached a sosreport after a new deployment. I will attach MAAS GUI captures afterwards.

Alexandre Ceriani (aceriani) wrote :

Attached some captures of the configuration; let me know if you want more information.
I only used the GUI for this deployment, but I can do the same in the CLI if you think it might be any different.

Alexandre Ceriani (aceriani) wrote :

Also, sorry for the confusion, but the files I uploaded have VLAN 10 set to 9000 and VLAN 20 set to 1500.

So I would expect to find bond1 and bond1.10 with MTU of 9000 and bond1.20 with an MTU of 1500.

Blake Rouse (blake-rouse) wrote :

I think this might be caused by the order in which you are setting the MTU. That would be an issue with MAAS if it's updating the parent incorrectly.

Can you provide the exact order of the API calls that you are performing? You also say that one node was correct and the other was incorrect. What did you do differently to get one correct and the other incorrect?

Alexandre Ceriani (aceriani) wrote :

Hi Blake,
I did everything through the GUI, not through the API, so I would expect the calls to always be the same.
I attached the configuration related to MTU in "MAAS_GUI_captures.zip".
I did multiple deployments on 5 nodes at a time.
I tried with the "default" and "ga" kernels, always with Ubuntu 16.04.
I am running MAAS 2.1.3 on Ubuntu 16.04.

The number of nodes where the MTU is not set correctly changes from one deployment to another; I even had a deployment where all 5 nodes were set correctly. I do absolutely nothing different. I select the nodes, release them with a quick erase of the drives (I will try a full erase just in case), and redeploy. The result is inconsistent between deployments, and I am not editing any parameter on the nodes.

Alexandre Ceriani (aceriani) wrote :

It seems that rebooting the nodes solves the issue; I strongly suspect the bug only occurs on the first boot.
I rebooted the 5 nodes 3 times: the one in error became correct, and none was seen with the wrong MTU.

Blake Rouse (blake-rouse) wrote :

Adding the curtin team to see if they can make sense of why it works after reboot.

Changed in maas:
status: Incomplete → Confirmed
Ryan Harper (raharper) wrote :

The incoming configs match the eni rendered in both the "correct" and "bad" configuration files; that is the input is generating expected output.

If the desire is to ensure that bond1 has an mtu of 9000, then the underlying bond1 config should have mtu: 9000, rather than mtu 1500.

Note that curtin injects ifupdown hook scripts to ensure that the underlying devices are configured with a higher mtu if other stacked interfaces need them; however, in your configuration you're mixing two different mtus on the same device, so it's always going to result in the mtu being set to that of the last vlan brought up.

Even if things appear stable after multiple reboots, there's no guarantee that the next reboot won't result in a different configuration.

I believe this is an invalid configuration.

On Tue, Feb 14, 2017 at 10:25 AM, Ryan Harper <email address hidden>
wrote:

Let me clarify: the generated configuration produces unpredictable results,
since the underlying devices do not have their MTU configured.

The parent devices should have an MTU >= the MTU setting of subinterfaces.

In the example config, we should have mtu: 9000 on enp3s0f0, enp3s0f1,
and bond1

This allows bond1.10 of mtu 9000 to function, and bond1.20 with mtu of 1500
to send smaller packets.

That should result in the correct configuration on first-boot and
subsequent boots.
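
The rule above (each parent's MTU must be >= the MTU of its sub-interfaces) can be checked mechanically. The following is only a sketch under stated assumptions: the dictionary layout and the name `find_mtu_violations` are hypothetical, not curtin's or MAAS's actual data model, and the 1500 values on the two NICs are assumed for illustration.

```python
def find_mtu_violations(interfaces):
    """Return (parent, parent_mtu, child, child_mtu) for each parent whose
    MTU is smaller than that of an interface stacked directly on it.

    interfaces: mapping of name -> {"mtu": int, "parents": [names]}.
    """
    violations = []
    for name, cfg in sorted(interfaces.items()):
        for parent in cfg.get("parents", []):
            parent_mtu = interfaces[parent]["mtu"]
            if parent_mtu < cfg["mtu"]:
                violations.append((parent, parent_mtu, name, cfg["mtu"]))
    return violations


# Hypothetical model of the reported setup: bond1 at 1500 with a 9000 VLAN.
reported = {
    "enp3s0f0": {"mtu": 1500, "parents": []},
    "enp3s0f1": {"mtu": 1500, "parents": []},
    "bond1": {"mtu": 1500, "parents": ["enp3s0f0", "enp3s0f1"]},
    "bond1.10": {"mtu": 9000, "parents": ["bond1"]},
    "bond1.20": {"mtu": 1500, "parents": ["bond1"]},
}
violations = find_mtu_violations(reported)
```

The check only compares direct parents, so it would need to be rerun after raising bond1 to also flag the two NICs underneath it.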


Isn't that a bug in MAAS, then, since correct configuration of the MTU on the parent device is not guaranteed?

My understanding is that I can't configure this correctly from the GUI.
If it is possible to configure it correctly through the CLI or directly in a configuration file, I believe the same should be possible through the GUI. Am I wrong?

summary: - MTU is not set correctly on VLAN if raw device is unconfigured
+ [2.1, 2.2] MTU of parent device is sometimes set incorrectly
Changed in maas:
importance: Undecided → Medium
milestone: none → 2.2.0
Bryan Quigley (bryanquigley) wrote :

Could this be worked around by setting the MTU to 9000 on the default fabric? (And does that not work if you have multiple capabilities (1500 and 9000) in the setup?)

IIUC the curtin bug can be closed?

Changed in maas:
importance: Medium → High
status: Confirmed → Triaged
assignee: nobody → Mike Pontillo (mpontillo)
status: Triaged → In Progress
Mike Pontillo (mpontillo) wrote :

@bryanquigley, it would be nice if this validation could be done in curtin, since it's a bit expensive to do it in MAAS. But I've proposed a fix to MAAS 2.2 anyway, which will make sure the effective MTU of any parent interface considers all of its children, and uses the maximum MTU for a child interface.
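
As an illustration of the approach described above (not the actual MAAS patch), here is a minimal sketch that computes an effective MTU for each interface as the maximum of its own MTU and those of everything stacked on top of it. The data layout and the name `effective_mtus` are assumptions, and the interface graph is assumed to be acyclic.

```python
def effective_mtus(interfaces):
    """Return name -> effective MTU: the larger of the interface's own MTU
    and the effective MTU of every child stacked on it.

    interfaces: mapping of name -> {"mtu": int, "parents": [names]}.
    """
    # Invert the parent links so each interface knows its children.
    children = {name: [] for name in interfaces}
    for name, cfg in interfaces.items():
        for parent in cfg.get("parents", []):
            children[parent].append(name)

    resolved = {}

    def resolve(name):
        # Memoised depth-first walk over the children of `name`.
        if name not in resolved:
            own = interfaces[name]["mtu"]
            resolved[name] = max([own] + [resolve(c) for c in children[name]])
        return resolved[name]

    for name in interfaces:
        resolve(name)
    return resolved


# Hypothetical model of the reported setup (NIC MTUs are assumed values).
config = {
    "enp3s0f0": {"mtu": 1500, "parents": []},
    "enp3s0f1": {"mtu": 1500, "parents": []},
    "bond1": {"mtu": 1500, "parents": ["enp3s0f0", "enp3s0f1"]},
    "bond1.10": {"mtu": 9000, "parents": ["bond1"]},
    "bond1.20": {"mtu": 1500, "parents": ["bond1"]},
}
effective = effective_mtus(config)
```

With this model, bond1 and both NICs resolve to 9000 while bond1.20 keeps 1500, which matches the configuration Ryan Harper recommended earlier in the thread.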

Changed in maas:
status: In Progress → Fix Committed
Changed in maas:
milestone: 2.2.0 → 2.2.0rc1
Changed in maas:
status: Fix Committed → Fix Released
Bryan Quigley (bryanquigley) wrote :

Thanks! I'm going to close the curtin task as I believe this is fixed completely in MAAS. Please do correct me if I'm wrong.

Changed in curtin:
status: New → Invalid