Too long names for bridges

Bug #1672327 reported by Ante Karamatić
This bug affects 5 people
Affects          Status         Importance   Assigned to       Milestone
Canonical Juju   Fix Released   Critical     Witold Krecicki
MAAS             Fix Released   Critical     Mike Pontillo

Bug Description

I have a setup in MAAS which defines 4 ethernet devices:
 - ens255f0 - single untagged vlan, single space (172.16.7.0/24)
 - ens255f1 - two vlans (2910, 2912) (172.16.10.0/24, 172.16.12.0/24)
 - bond0 - bond of two devices (ens3f0 and ens3f1) (172.16.11.0/24)

When deployed with MAAS, all interfaces come up cleanly. For instance, one can see that from cloud-init-output.log:

Cloud-init v. 0.7.5 running 'init' at Fri, 10 Mar 2017 14:46:15 +0000. Up 8.55 seconds.
ci-info: ++++++++++++++++++++++++++++++Net device info+++++++++++++++++++++++++++++++
ci-info: +---------------+------+---------------+---------------+-------------------+
ci-info: | Device | Up | Address | Mask | Hw-Address |
ci-info: +---------------+------+---------------+---------------+-------------------+
ci-info: | bond0 | True | . | . | 2c:60:0c:84:32:a5 |
ci-info: | lo | True | 127.0.0.1 | 255.0.0.0 | . |
ci-info: | ens255f1.2912 | True | 172.16.12.252 | 255.255.255.0 | 2c:60:0c:f9:74:bc |
ci-info: | ens255f1.2910 | True | 172.16.10.252 | 255.255.255.0 | 2c:60:0c:f9:74:bc |
ci-info: | bond0.2911 | True | 172.16.11.252 | 255.255.255.0 | 2c:60:0c:84:32:a5 |
ci-info: | ens3f1 | True | . | . | 2c:60:0c:84:32:a5 |
ci-info: | ens3f0 | True | . | . | 2c:60:0c:84:32:a5 |
ci-info: | ens255f0 | True | 172.16.7.106 | 255.255.255.0 | 2c:60:0c:f9:74:bb |
ci-info: | ens255f1 | True | . | . | 2c:60:0c:f9:74:bc |
ci-info: +---------------+------+---------------+---------------+-------------------+
ci-info: +++++++++++++++++++++++++++++++++Route info+++++++++++++++++++++++++++++++++
ci-info: +-------+-------------+------------+---------------+---------------+-------+
ci-info: | Route | Destination | Gateway | Genmask | Interface | Flags |
ci-info: +-------+-------------+------------+---------------+---------------+-------+
ci-info: | 0 | 0.0.0.0 | 172.16.7.1 | 0.0.0.0 | ens255f0 | UG |
ci-info: | 1 | 172.16.7.0 | 0.0.0.0 | 255.255.255.0 | ens255f0 | U |
ci-info: | 2 | 172.16.10.0 | 0.0.0.0 | 255.255.255.0 | ens255f1.2910 | U |
ci-info: | 3 | 172.16.11.0 | 0.0.0.0 | 255.255.255.0 | bond0.2911 | U |
ci-info: | 4 | 172.16.12.0 | 0.0.0.0 | 255.255.255.0 | ens255f1.2912 | U |
ci-info: +-------+-------------+------------+---------------+---------------+-------+

When deploying a container that is connected to multiple spaces through these three interfaces, juju rearranges /etc/network/interfaces. The result looks correct, but juju fails to bring up the interfaces. The juju log also says:

2017-03-10 14:49:48 ERROR juju.provisioner provisioner_task.go:707 cannot start instance for machine "0/lxd/1": unable to setup network: host machine "0" has no available device in space(s) "ceph-access-space"
2017-03-10 14:54:21 ERROR juju.provisioner provisioner_task.go:707 cannot start instance for machine "0/lxd/0": unable to find host bridge for space(s) "internal-space" for container "0/lxd/0"

Revision history for this message
Ante Karamatić (ivoks) wrote :

What is interesting is that it fails on every machine with the same setup, but it works on one other machine. That machine, however, has a different NIC setup: ens3f* and ens255f* are switched, so the bond is created from ens255f0 and ens255f1, while the VLANs are on ens3f1 and the untagged vlan is on ens3f0.

Revision history for this message
Ante Karamatić (ivoks) wrote :

root@pure-aphid:~# ifup -a
device br-ens255f1.2910 already exists; can't create bridge with the same name
run-parts: /etc/network/if-pre-up.d/bridge exited with return code 1
RTNETLINK answers: Numerical result out of range
Failed to bring up br-ens255f1.2910.
device br-ens255f1.2912 already exists; can't create bridge with the same name
run-parts: /etc/network/if-pre-up.d/bridge exited with return code 1
RTNETLINK answers: Numerical result out of range
Failed to bring up br-ens255f1.2912.

Revision history for this message
Ante Karamatić (ivoks) wrote :

OK. Root cause is identified. ip only supports interface names up to 15 chars long. This means that juju needs to create bridges with names of at most 15 chars, or we need to fix the kernel.

Once devices are renamed to drop 'ens':

root@pure-aphid:~# ifup -a

Waiting for br-255f1.2910 to get ready (MAXWAIT is 32 seconds).
device ens255f1.2912 is already a member of a bridge; can't enslave it to bridge br-255f1.2912.

Waiting for br-255f1.2912 to get ready (MAXWAIT is 32 seconds).

summary: - Juju not bringing all bridges
+ Too long names for bridges
Revision history for this message
Ante Karamatić (ivoks) wrote :

MAAS also allows creating bridges that will have labels longer than 15 chars.

Revision history for this message
Nobuto Murata (nobuto) wrote :

Agreed that it would be nice if MAAS warned users about interface names that are too long.

Related bug/discussion:
https://bugs.launchpad.net/ubuntu/+source/systemd/+bug/1567744

Revision history for this message
Ante Karamatić (ivoks) wrote :

Indeed, this is a duplicate of that bug. I'll let the juju and maas devs decide how to handle this bug.

Revision history for this message
Anastasia (anastasia-macmood) wrote :

Juju should try to create bridge names no longer than 15 characters, regardless of whether the kernel is fixed.

Revision history for this message
Anastasia (anastasia-macmood) wrote :

Triaging as High as it's becoming a common pain point.

Changed in juju:
status: New → Triaged
importance: Undecided → High
Changed in maas:
status: New → Triaged
Revision history for this message
Andres Rodriguez (andreserl) wrote :

FWIW, this is not a kernel fix. The fix is either in systemd, or in teaching the Ubuntu tools to handle names longer than 15 characters.

MAAS cannot impose a maximum number of characters on interface names, given that if there were an interface with more than 15 characters, MAAS would fail to discover it. The systemd bug doesn't say whether the fix will be to enforce the 15-character limit or to let the tools handle interface names longer than 15 characters.

That said, systemd does support names of 15+ characters. So it is not sensible to work around, in MAAS and Juju, a bug that lives somewhere else, given that MAAS and Juju could break once 15+ character NIC names start working in the system itself.

Revision history for this message
Andres Rodriguez (andreserl) wrote :

FWIW, MAAS has already run into similar issues with character limits; as such, imposing a character limit in MAAS is not ideal.

Changed in maas:
milestone: none → 2.2.0
Revision history for this message
Ante Karamatić (ivoks) wrote :

I don't think it's a systemd issue, since I reproduced it on 14.04. The maximum length of an interface name is defined in the kernel headers.

Revision history for this message
Christian Reis (kiko) wrote :

For reference, there's a good answer here: http://stackoverflow.com/questions/24932172/what-length-can-a-network-interface-name-have

Analogous issue in Neutron: https://bugs.launchpad.net/neutron/+bug/1328288

I'm pretty sure we need to be careful not to generate bridge names that are too long. Doesn't LXD also create bridges, and do we need to watch out for that?

Ryan Beisner (1chb1n)
tags: added: maas-provider uosci
Revision history for this message
John A Meinel (jameinel) wrote :

https://bugs.launchpad.net/neutron/+bug/1328288 has a link to https://git.openstack.org/cgit/openstack/neutron/commit/?id=ca7ed8f84da6962a9c2b8ad21d484931d297f31b where they fixed it by taking 6 characters of a hash of the name and appending them to as much of the name as still fits within 15 characters. That seems a reasonable workaround for us as well.
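
A minimal sketch of that kind of scheme, for illustration only. The function name `shorten_ifname`, the use of SHA-1, and the exact layout (truncated name followed by the hash) are assumptions made here; this is not the code from the Neutron commit or from Juju. It also assumes ASCII interface names.

```python
import hashlib

MAX_IFNAME_LEN = 15  # kernel limit on interface name length
HASH_LEN = 6         # number of hash characters kept (assumed layout)


def shorten_ifname(name: str) -> str:
    """Return a name of at most MAX_IFNAME_LEN characters.

    Names that already fit are returned unchanged. Longer names are
    truncated and suffixed with the first HASH_LEN hex characters of a
    SHA-1 digest of the full name, so that two long names sharing a
    prefix still map to different short names.
    """
    if len(name) <= MAX_IFNAME_LEN:
        return name
    digest = hashlib.sha1(name.encode("utf-8")).hexdigest()[:HASH_LEN]
    return name[:MAX_IFNAME_LEN - HASH_LEN] + digest


if __name__ == "__main__":
    for bridge in ("br-ens255f0", "br-ens255f1.2910", "br-ens255f1.2912"):
        print(bridge, "->", shorten_ifname(bridge))
```

The two VLAN bridges from this report share their first nine characters, which is why plain truncation would not be enough: the hash suffix is what keeps the shortened names distinct.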

John A Meinel (jameinel)
Changed in juju:
assignee: nobody → Witold Krecicki (wpk)
Chris Gregan (cgregan)
tags: added: cdo-qa-blocker
Revision history for this message
Mike Pontillo (mpontillo) wrote :

For the record, this /is/ a kernel limitation, for the moment. Even if systemd (et al.) supports names longer than 15 characters, the kernel will need to change in order to support that.

In 'include/uapi/linux/if.h' IFNAMSIZ is defined to be 16 (15 characters plus a \0 terminator)[1]. In 'include/linux/netdevice.h', `struct net_device` is defined, and IFNAMSIZ is used as the maximum length of the `name` field[2].

[1]:
http://lxr.free-electrons.com/source/include/uapi/linux/if.h?v=4.4#L26

[2]:
http://lxr.free-electrons.com/source/include/linux/netdevice.h?v=4.4#L1538
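
To make the arithmetic concrete, here is a small sketch that checks the names from this report against that limit. The helper name `fits_kernel_limit` is hypothetical; only the value of IFNAMSIZ comes from the kernel header cited above.

```python
IFNAMSIZ = 16                  # from include/uapi/linux/if.h, includes the \0
MAX_IFNAME_LEN = IFNAMSIZ - 1  # usable characters in an interface name


def fits_kernel_limit(name: str) -> bool:
    """Return True if the name fits in the kernel's name field."""
    return len(name.encode("utf-8")) <= MAX_IFNAME_LEN


if __name__ == "__main__":
    for name in ("br-ens255f1.2910", "br-255f1.2910", "br-bond0.2911"):
        print(name, len(name), fits_kernel_limit(name))
```

"br-ens255f1.2910" is 16 characters long, one over the limit, while the renamed "br-255f1.2910" is 13 characters and fits.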

Changed in juju:
milestone: none → 2.2-beta2
Changed in juju:
milestone: 2.2-beta2 → 2.2-beta3
Witold Krecicki (wpk)
Changed in juju:
status: Triaged → In Progress
James Page (james-page)
tags: added: openstack-provider
tags: added: openstack
removed: openstack-provider
Changed in maas:
milestone: 2.2.0 → 2.2.0rc3
importance: Undecided → Critical
Changed in maas:
assignee: nobody → Mike Pontillo (mpontillo)
Revision history for this message
Mike Pontillo (mpontillo) wrote :

I plan to change MAAS to implement the same default interface naming as seen in the Juju pull request for this issue[1].

[1]:
https://github.com/juju/juju/pull/7204

Changed in maas:
status: Triaged → Fix Committed
Changed in maas:
status: Fix Committed → Fix Released
Changed in juju:
importance: High → Critical
milestone: 2.2-beta3 → 2.2-beta4
Witold Krecicki (wpk)
Changed in juju:
status: In Progress → Fix Committed
Revision history for this message
Dmitrii Shcherbakov (dmitriis) wrote :

In case anybody is looking for a manual verification:

https://paste.ubuntu.com/24934816/

Changed in juju:
status: Fix Committed → Fix Released