MAAS non-deterministic private-address with non-ethN interfaces
Affects | Status | Importance | Assigned to | Milestone | |
---|---|---|---|---|---|
juju-core | Fix Released | Medium | Unassigned | | |
Bug Description
On a MAAS setup, some deployed units have the network setup shown in [0]; the example is a nova-compute instance with a bonded interface. eth0-3 are part of bond0, and bond0 carries the IP address that the hostname resolves to (the hostname returned by "unit-get public-address"). However, "unit-get private-address" returns 10.0.3.1, which is the address of the default bridge set up by lxc-net. That address is unusable for this purpose, and lxc-net appears to be installed by default during preseed.
The order seems to be, as returned by the machine agent log:
2014-04-30 02:10:14 INFO juju.worker.
It appears to be using the network index order and picking the first "local-cloud" address in that order.
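The suspected behaviour can be illustrated with a minimal Python sketch. This is not juju-core's code; it only models the hypothesis above, and it assumes that "local-cloud" roughly means a private (RFC 1918) IPv4 address that is neither loopback nor link-local. The interface indexes and addresses are the ones from the listing in [0]; the function names are hypothetical.

```python
import ipaddress

# (index, name, IPv4 addresses) copied from the `ip addr` listing in [0].
INTERFACES = [
    (1, "lo", ["127.0.0.1"]),
    (7, "lxcbr0", ["10.0.3.1"]),
    (9, "virbr0", ["192.168.122.1"]),
    (16, "bond0", ["10.34.6.3"]),
]


def is_local_cloud(addr):
    """Assumed stand-in for the "local-cloud" scope: private, not loopback."""
    ip = ipaddress.ip_address(addr)
    return ip.is_private and not ip.is_loopback and not ip.is_link_local


def first_local_cloud(interfaces):
    """Walk interfaces in index order and return the first matching address."""
    for _index, name, addrs in sorted(interfaces, key=lambda item: item[0]):
        for addr in addrs:
            if is_local_cloud(addr):
                return name, addr
    return None


print(first_local_cloud(INTERFACES))
```

Under these assumptions the sketch selects lxcbr0 (10.0.3.1) ahead of bond0 (10.34.6.3) simply because lxcbr0 has the lower interface index, which matches what "unit-get private-address" returned.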
I think (but am not 100% positive) this problem began with a juju-upgrade from 1.16.4 to 1.18.1.
[0]
1: lo: <LOOPBACK,
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
inet6 ::1/128 scope host
valid_lft forever preferred_lft forever
2: eth0: <BROADCAST,
link/ether 9c:8e:99:fb:00:e6 brd ff:ff:ff:ff:ff:ff
3: eth1: <BROADCAST,
link/ether 9c:8e:99:fb:00:e6 brd ff:ff:ff:ff:ff:ff
4: eth2: <BROADCAST,
link/ether 9c:8e:99:fb:00:ea brd ff:ff:ff:ff:ff:ff
5: eth3: <BROADCAST,
link/ether 9c:8e:99:fb:00:ec brd ff:ff:ff:ff:ff:ff
7: lxcbr0: <BROADCAST,
link/ether a6:7e:62:e1:c3:02 brd ff:ff:ff:ff:ff:ff
inet 10.0.3.1/24 brd 10.0.3.255 scope global lxcbr0
inet6 fe80::a47e:
valid_lft forever preferred_lft forever
9: virbr0: <NO-CARRIER,
link/ether f2:71:d5:0a:e7:5c brd ff:ff:ff:ff:ff:ff
inet 192.168.122.1/24 brd 192.168.122.255 scope global virbr0
11: ovs-system: <BROADCAST,
link/ether 42:f6:63:b1:18:28 brd ff:ff:ff:ff:ff:ff
12: br-int: <BROADCAST,
link/ether fe:9a:21:47:94:43 brd ff:ff:ff:ff:ff:ff
inet6 fe80::3c1c:
valid_lft forever preferred_lft forever
16: bond0: <BROADCAST,
link/ether 9c:8e:99:fb:00:e6 brd ff:ff:ff:ff:ff:ff
inet 10.34.6.3/21 brd 10.34.7.255 scope global bond0
inet6 fe80::9e8e:
valid_lft forever preferred_lft forever
[cut]
tags: | added: addressability maas-provider |
Changed in juju-core: | |
status: | New → Triaged |
importance: | Undecided → High |
tags: | added: canonical-is |
tags: | added: production |
Changed in juju-core: | |
importance: | High → Medium |
tags: | added: network |
Changed in juju-core: | |
status: | Triaged → Fix Released |
It appears that this bug takes some time, or certain conditions, to manifest itself. I don't know what the trigger is, but in my environment it started affecting some hosts that had been provisioned many weeks ago, while some recently added hosts are not affected (so far).
In my case this bug has a significant impact. I am running Neutron in OpenStack, and the bug caused /etc/neutron/plugins/openvswitch/ovs_neutron_plugin.ini to have an incorrect local_ip setting, which then broke connectivity to my OpenStack instances. I could see GRE tunnels using the wrong IPs:
# ovs-vsctl show
xxxxxx
    Bridge br-int
        Port "..."
            tag: 1
            Interface "..."
        Port patch-tun
            Interface patch-tun
                type: patch
                options: {peer=patch-int}
        Port br-int
            Interface br-int
                type: internal
    Bridge br-tun
        Port "gre-9"
            Interface "gre-9"
                type: gre
                options: {in_key=flow, local_ip="192.168.122.1", out_key=flow, remote_ip="..."}
        Port br-tun
            Interface br-tun
                type: internal
        Port "gre-2"
            Interface "gre-2"
                type: gre
                options: {in_key=flow, local_ip="192.168.122.1", out_key=flow, remote_ip="..."}
To test this I reverted the change in ovs_neutron_plugin.ini and restarted neutron-plugin-openvswitch-agent, and connectivity came back immediately.
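A quick way to spot an affected host might be to check whether the configured local_ip falls inside the network that is expected to carry tunnel traffic. The following is only a rough Python sketch: it assumes local_ip lives in an [ovs] section of the ini file, the helper name is hypothetical, and the 10.34.0.0/21 network is borrowed from the bond0 address in the original report purely as an example value.

```python
import configparser
import ipaddress


def local_ip_on_network(ini_text, network, section="ovs"):
    """Return True if local_ip from the given ini text is inside `network`.

    The [ovs] section name is an assumption about the plugin config layout.
    """
    parser = configparser.ConfigParser(strict=False, interpolation=None)
    parser.read_string(ini_text)
    local_ip = parser.get(section, "local_ip")
    return ipaddress.ip_address(local_ip) in ipaddress.ip_network(network)


# Example input mirroring the bad value seen in the GRE tunnel options.
BAD_CONFIG = "[ovs]\nlocal_ip = 192.168.122.1\n"
GOOD_CONFIG = "[ovs]\nlocal_ip = 10.34.6.3\n"

print(local_ip_on_network(BAD_CONFIG, "10.34.0.0/21"))
print(local_ip_on_network(GOOD_CONFIG, "10.34.0.0/21"))
```

With the virbr0 address the check reports that local_ip is outside the expected network, while an address on the bonded interface's subnet passes.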