propagate source routing rules to container machines (LXD, KVM) as of MAAS 2.6

Bug #1829150 reported by Dmitrii Shcherbakov on 2019-05-15
This bug affects 7 people
Affects: juju
Importance: High
Assigned to: Joseph Phillips

Bug Description

As of MAAS 2.6, MAAS provides source routing configuration to machines so that they send reply traffic out of the same interface on which the original traffic was received.

Specifically, for responses it will work for TCP sockets bound to a specific address or 0.0.0.0 and for UDP sockets bound to specific addresses (not 0.0.0.0).

https://github.com/maas/maas/commit/442d47053e6f96bf5a94904f16968e9e5e5c965c
https://github.com/maas/maas/commit/45f2632b8164f105eab69baa88ee401cf0f68b56
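The TCP/UDP distinction above comes down to whether a reply has a fixed source address that a "from <address>" policy rule can match. A minimal sketch of the UDP case follows; the helper names are hypothetical and only the standard socket module is used. A UDP socket bound to a specific address always replies from that address, while one bound to 0.0.0.0 leaves the source address to the kernel's route lookup, so the reply may leave via the main table's default route instead.

```python
import socket


def bound_udp_socket(local_addr: str, port: int = 0) -> socket.socket:
    """Return a UDP socket bound to one specific local address.

    Datagrams sent from this socket carry local_addr as their source,
    so a source-based policy rule for that address can match them.
    (Hypothetical helper, for illustration only.)
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind((local_addr, port))
    return sock


def is_wildcard_bound(sock: socket.socket) -> bool:
    """True if the socket is bound to 0.0.0.0, i.e. has no fixed source address."""
    return sock.getsockname()[0] == "0.0.0.0"
```

A TCP listener on 0.0.0.0 does not have this problem because each accepted connection is tied to the local address the connection arrived on.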

Example netplan (routing-policy is only added for interfaces that have an address with a subnet that has a default gateway defined in MAAS):
https://paste.ubuntu.com/p/rm9wD949G2/
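As a rough illustration of the condition in parentheses, the sketch below builds the per-interface netplan keys only when a gateway is known for the interface's subnet. The "routes" and "routing-policy" keys and their "to", "via", "from" and "table" fields are netplan's; the function name, the table number and the exact set of entries are assumptions and may differ from what MAAS actually emits in the paste above.

```python
import ipaddress
from typing import Optional


def routing_stanza(address_cidr: str, gateway: Optional[str], table: int) -> dict:
    """Build netplan routing keys for one interface (hypothetical helper).

    Returns an empty dict when the subnet has no default gateway, which
    mirrors the rule that routing-policy is only added for interfaces
    whose subnet has a gateway defined.
    """
    if gateway is None:
        return {}
    # Derive the subnet (e.g. 10.0.0.0/24) from the interface address.
    subnet = ipaddress.ip_interface(address_cidr).network
    return {
        # Default route that lives in the interface's own table.
        "routes": [{"to": "0.0.0.0/0", "via": gateway, "table": table}],
        # Traffic sourced from this subnet consults that table.
        "routing-policy": [{"from": str(subnet), "table": table}],
    }
```

For example, routing_stanza("10.0.0.5/24", "10.0.0.1", 1) yields one default route in table 1 and one rule matching traffic from 10.0.0.0/24.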

This information needs to be propagated to LXD containers and KVMs created by Juju machine agents if they are associated with relevant spaces.

Changed in juju:
status: New → Triaged
importance: Undecided → High
milestone: none → 2.7-beta1
Changed in juju:
milestone: 2.7-beta1 → 2.7-rc1
Changed in juju:
milestone: 2.7-rc1 → 2.7.1
Changed in juju:
milestone: 2.7.1 → 2.7.2
Ante Karamatić (ivoks) on 2020-01-17
tags: added: sts
José Pekkarinen (koalinux) wrote :

Subscribed field critical.

Changed in juju:
assignee: nobody → Joseph Phillips (manadart)
John A Meinel (jameinel) wrote :

This feels like a potentially dangerous feature request being hidden as a field-critical bugfix. I have some concern that rushing it into a 2.7.2 release is just as likely to bite us: someone else may not realize it affects them, and something will break when they upgrade.

There are things like tracking the "next table route id". And if a given deployment has been adding routes directly, then we will be interacting with their rules. (I guess hopefully they wouldn't add source routing in MAAS if they were setting source routing directly.)
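To make the table id concern concrete, here is a minimal sketch of picking the next free routing table id while avoiding ids already present on the machine. The function name and the 1 to 252 search range are assumptions for illustration; the reserved ids are the Linux defaults (0 unspec, 253 default, 254 main, 255 local). It does not address the harder question of how to discover rules an operator added by hand.

```python
# Linux reserves these table ids: 0 (unspec), 253 (default), 254 (main), 255 (local).
RESERVED_TABLES = {0, 253, 254, 255}


def next_table_id(in_use, start=1, limit=252):
    """Return the lowest table id in [start, limit] that is neither
    reserved nor already in use (hypothetical helper)."""
    taken = set(in_use) | RESERVED_TABLES
    for candidate in range(start, limit + 1):
        if candidate not in taken:
            return candidate
    raise RuntimeError("no free routing table id")
```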

There are also questions of things like "what platforms support this". I'd be much more comfortable with "this is high priority, and needs to preempt some of our existing roadmap" than "push out a critical fix as fast as possible".

John A Meinel (jameinel) wrote :

Also, if we are dealing with series support, what happens if the container we are deploying has a different series than the host machine (a bionic container on a xenial machine, or a xenial container on a bionic machine)?
I think these questions are all answerable, but a critical bugfix that has to go out before the weekend is going to have a hard time being aware of them all, much less answering them.

Richard Harding (rharding) wrote :

Just to clarify, in the side email around this the main question was "if this isn't going to work in Juju right now what would be a workaround?". So the goal is to investigate what the new feature in MAAS is, how it works, is it as trivial as adding to things we already proxy into containers on a MAAS environment, and if not, what would be a suitable workaround path until we can plot out a proper update.

If it is a small update to the existing information set into containers, let's see if we can get it into the upcoming 2.7.2, but if not, let's help with workaround steps until we can get it in.

Joseph Phillips (manadart) wrote :

It looks like we are going to need to do the following.

- Expand the network.InterfaceInfo type so it can hold the routing data.
- Update gomaasapi to get this data from MAAS.
- Store the data when we run the instance poller.
- Pass the device information to the container manager after bridge policy runs.
- Include the InterfaceInfo routing information when we render Netplan in the container.
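The data flow in the list above can be sketched roughly as follows. This is a Python stand-in, not Juju code: Juju's network.InterfaceInfo is a Go type, and the field names, the RoutingRule type and both helper functions here are hypothetical. The sketch only shows routing data attached to a host interface being carried over to a container device and then included when the container's netplan is rendered.

```python
from dataclasses import dataclass
from typing import Iterable, Tuple


@dataclass(frozen=True)
class RoutingRule:
    source: str   # subnet CIDR the rule matches on
    gateway: str  # default gateway for that subnet
    table: int    # routing table id


@dataclass(frozen=True)
class InterfaceInfo:
    """Hypothetical stand-in for an interface record that can hold routing data."""
    name: str
    address: str                          # address in CIDR form
    parent: str = ""                      # host bridge, for container devices
    routing: Tuple[RoutingRule, ...] = ()


def for_container(bridge: InterfaceInfo, nic_name: str, address: str) -> InterfaceInfo:
    """Create the container device attached to a host bridge, inheriting its routing."""
    return InterfaceInfo(name=nic_name, address=address,
                         parent=bridge.name, routing=bridge.routing)


def render_netplan(devices: Iterable[InterfaceInfo]) -> dict:
    """Render a netplan-shaped dict, adding routing keys only where rules exist."""
    ethernets = {}
    for dev in devices:
        cfg = {"addresses": [dev.address]}
        if dev.routing:
            cfg["routes"] = [{"to": "0.0.0.0/0", "via": r.gateway, "table": r.table}
                             for r in dev.routing]
            cfg["routing-policy"] = [{"from": r.source, "table": r.table}
                                     for r in dev.routing]
        ethernets[dev.name] = cfg
    return {"network": {"version": 2, "ethernets": ethernets}}
```

Keeping the rules on the interface record means the container manager does not need to query MAAS again; it only copies what the instance poller stored.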

Richard Harding (rharding) wrote :

Given the updates required and the validation we'd need to do this isn't going to be able to be a quick fix for the 2.7.2 release that's upcoming. Our expectation is it'll take several PRs and QA of that to make sure we've got a solid fix.

In light of that, to help work around the issue I wanted to see if there's a way to leverage a cloud-init script that would run on containers (that are bionic+) to help seed the routing information. There was also talk of a possible charm solution as a subordinate or the like. I think we need to explore these while the work in Juju goes forward.
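One possible shape for the cloud-init workaround is sketched below, under stated assumptions: the function names, device name and table number are hypothetical, and this is not an agreed design. It generates the iproute2 commands for one interface and wraps them in a cloud-init "runcmd" list. Note that runcmd runs on first boot only and rules added with ip are not persistent, so a real workaround would also need to survive reboots.

```python
def policy_routing_commands(device: str, subnet: str, gateway: str, table: int) -> list:
    """Return iproute2 commands that send traffic sourced from `subnet`
    out through `gateway` on `device` (hypothetical helper)."""
    return [
        # Default route in a dedicated table for this interface.
        f"ip route add default via {gateway} dev {device} table {table}",
        # Consult that table for packets whose source is in the subnet.
        f"ip rule add from {subnet} table {table}",
    ]


def cloud_init_userdata(commands: list) -> dict:
    """Wrap shell commands in a cloud-config mapping using the runcmd key."""
    return {"runcmd": list(commands)}
```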

Changed in juju:
milestone: 2.7.2 → 2.8-beta1

On Thu, 2020-01-30 at 16:11 +0000, Richard Harding wrote:
> Given the updates required and the validation we'd need to do this
> isn't going to be able to be a quick fix for the 2.7.2 release that's
> upcoming. Our expectation is it'll take several PRs and QA of that to
> make sure we've got a solid fix.
>
> In light of that, to help work around the issue I wanted to see if
> there's a way to leverage a cloud-init script that would run on
> containers (that are bionic+) to help seed the routing information.
> There was also talk of a possible charm solution as a subordinate or
> the like.

Some clouds use this charm as a workaround.
https://jaas.ai/u/canonical-bootstack/policy-routing

Ian Booth (wallyworld) on 2020-04-02
Changed in juju:
milestone: 2.8-beta1 → 2.8.1
Tim Penhey (thumper) wrote :

Taking field-critical off as this is a feature request.

Changed in juju:
milestone: 2.8.1 → 2.8-next