startnat.sh doesn't use correct net devices

Bug #1794142 reported by Jeff Lane on 2018-09-24
This bug affects 1 person
Affects Status Importance Assigned to Milestone
maas-cert-server
Undecided
Unassigned

Bug Description

While watching the debug session with Michael and Cisco today we noticed that NAT wasn't happening.

This is the startnat.sh script currently:

bladernr@galactica:~/development/git/maas-cert-server$ cat usr/sbin/startnat.sh
#!/bin/sh

INTERNAL_NET=eth0
EXTERNAL_NET=eth1

if [ -f /etc/maas-cert-server/config ] ; then
    . /etc/maas-cert-server/config
fi

echo 1 > /proc/sys/net/ipv4/ip_forward
/sbin/iptables -t nat -A POSTROUTING -o $EXTERNAL_NET -j MASQUERADE
/sbin/iptables -A FORWARD -i $EXTERNAL_NET -o $INTERNAL_NET -m state \
               --state RELATED,ESTABLISHED -j ACCEPT
/sbin/iptables -A FORWARD -i $INTERNAL_NET -o $EXTERNAL_NET -j ACCEPT

And this is the result of grepping for startnat.sh across the entire codebase:
bladernr@galactica:~/development/git/maas-cert-server$ grep -r startnat.sh *
debian/changelog: * Move startnat.sh and flushnat.sh scripts from root of project to
debian/changelog: * Update startnat.sh to pull internal & external ports from config file
debian/debinstall:cp -a usr/sbin/flushnat.sh usr/sbin/startnat.sh "$BUILD_ROOT/usr/sbin/"
debian/postinst:# startnat.sh command in /etc/rc.local and, if it's present, configure systemd
lib/systemd/system/certification-nat.service:ExecStart=/usr/sbin/startnat.sh

At no point do the INTERNAL_NET and EXTERNAL_NET settings from the config file get placed into startnat.sh, so no matter what you set INTERNAL_NET and EXTERNAL_NET to in the config file, startnat.sh always uses eth0 and eth1.

Jeff Lane (bladernr) on 2018-09-24
Changed in maas-cert-server:
status: New → Confirmed
Jeff Lane (bladernr) wrote :

Before running modified maniacs-setup
ubuntu@ubuntu-s-2vcpu-4gb-nyc3-01:~$ cat /usr/sbin/startnat.sh
#!/bin/sh

INTERNAL_NET=eth0
EXTERNAL_NET=eth1
ubuntu@ubuntu-s-2vcpu-4gb-nyc3-01:~$ grep _NET /etc/maas-cert-server/config
INTERNAL_NET=eth1
EXTERNAL_NET=eth0

ubuntu@ubuntu-s-2vcpu-4gb-nyc3-01:~$ sudo ./maniacs-setup

Running modified maniacs-setup
***************************************************************************
* Identified networks:
* INTERNAL: 10.132.47.193 on eth1
* EXTERNAL: 159.203.72.148
10.17.0.5 on eth0
*
* Is this correct (Y/n)? y

* Do you want to set up this computer to automatically start NAT (Y/n)? y
Created symlink /etc/systemd/system/multi-user.target.wants/certification-nat.service → /lib/systemd/system/certification-nat.service.

ubuntu@ubuntu-s-2vcpu-4gb-nyc3-01:~$ cat /usr/sbin/startnat.sh
#!/bin/sh

INTERNAL_NET=eth1
INTERNAL_NET=eth0

ethernet

Rod Smith (rodsmith) wrote :

The following lines in startnat.sh SHOULD be setting the values correctly:

if [ -f /etc/maas-cert-server/config ] ; then
    . /etc/maas-cert-server/config
fi

Please check for the existence of /etc/maas-cert-server/config, and that it contains the correct values, on the server experiencing the problem. This variable is set in an external config file so that it can be referenced by multiple scripts (currently startnat.sh and maniacs-setup) and so that users can modify it without modifying the scripts.
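
To illustrate the override mechanism described above, here is a minimal sketch (not part of maas-cert-server) that prints the device names startnat.sh would end up using. The function name effective_nat_devices is hypothetical; the defaults and the config path are copied from the script quoted in the bug description.

```shell
#!/bin/sh
# Sketch only: mimic the "defaults, then source the config" logic of startnat.sh.
# effective_nat_devices is a hypothetical helper, not something the package ships.
effective_nat_devices() {
    config="${1:-/etc/maas-cert-server/config}"
    (
        # Hard-coded defaults, as at the top of startnat.sh
        INTERNAL_NET=eth0
        EXTERNAL_NET=eth1

        # Sourcing the config replaces the defaults with whatever it assigns
        if [ -f "$config" ] ; then
            . "$config"
        fi

        echo "INTERNAL_NET=$INTERNAL_NET EXTERNAL_NET=$EXTERNAL_NET"
    )
}

# With no argument, report on the standard config path
effective_nat_devices "$@"
```

If the config file is missing or does not assign the two variables, the output shows the eth0/eth1 defaults, which is the case worth ruling out on the affected server.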

Jeff Lane (bladernr) wrote :

Michael, can you get with Cisco and re-verify this, or verify the behaviour we think we saw today?

As Rod points out, my initial report was incorrect: INTERNAL_NET and EXTERNAL_NET aren't changed directly in startnat.sh; the script picks them up when it sources the config file.

This could well also be another layer of complexity that's missing... in light of Rod's comments, it's worth a second look to try to work out exactly why packets are not going from the SUT through the MAAS VM and out into the internet.
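
As a starting point for that second look, the two things startnat.sh is supposed to set up can be checked with a sketch along these lines. The helper names has_masquerade_rule and forwarding_enabled are hypothetical; the only assumption about iptables is that "iptables -t nat -S POSTROUTING" lists rules in the same "-A POSTROUTING -o DEV -j MASQUERADE" form in which the script adds them.

```shell
#!/bin/sh
# Sketch only: hypothetical helpers for checking the state startnat.sh should create.

# Succeeds if the rule listing on stdin masquerades traffic leaving via device $1.
# Feed it the output of: iptables -t nat -S POSTROUTING
has_masquerade_rule() {
    grep -q -F -e "-o $1 -j MASQUERADE"
}

# Succeeds if IPv4 forwarding is switched on.
# The optional argument overrides the proc file, mainly for testing.
forwarding_enabled() {
    [ "$(cat "${1:-/proc/sys/net/ipv4/ip_forward}")" = "1" ]
}
```

On the MAAS VM this could be used as "sudo iptables -t nat -S POSTROUTING | has_masquerade_rule eth0 && echo ok", substituting whichever device the config file names as EXTERNAL_NET. If both checks pass and the SUT still cannot reach the internet, the problem is likely somewhere other than the NAT rules.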

Michael Reed (mreed8855) wrote :

I will verify this issue on the Cisco setup tomorrow. I have seen this issue in the past using VMs, and changing the values of INTERNAL_NET and EXTERNAL_NET in /usr/sbin/startnat.sh has typically worked to get traffic from the SUT to the outside world via the MAAS VM. I have usually had to make this change and re-run the script after power interruptions or after rebooting the MAAS VM.

Michael Reed (mreed8855) wrote :

Some additional information

Host System 16.04.4
MAAS VM 18.04 and 18.04.1
MAAS Version 2.4.2 (also seen on 2.4 beta)

The Cisco setup is more up-to-date, but here is some additional information on an older setup where I have seen this.

$ cat version.txt
maas-cert-server:
  Installed: 0.3.1-0~201805011616~ubuntu18.04.1
  Candidate: 0.3.5-0~201809070116~ubuntu18.04.1
  Version table:
     0.3.5-0~201809070116~ubuntu18.04.1 500
        500 http://ppa.launchpad.net/hardware-certification/public/ubuntu bionic/main amd64 Packages
 *** 0.3.1-0~201805011616~ubuntu18.04.1 100
        100 /var/lib/dpkg/status
maas:
  Installed: 2.4.0~beta2-6865-gec43e47e6-0ubuntu1
  Candidate: 2.4.2-7034-g2f5deb8b8-0ubuntu1
  Version table:
     2.4.2-7034-g2f5deb8b8-0ubuntu1 500
        500 http://archive.ubuntu.com/ubuntu bionic-updates/main amd64 Packages
 *** 2.4.0~beta2-6865-gec43e47e6-0ubuntu1 500
        500 http://archive.ubuntu.com/ubuntu bionic/main amd64 Packages
        100 /var/lib/dpkg/status

Michael Reed (mreed8855) wrote :

This may be a false alarm. While going through the YAML file in /etc/netplan on the MAAS VM, I noticed that the nameserver for the internal network was configured for an external network. I had Cisco remove those settings from the 172.x.x.x network entry. I should have confirmation tomorrow whether this was in fact the issue. It now makes sense to me why the traffic was not resolved through the MAAS server once it was NAT'ed, regardless of the hack. The YAML file now resembles the following:

network:
    version: 2
    ethernets:
        ens3:
            addresses:
              - 172.16.0.1/24
            dhcp4: no
        ens8:
            dhcp4: true
