Comment 64 for bug 1707999

Andres Rodriguez (andreserl) wrote : Re: [Bug 1707999] Re: [SRU] iPXE doesn't handle NAK requests when multiple DHCP server's offer

No need to re-verify xenial; only zesty is affected.

On Wed, Nov 29, 2017 at 12:10 PM, Jason Hobbs <email address hidden>
wrote:

> Then we would need a new version for xenial too right, and have to
> re-verify it? Can we just skip zesty?
>
> On Wed, Nov 29, 2017 at 5:45 PM, Steve Langasek <email address hidden>
> wrote:
>
> > No, you need to upload with a new version number. You can't reuse a
> > version number in launchpad.
> >
> > --
> > You received this bug notification because you are subscribed to the bug
> > report.
> > https://bugs.launchpad.net/bugs/1707999
> >
> > Title:
> > [SRU] iPXE doesn't handle NAK requests when multiple DHCP server's
> > offer
> >
> > Status in MAAS:
> > Invalid
> > Status in MAAS 2.2 series:
> > Invalid
> > Status in ipxe package in Ubuntu:
> > Fix Released
> > Status in ipxe source package in Xenial:
> > Fix Committed
> > Status in ipxe source package in Zesty:
> > Fix Committed
> > Status in ipxe source package in Artful:
> > Fix Released
> > Status in ipxe source package in Bionic:
> > Fix Released
> >
> > Bug description:
> > [Impact]
> > When there are multiple DHCP servers on the network, iPXE doesn't handle
> > NAKs from the DHCP servers. This causes iPXE to get blocked without
> > attempting to re-discover, hence never obtaining an IP address.
> >
> > For example, in a MAAS HA environment with a DHCP master/slave
> > configuration, the machine fails to PXE boot because at a certain
> > point, the DHCP server is not fully in sync, which causes iPXE to
> > receive a NAK. This prevents the machine from PXE booting.
> >
> > [Test case]
> > The easiest way:
> > 1. Install MAAS with two rack controllers
> > 2. Configure HA
> > 3. PXE boot KVMs.
> >
> > [Regression Potential]
> > Minimal. This only ensures that iPXE attempts to re-discover the network
> > when it receives a NAK.
> >
> > [Original bug report]
> > A VM failed to PXE boot after receiving multiple DHCP offers.
> >
> > You can see this here on a log from the secondary controller:
> > http://paste.ubuntu.com/25221939/
> >
> > The node is offered both 10.245.208.201 and 10.245.208.120, tries to
> > get 10.245.208.120, and is refused.
> >
> > One strange thing is that it seems like the DHCP servers on both the
> > primary controller and the secondary controller are responding. The
> > primary controller's log doesn't have the offer for 10.245.208.120 - only
> > the offer for 10.245.208.201:
> > http://paste.ubuntu.com/25221952/
> >
> > This is in an HA setup: region APIs are at 10.245.208.30,
> > 10.245.208.31 and 10.245.208.32. We're using hacluster to load
> > balance, and a VIP in front at 10.245.208.33. There are rack
> > controllers on 10.245.208.30 and 10.245.208.31. For the untagged vlan
> > this VM is trying to boot from, 10.245.208.30 is set as the primary
> > controller, and 10.245.208.31 is set as the secondary.
> >
> > Primary postgres is on 10.245.208.30, it's being replicated to backup
> > postgres on 10.245.208.31. It has a VIP at 10.245.208.34.
> >
> > We don't hit this every time; on this deployment only one machine out
> > of about 30 hit this.
> >
> > We've also seen this on single node MAAS setups - non HA. So, it's
> > not an HA specific issue.
> >
> > I've attached logs from the maas servers.
> >
> > To manage notifications about this bug go to:
> > https://bugs.launchpad.net/maas/+bug/1707999/+subscriptions
> >
>
> Launchpad-Notification-Type: bug
> Launchpad-Bug: product=maas; milestone=2.3.0beta3; status=Invalid;
> importance=Critical; <email address hidden>;
> Launchpad-Bug: product=maas; productseries=2.2; milestone=2.2.3;
> status=Invalid; importance=Critical; assignee=None;
> Launchpad-Bug: distribution=ubuntu; sourcepackage=ipxe; component=main;
> status=Fix Released; importance=Critical;
> assignee=christian.ehrhardt@canonical.com;
> Launchpad-Bug: distribution=ubuntu; distroseries=xenial;
> sourcepackage=ipxe; component=main; status=Fix Committed;
> importance=Critical; assignee=None;
> Launchpad-Bug: distribution=ubuntu; distroseries=zesty;
> sourcepackage=ipxe; component=main; status=Fix Committed;
> importance=Critical; assignee=None;
> Launchpad-Bug: distribution=ubuntu; distroseries=artful;
> sourcepackage=ipxe; component=main; status=Fix Released;
> importance=Undecided; assignee=None;
> Launchpad-Bug: distribution=ubuntu; distroseries=bionic;
> sourcepackage=ipxe; component=main; status=Fix Released;
> importance=Undecided; assignee=None;
> Launchpad-Bug-Tags: cdo-qa cdo-qa-blocker cdo-release-blocker
> foundations-engine internal patch verification-done-xenial
> verification-failed-zesty verification-needed
> Launchpad-Bug-Information-Type: Public
> Launchpad-Bug-Private: no
> Launchpad-Bug-Security-Vulnerability: no
> Launchpad-Bug-Commenters: andreserl blake-rouse brian-murray cgregan
> davidpbritton ivoks janitor jason-hobbs kiko paelzer vorlon
> Launchpad-Bug-Reporter: Jason Hobbs (jason-hobbs)
> Launchpad-Bug-Modifier: Jason Hobbs (jason-hobbs)
> Launchpad-Message-Rationale: Subscriber (MAAS)
> Launchpad-Message-For: andreserl
>
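
For anyone reading the quoted [Impact] and [Regression Potential] sections without the patch at hand, the intended behaviour can be sketched roughly as below. This is only an illustration in Python, not the iPXE code (which is C): the State enum, next_state() and run() are made-up names, and the message types are plain strings rather than real DHCP packets. The one point it tries to show is that a NAK received while waiting on a request sends the client back to discovery instead of leaving it stuck.

```python
from enum import Enum, auto


class State(Enum):
    """Hypothetical, simplified DHCP client states (not iPXE's own)."""
    DISCOVER = auto()  # broadcasting discovery, waiting for an offer
    REQUEST = auto()   # request sent for one offer, waiting for ACK or NAK
    BOUND = auto()     # lease obtained


def next_state(state, message):
    """Return the client state after receiving one DHCP message type."""
    if state is State.DISCOVER and message == "OFFER":
        return State.REQUEST
    if state is State.REQUEST and message == "ACK":
        return State.BOUND
    if state is State.REQUEST and message == "NAK":
        # Behaviour described in the bug: restart discovery rather than stall.
        return State.DISCOVER
    # Anything else (for example a second OFFER while requesting) is ignored.
    return state


def run(messages):
    """Feed a sequence of message types through the state machine."""
    state = State.DISCOVER
    for message in messages:
        state = next_state(state, message)
    return state
```

With two servers offering and one of them refusing the request, a sequence such as run(["OFFER", "OFFER", "NAK", "OFFER", "ACK"]) ends in State.BOUND in this sketch, whereas a client that ignored the NAK would stay in State.REQUEST.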

--
Andres Rodriguez (RoAkSoAx)
Ubuntu Server Developer
MSc. Telecom & Networking
Systems Engineer