Comment 31 for bug 1519331

Steve Langasek (vorlon) wrote : Re: [Bug 1519331] Re: Postfix cannot resolve DNS if network was unavailable when it was started, such as on a laptop

Hi Scott,

Adding the Debian bug on Cc:.

On Mon, May 15, 2017 at 10:12:41PM -0000, Scott Kitterman wrote:
> On Monday, May 15, 2017 08:49:42 PM you wrote:
> > On Mon, May 15, 2017 at 06:45:21PM -0000, Scott Kitterman wrote:
> > > I'm getting close to uploading a fix for this to Debian, so you might wait
> > > for that.

> > It looks like you've implemented this using the network-online.target
> > approach, which as you mentioned might not DTRT for the localhost-only use
> > case. Did you decide that this is negligible?

> That was the advice I got from the Debian systemd maintainers (that the impact
> would be negligible).

Ok, the analysis on the Debian bug looks rather shallow to me:

> - The penalty of pulling in network-online.target is simply that for the
> local case postfix is started a bit later than necessary during boot.

There's no reason that this *should* be true for an intermittently-online
machine. The network-online.target is specifically defined so that services
are not started until the network connection is actually up; or put another
way, if a system is booted and can't get a network connection, those
services are not started. We don't just start them at some random point;
that would defeat the purpose.

So the case I described is still not handled here - a postfix setup that has
no dependency on the network interface bring-up (for binding), such as the
default config, will nevertheless be blocked from starting until there is a
network, including in cases where this is much more than a mere startup
timing distinction.
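
For illustration, the ordering at issue amounts to something like the
[Unit] fragment below. This is only a sketch: the actual contents of the
Debian unit are not quoted in this thread, and the Wants= line is an
assumption about how network-online.target gets pulled in.

```ini
# Sketch of the dependencies under discussion, not the real Debian unit.
[Unit]
# Assumption: the unit pulls the target in as well as ordering after it.
Wants=network-online.target
# postfix is not started until the network is considered online, even
# when the configuration only listens on localhost.
After=network-online.target
# Ordering only; this has an effect only if something pulls in and
# orders itself before nss-lookup.target.
After=nss-lookup.target
```

With these lines, a default localhost-only configuration waits on the
network just like one that binds to a specific address.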

Now, you can't have a single unit config that simultaneously meets Russell's
request from the original bug report, to defer startup until a given bind
address is available, and the case I describe above, where you care about
postfix running even when the network is not up. I would argue that the
case I outlined is more important to get right out of the box, since it
requires no changes to the default postfix config whereas Russell's use case
does. But it's your decision as maintainer which to support as the default;
I just won't SRU the network-online.target change into any Ubuntu stable
releases because it would introduce a regression.

> > For the case of a server which always has a network connection, this works
> > fine. For the case of a standalone system with no configured network
> > connection, it probably also works fine. But for the case of e.g. a laptop
> > that sometimes has network and sometimes doesn't, if the system comes up
> > without network, postfix will not start and you will not have local
> > delivery. Is this the behavior you expect with your change?

> I tested this and if you're using NetworkManager at least there's some magic
> that happens which causes systemd to restart postfix once the network is
> available. Part of the reason I was having so much trouble replicating
> problems others were seeing was getting NM to quit 'helping' as the test
> system I was using also has a desktop installed.

Do you really mean that it restarts postfix when the network is available,
or is it starting postfix for the first time? The expected behavior is that
postfix doesn't start at all until the network is up. This is managed via
/lib/systemd/system/NetworkManager-wait-online.service in both Ubuntu and
Debian.

(FWIW in the process of confirming this, I have identified a bug at least in
Ubuntu, related to LP: #1569649, whereby NetworkManager-wait-online is not
enabled on some systems that have been continuously upgraded from Ubuntu
pre-releases. I'm working on fixing this now.)

> > Ultimately I want to SRU this into affected stable Ubuntu releases, so would
> > want a regression-free change.
> >
> > I see you are also setting After=nss-lookup.target. For the bug reported
> > here - which is about DNS resolution specifically - would it not suffice to
> > have postfix declare this After=nss-lookup.target, and for systemd-resolved
> > to be sequenced before it?

> According to the Debian systemd people, the systemd-resolved is superfluous.
> It's nss-lookup.target that I wanted all along.

Well, in theory yes, but in practice I see nothing - including
systemd-resolved - that's wired up to this target in Debian or Ubuntu. The
nss-lookup target does nothing:

$ systemctl status nss-lookup.target
● nss-lookup.target - Host and Network Name Lookups
   Loaded: loaded (/lib/systemd/system/nss-lookup.target; static; vendor preset:
   Active: inactive (dead)
     Docs: man:systemd.special(7)
$ journalctl -u nss-lookup.target
-- No entries --
$

Oh, if you happen to also install bind, then you get this target. But
that's not the common case.

So your current unit deps work *only* because you are also depending on
network-online.target, and resolvconf handling happens before
network-online. The After=nss-lookup.target is a complete no-op. I think
we should fix that so that it's *not* a no-op, but that means touching a few
more moving pieces.
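
One way this could be wired up, sketched as a drop-in for whichever
service provides name resolution. The drop-in path and the choice of
systemd-resolved are illustrative assumptions, not a tested change;
systemd.special(7) describes nss-lookup.target as a synchronization point
that lookup providers pull in and order themselves before.

```ini
# Hypothetical drop-in, e.g.
# /etc/systemd/system/systemd-resolved.service.d/nss-lookup.conf
[Unit]
# Pull the target into the boot transaction...
Wants=nss-lookup.target
# ...and make sure the resolver is started before the target is reached,
# so that units with After=nss-lookup.target are ordered after it.
Before=nss-lookup.target
```

With something along these lines in place, After=nss-lookup.target in the
postfix unit would actually sequence postfix after the resolver, with no
need for network-online.target for that purpose.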

--
Steve Langasek                   Give me a lever long enough and a Free OS
Debian Developer                   to set it on, and I can move the world.
Ubuntu Developer                                    http://www.debian.org/
<email address hidden>                              <email address hidden>