noble: needrestart triggering SIGTERM of cloud-final.service preventing apt packages from being installed when cloud-init is also being upgraded

Bug #2059337 reported by Chad Smith
This bug affects 4 people
Affects                 Status         Importance  Assigned to  Milestone
cloud-init (Ubuntu)     Invalid        Undecided   Unassigned
  Noble                 Won't Fix      Undecided   Unassigned
needrestart (Ubuntu)    Fix Released   Undecided   Unassigned
  Noble                 Fix Released   Undecided   Unassigned

Bug Description

Recent downstream ubuntu-specific changes in needrestart version 3.6-7ubuntu1 [1] set Ubuntu into autorestart mode when non-interactive apt-get dist-upgrade is being performed.

This causes an acute issue for cloud-init when #cloud-config user-data tries to perform apt-get dist-upgrade and package installs via user-data[2] like the following:

#cloud-config
package_update: true
package_upgrade: true
packages: [sl]

Since cloud-init runs apt-get dist-upgrade from cloud-final.service in non-interactive mode, Ubuntu's patched needrestart appears to set the default "opt_r" mode to "a" (automatic). This causes problems when the cloud-init package itself is being upgraded by dist-upgrade: needrestart collects the currently running cloud-final.service and determines that it is a target for automatic restart.

The immediate SIGTERM of cloud-final.service prevents cloud-init from completing any of the remaining config modules in the cloud-final boot stage, notably, the additional package installs requested by the `packages:` directive in user data.

Given that cloud-final.service is a oneshot service that can itself spawn apt-get dist-upgrade, I'd propose that, minimally, Ubuntu initially carries a downstream patch to skip automated restart of cloud-final.service[3]. I don't see an easier way to inject additional skip regexes into /etc/needrestart/needrestart.conf via /etc/needrestart/conf.d/cloud-init.conf that would allow us to augment the list of skip regexes on a per-package basis.

References:

[1] Ubuntu needrestart setting automatic restart mode when non-interactive on Ubuntu https://git.launchpad.net/ubuntu/+source/needrestart/tree/debian/patches/ubuntu-mode.patch?id=0ee54f20335d49ba3c330e5f8328e88a8cc3f99b#n72
[2] Upstream cloud-init bug: cloud-final.service getting sigterm before installing packages https://github.com/canonical/cloud-init/issues/5109
[3] downstream packaging proposal: https://code.launchpad.net/~chad.smith/ubuntu/+source/needrestart/+git/needrestart/+merge/463236

David Myers (demyers) wrote:

I believe the patch referenced above causes other bad behaviors.

Specifically, it causes systemd-networkd to be restarted without any sort of prompt whenever a library it links with receives a security update. In my experience restarting systemd-networkd can break active WireGuard tunnels and can cause chronyd to stop polling IPv6 servers.

I think the change at issue is adding the flags "-m u" to apt-pinvoke in /etc/apt/apt.conf.d/99needrestart, which also means needrestart now ignores a setting of "NEEDRESTART_MODE=l" in the environment when run from apt.

I've started to add systemd-networkd to my needrestart ignore list, but perhaps that should be a default setting, as it is for NetworkManager.

I'm testing with Noble in a LXD VM with an image from the ubuntu-daily repository.

Thanks.

Launchpad Janitor (janitor) wrote :

Status changed to 'Confirmed' because the bug affects multiple users.

Changed in needrestart (Ubuntu):
status: New → Confirmed
Chad Smith (chad.smith) wrote :

Thanks @demyers for the context here. I've just proposed the following for cloud-final.service, which seems to work on my side when testing in LXD:

https://code.launchpad.net/~chad.smith/ubuntu/+source/needrestart/+git/needrestart/+merge/463236

description: updated
Changed in cloud-init (Ubuntu):
status: New → Confirmed
Chad Smith (chad.smith) wrote :

> think the change at issue is adding the flags "-m u" to apt-pinvoke in /etc/apt/apt.conf.d/99needrestart, which also means needrestart now ignores a setting of "NEEDRESTART_MODE=l" in the environment when run from apt.

@demyers I saw this behavior in initial testing too and didn't understand what I was seeing. I found that I had to pass NEEDRESTART_SUSPEND=yes to apt-get dist-upgrade to actually avoid any needrestart behavior.
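
A minimal sketch of that workaround for anyone driving the upgrade from a script. It only shows how the environment might be assembled; the helper names and the DEBIAN_FRONTEND and --assume-yes settings are assumptions added for a non-interactive run and were not part of the testing described above.

```python
import os
import subprocess


def build_upgrade_invocation(base_env=None):
    """Return (argv, env) for a dist-upgrade with needrestart's apt hook suspended.

    Hypothetical helper: only NEEDRESTART_SUSPEND=yes comes from this thread.
    """
    env = dict(os.environ if base_env is None else base_env)
    # Assumption: the upgrade runs unattended, as it does under cloud-init.
    env["DEBIAN_FRONTEND"] = "noninteractive"
    # The setting reported above to suppress needrestart during the upgrade.
    env["NEEDRESTART_SUSPEND"] = "yes"
    argv = ["apt-get", "--assume-yes", "dist-upgrade"]
    return argv, env


def run_upgrade():
    """Run the upgrade (needs root); raises CalledProcessError on failure."""
    argv, env = build_upgrade_invocation()
    return subprocess.run(argv, env=env, check=True)
```

Nothing is executed until run_upgrade() is called, so the environment can be inspected first.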

David Myers (demyers) wrote :

@chad.smith I used to use "NEEDRESTART_MODE=l" so needrestart would print a list of suggested restarts and I could then run it interactively if I needed to, but now I don't get a choice. needrestart always restarts without asking.

So for now I put a bunch of processes to protect in a file in /etc/needrestart/conf.d, as part of my cloud-init user-data of course. :-)

Chad Smith (chad.smith) wrote:

:) Thanks again, David, for the context on the immediate workarounds you are using in the meantime, while the needrestart folks determine the best course of action for Ubuntu.

While I think one solution could be appending all critical or oneshot services of the world to upstream's ex/needrestart.conf, just as my linked merge request suggested for the Ubuntu downstream, it's probably better to have individual packages deliver their own supplemental /etc/needrestart/conf.d/<pkg>.conf, as you alluded to.

I had a hard time understanding how to use /etc/needrestart/conf.d, but happened upon https://github.com/liske/needrestart/issues/184#issuecomment-650402676 which explained pushing another element onto the default config's $nrconf{blacklist_interp} array.

It was a small step further for cloud-init to deliver a file to /etc/needrestart/conf.d/cloud-init.conf with the content:

# Add a needrestart skip for cloud-final.service across APT upgrade operations.
# This avoids SIGTERMs disrupting cloud-init before it is able to install
# packages and/or setup PPAs. LP: #2059337
$nrconf{override_rc}->{qr(^cloud-final\.service$)} = 0;

I just tested this, and it ensures cloud-final.service is skipped instead of auto-restarted any time it happens to be running when needrestart is invoked. That lets Ubuntu's needrestart default behavior get sorted out separately from the cloud-init specific issue, and cloud-init doesn't necessarily have to wait for either a downstream Ubuntu fix for needrestart auto-restarting or changes to upstream's default ex/needrestart.conf.
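
To make the effect of that override concrete, here is a rough Python model of a regex-keyed override table. It is only an illustration of the matching behavior, assuming a value of 0 means "do not select for restart" and that unmatched units fall back to a default; OVERRIDE_RC and selected_for_restart are hypothetical names, not needrestart's actual (Perl) implementation.

```python
import re

# Hypothetical mirror of the $nrconf{override_rc} entry above:
# regex -> 0 (do not restart) or 1 (restart).
OVERRIDE_RC = {
    r"^cloud-final\.service$": 0,
}


def selected_for_restart(unit, override_rc=OVERRIDE_RC, default=True):
    """Return whether `unit` would be selected for restart in this model.

    The first matching pattern decides; units matching nothing use `default`.
    """
    for pattern, value in override_rc.items():
        if re.search(pattern, unit):
            return bool(value)
    return default
```

Because the pattern is anchored with ^ and $ and the dot is escaped, only the exact unit name cloud-final.service is skipped; other units keep their default selection.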

Brett Holman (holmanb) wrote :

Just to recap some thoughts from the conversation cloud-init devs just had:

- Automatically killing non-interactive processes may have far-reaching consequences and may leave many packages besides cloud-init exposed. Any service that calls `apt upgrade` etc. non-interactively may get killed. How confident are we that we'll catch them all before the noble release?
- One possible workaround in needrestart for this issue would be to automatically include its own parent PID in the list of ignored processes.
- If we do decide to keep the needrestart patch as-is, cloud-init should really include all of its own services in the ignore list: cloud-init is not a long-running process and never wants to be restarted.
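
The parent-PID idea in the second point could be sketched roughly as follows. This is a toy model, not how needrestart works today: it assumes a pid-to-parent-pid mapping is already available (for example, gathered from /proc), it extends "parent PID" to the whole ancestor chain, and both function names are hypothetical.

```python
def ancestors(pid, parent_of):
    """Return the chain of ancestor PIDs of `pid`, nearest parent first.

    `parent_of` maps pid -> parent pid; the walk stops at PID 0, at an
    unknown PID, or if a cycle is detected.
    """
    chain = []
    seen = {pid}
    current = parent_of.get(pid)
    while current is not None and current != 0 and current not in seen:
        chain.append(current)
        seen.add(current)
        current = parent_of.get(current)
    return chain


def filter_restart_candidates(candidates, own_pid, parent_of):
    """Drop candidate units whose main PID is an ancestor of `own_pid`.

    `candidates` maps unit name -> main PID of that unit.
    """
    protected = set(ancestors(own_pid, parent_of))
    return {unit: pid for unit, pid in candidates.items() if pid not in protected}
```

With a process tree of cloud-final.service (PID 200) -> apt-get (300) -> needrestart (400), calling filter_restart_candidates with own_pid=400 would drop cloud-final.service from the candidates while leaving unrelated units untouched.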

Steve Langasek (vorlon)
tags: added: foundations-todo
Chad Smith (chad.smith)
description: updated
Chad Smith (chad.smith) wrote :

This came up in the foundations leadership sync today. I believe we determined that the best course of action at the moment is to pursue an Ubuntu downstream needrestart patch that registers cloud-init systemd services as skipped during automatic restarts when apt commands are run in non-interactive mode, since restarting them would break the rest of the cloud-init configuration being performed by that boot stage.

We would also prefer to avoid each package maintainer having to extend $nrconf{override_rc} for their systemd services, as those behaviors would be better advertised and maintained in the Ubuntu downstream needrestart.conf as a single source of truth for "automated restart skips".

As a result, I've closed https://github.com/canonical/cloud-init/pull/5111 as we will not package our own restart skip extensions/overrides in the cloud-init deb.

Launchpad Janitor (janitor) wrote :

This bug was fixed in the package needrestart - 3.6-7ubuntu4

---------------
needrestart (3.6-7ubuntu4) noble; urgency=medium

  * d/p/ubuntu-avoid-restart-cloud-final.patch:
    - avoid automatic restart of cloud-init systemd oneshot services when
      cloud-init invokes apt-get dist-upgrade due to user-data (LP: #2059337)

 -- Chad Smith <email address hidden> Wed, 27 Mar 2024 16:51:58 -0600

Changed in needrestart (Ubuntu Noble):
status: Confirmed → Fix Released
Chad Smith (chad.smith) wrote :

Closing the cloud-init task for this bug, as cloud images created after 20240408 contain an appropriate needrestart that defers the restart of cloud-final.service during a cloud-init package upgrade.

Changed in cloud-init (Ubuntu Noble):
status: Confirmed → Won't Fix
Chad Smith (chad.smith) wrote :

Marking Won't Fix as the resolution was in needrestart, not cloud-init.

Chad Smith (chad.smith)
Changed in cloud-init (Ubuntu):
status: Confirmed → Invalid