MAAS does not detect properly if Ubuntu is using upstart/systemd

Bug #1732703 reported by Victor Tapia on 2017-11-16
This bug affects 3 people
Affects             Importance   Assigned to
MAAS                Undecided    Unassigned
  1.9               Critical     Unassigned
maas (Ubuntu)       Critical     Unassigned
  Trusty            Critical     Unassigned
snapd (Ubuntu)      Undecided    Unassigned
  Trusty            Undecided    Unassigned
systemd (Ubuntu)    Undecided    Unassigned
  Trusty            Critical     Unassigned

Bug Description

[impact]
Since Trusty uses upstart by default, MAAS manages its services with upstart. However, when a user installs systemd (even if it is not used as the init system), MAAS detects systemd installed and tries to manage its services via systemd. This obviously creates issues and prevents MAAS from working.

[Test Case]
1. Install & configure MAAS
2. Add machines
3. Install systemd
4. MAAS will fail to manage machines

[Regression potential]
Minimal. This just ensures that upstart is detected correctly even if systemd is installed (but not used).

[Original bug report]
Trusty uses upstart by default, and installing snapd (e.g. for livepatch purposes), pulls systemd too. In this setup, upstart is _not_ replaced by systemd, but MAAS "detects" systemd as init because of the existence of /run/systemd/system:

@src/provisioningserver/utils/__init__.py:505

SYSTEMD_RUN_PATH = '/run/systemd/system'

def get_init_system():
    """Returns 'upstart' or 'systemd'."""
    if os.path.exists(SYSTEMD_RUN_PATH):
        return 'systemd'
    else:
        return 'upstart'

One possible solution would be to check if /sbin/init is a symlink pointing to /lib/systemd/systemd:

def get_init_system():
    """Returns 'upstart' or 'systemd'."""
    initpath = os.readlink("/sbin/init")
    if (initpath == "/lib/systemd/systemd"):
        return 'systemd'
    else:
        return 'upstart'

Other affected parts of the code are the postinst files for maas-proxy and maas-dhcp (debian/maas-proxy.postinst and debian/maas-dhcp.postinst), which throw an error if MAAS is installed after systemd on Trusty.

Changed in maas:
status: New → Incomplete
status: Incomplete → Won't Fix
Andres Rodriguez (andreserl) wrote :

Hi Victor,

Can you clarify what issues you are having? They are not clear from the bug report.

That said, MAAS was never enabled to work on a system with both systemd + upstart, where upstart would remain the init system. Ubuntu, at the time, was not even fully prepared to use systemd. This is a grey area where it is difficult to fix something that is not supported, with a high regression potential.

Victor Tapia (vtapia) wrote :

Hi Andres,

At the moment, I can see the service monitor complain:

Nov 16 12:36:48 trusty-maas maas.dhcp: [ERROR] DHCPv4 server failed to restart (for network interfaces eth0): Unable to parse the output from systemd for service 'maas-dhcpd'.
Nov 16 12:36:52 trusty-maas maas.boot_image_download_service: [ERROR] Failed to download images: Unable to parse the output from systemd for service 'tgt'.
Nov 16 12:38:38 trusty-maas maas.service_monitor: [ERROR] While monitoring service 'maas-dhcpd' an error was encountered: Unable to parse the output from systemd for service 'maas-dhcpd'.
Nov 16 12:38:38 trusty-maas maas.service_monitor: [ERROR] While monitoring service 'maas-dhcpd6' an error was encountered: Unable to parse the output from systemd for service 'maas-dhcpd6'.
Nov 16 12:38:38 trusty-maas maas.service_monitor: [ERROR] While monitoring service 'tgt' an error was encountered: Unable to parse the output from systemd for service 'tgt'.

I'm currently building a reproducer to see if commissioning/deploying is somehow affected.

Victor Tapia (vtapia) wrote :

Trying to enlist a machine when both systemd and upstart are installed in trusty throws this error:

==> /var/log/maas/clusterd.log <==
2017-11-21 11:42:45+0000 [-] Unhandled failure dispatching AMP command. This is probably a bug. Please ensure that this error is handled within application code or declared in the signature of the RemoveHostMaps command. [trusty-maas:pid=844:cmd=RemoveHostMaps:ask=25]
        Traceback (most recent call last):
          File "/usr/lib/python2.7/threading.py", line 783, in __bootstrap
            self.__bootstrap_inner()
          File "/usr/lib/python2.7/threading.py", line 810, in __bootstrap_inner
            self.run()
          File "/usr/lib/python2.7/threading.py", line 763, in run
            self.__target(*self.__args, **self.__kwargs)
        --- <exception caught here> ---
          File "/usr/lib/python2.7/dist-packages/twisted/python/threadpool.py", line 191, in _worker
            result = context.call(ctx, function, *args, **kwargs)
          File "/usr/lib/python2.7/dist-packages/twisted/python/context.py", line 118, in callWithContext
            return self.currentContext().callWithContext(ctx, func, *args, **kw)
          File "/usr/lib/python2.7/dist-packages/twisted/python/context.py", line 81, in callWithContext
            return func(*args,**kw)
          File "/usr/lib/python2.7/dist-packages/provisioningserver/utils/twisted.py", line 200, in wrapper
            return func(*args, **kwargs)
          File "/usr/lib/python2.7/dist-packages/provisioningserver/rpc/dhcp.py", line 237, in remove_host_maps
            if not _is_dhcpv4_managed_and_active(CannotRemoveHostMap):
          File "/usr/lib/python2.7/dist-packages/provisioningserver/rpc/dhcp.py", line 164, in _is_dhcpv4_managed_and_active
            if service_monitor.get_service_state("dhcp4") != SERVICE_STATE.ON:
          File "/usr/lib/python2.7/dist-packages/provisioningserver/utils/twisted.py", line 200, in wrapper
            return func(*args, **kwargs)
          File "/usr/lib/python2.7/dist-packages/provisioningserver/service_monitor.py", line 120, in get_service_state
            return self._get_service_status(service)[0]
          File "/usr/lib/python2.7/dist-packages/provisioningserver/service_monitor.py", line 216, in _get_service_status
            service.service_name)
          File "/usr/lib/python2.7/dist-packages/provisioningserver/service_monitor.py", line 265, in _get_systemd_service_status
            service_name))
        provisioningserver.service_monitor.ServiceParsingError: Unable to parse the output from systemd for service 'maas-dhcpd'.

==> /var/log/maas/regiond.log <==
2017-11-21 11:42:45 [-] Error on request (63) node.create: ('UNHANDLED', 'Unknown Error [trusty-maas:pid=844:cmd=RemoveHostMaps:ask=25]')
        Traceback (most recent call last):
        Failure: twisted.protocols.amp.UnhandledCommand: ('UNHANDLED', 'Unknown Error [trusty-maas:pid=844:cmd=RemoveHostMaps:ask=25]')

A similar error is shown when trying to commission (same ServiceParsingError for maas-dhcpd). Removing systemd, or patching get_init_system() to detect the proper service file, makes the issue disappear.

Victor Tapia (vtapia) wrote :

Commissioning logs: https://pastebin.canonical.com/203632/

The previous patch gave an error when only upstart was running. This one should cover that case too (thanks ivanhitos):

def get_init_system():
    """Returns 'upstart' or 'systemd'."""
    if os.path.islink("/sbin/init"):
        initpath = os.readlink("/sbin/init")
        if initpath == "/lib/systemd/systemd":
            return 'systemd'
        else:
            return 'upstart'
    else:
        return 'upstart'
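
For illustration, the guarded check above can be exercised outside MAAS. This is a minimal sketch written for Python 3, not the MAAS source: the init_path parameter, the SYSTEMD_BINARY constant and the check_cases() helper are hypothetical names added so that both layouts can be tried in a temporary directory.

```python
import os
import tempfile

SYSTEMD_BINARY = "/lib/systemd/systemd"


def get_init_system(init_path="/sbin/init"):
    """Return 'systemd' only when init_path is a symlink to the systemd binary.

    init_path is a parameter purely so the check can be tried on test files;
    the patch in this comment always inspects /sbin/init.
    """
    # os.readlink() raises OSError on a regular file, hence the islink() guard.
    if os.path.islink(init_path) and os.readlink(init_path) == SYSTEMD_BINARY:
        return "systemd"
    return "upstart"


def check_cases():
    """Run the check against a regular file and against a systemd symlink."""
    with tempfile.TemporaryDirectory() as tmp:
        regular = os.path.join(tmp, "init-regular")
        open(regular, "w").close()        # stands in for a non-symlink /sbin/init
        link = os.path.join(tmp, "init-link")
        os.symlink(SYSTEMD_BINARY, link)  # stands in for /sbin/init -> systemd
        return get_init_system(regular), get_init_system(link)
```

Calling check_cases() should report 'upstart' for the regular file and 'systemd' for the symlink. The first case is the one in which an unguarded os.readlink() call raises OSError.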

Drew Freiberger (afreiberger) wrote :

I can confirm that patch in #4 works on MAAS 1.9.5 on trusty.

description: updated
Changed in maas (Ubuntu):
importance: Undecided → Critical
Robie Basak (racb) wrote :

Since snapd pulling in systemd was introduced in an SRU, this is a regression in a stable release caused by an SRU and therefore should be considered regression-update.

tags: added: regression-update
Robie Basak (racb) wrote :

Actually, if snapd isn't installed by default on Trusty, perhaps we shouldn't treat it as a regression caused by snapd, as the user is explicitly pulling in snapd and therefore systemd rather than it being imposed automatically by the SRU. I'll leave it for others to decide this though.

Robie Basak (racb) wrote :

15:15 <rbasak> roaksoax: in bug 1732703, I'm not sure that test for systemd is reliable. For example immediately after an upgrade from Trusty to Xenial, before a reboot, /sbin/init will be systemd but the running init system will be upstart.
15:15 <ubottu> bug 1732703 in MAAS 1.9 "MAAS does not detect properly if Ubuntu is using upstart/systemd" [Critical,In progress] https://launchpad.net/bugs/1732703
15:16 <rbasak> roaksoax: AFAICT, the "correct" way to test for systemd is its process name or something, but I can't find a good reference.
15:16 <rbasak> roaksoax: so I'm not really sure what is correct. I'll happily accept whatever a systemd-type person here says.
15:16 <cjwatson> There should be something in one of the init script layers
15:17 <cjwatson> /lib/lsb/init-functions.d/40-systemd
15:17 <cjwatson> if [ -d /run/systemd/system ]; then
15:17 <rbasak> Ah
15:17 <cjwatson> I believe that's the canonical method
15:17 <rbasak> I recalled something about something in /run but was unable to find it.
15:18 <rbasak> Thanks
15:18 <cjwatson> rbasak: Found the documentation for that - sd_booted(3)
15:19 <cjwatson> under NOTES
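
The process-name idea raised at 15:16 could look roughly like the sketch below. It is only an illustration of that suggestion, not something proposed for MAAS in this bug. It assumes a Linux /proc filesystem and assumes that systemd running as PID 1 reports the name 'systemd' in /proc/1/comm; read_pid1_name() and init_system_from_pid1() are hypothetical helper names.

```python
import os


def read_pid1_name(proc_root="/proc"):
    """Return the command name of PID 1, or None if it cannot be read."""
    try:
        with open(os.path.join(proc_root, "1", "comm")) as handle:
            return handle.read().strip()
    except OSError:
        return None


def init_system_from_pid1(name):
    """Map a PID 1 command name onto 'systemd' or 'upstart'.

    Assumption: anything other than 'systemd' is treated as upstart, which
    only makes sense where those two are the only candidates.
    """
    return "systemd" if name == "systemd" else "upstart"
```

Unlike a test on /sbin/init, this looks at the running process, so it would not be fooled by the upgrade-before-reboot case mentioned at 15:15. The check on /run/systemd/system remains the method documented in sd_booted(3).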

Robie Basak (racb) on 2017-12-13
Changed in systemd (Ubuntu Trusty):
importance: Undecided → Critical
Steve Langasek (vorlon) wrote :

snapd will not function without systemd as a deputy init. Nothing to fix in snapd.

Changed in snapd (Ubuntu):
status: New → Won't Fix
Robie Basak (racb) wrote :

Further extensive discussion at https://irclogs.ubuntu.com/2017/12/13/%23ubuntu-devel.html#t15:15

Our current belief is that the bug is in the systemd package, introduced by the SRU tracked in bug 1656280. That update broke our standard test for determining if we're on a systemd system by creating /run/systemd/system when the systemd package is installed even if pid 1 is upstart.

It looks like this regressed eleven months ago but was only reported last month, so I don't think the MAAS fix is critical. I would prefer to get this fixed properly in systemd because it may have a wider impact than just MAAS.

It is correct, in general, to check for /run/systemd/system to detect if systemd manages pid 1.

Imho deputy systemd (used by snapd, on trusty, with xenial-lts kernel) should not have been creating that; however, I fear that without that directory snapd and snaps therein may get confused (in classic confinement).

It is true that trusty only uses upstart as pid1 with no other options; and any system systemd jobs are inert (deputy systemd only looks for deputy things).

Note that although xenial ships both upstart & systemd; only systemd is supported as pid1 on all form-factors. (upstart as pid1 is only supported on xenial ubuntu touch product, which is now end-of-life).

Possibly we could create one more directory, e.g. /run/dsystemd/system or some such, which maas can check for to distinguish "systemd or deputy-systemd".

However, checking /sbin/init, as done in the proposed merge proposal, is adequate for maas needs and should yield correct results as far as I can tell, since on xenial either upstart-sysv or systemd-sysv may provide /sbin/init, with systemd-sysv being the default everywhere (upstart-sysv on xenial is for ubuntu touch only).

Ideally, I do not want to touch deputy systemd uploads.

This is quite unique to maas, as no other software is getting backports with explicit features that enable systemd support on trusty. E.g. cloud-init backports are done in such a way that, when compiled on trusty, no systemd support is installed or available.

You may want to choose to make the init-system selection code inert, as a compile-time / dependency option, such that on trusty there is no dynamic selection of things at all.

Robie Basak (racb) wrote :

I've deferred my decision as to whether to accept what's currently in the queue on this thread: https://lists.ubuntu.com/archives/ubuntu-devel/2017-December/040093.html

Andres Rodriguez (andreserl) wrote :

FWIW, this is currently affecting customers who are running MAAS and require livepatch.

Comments #11 and #12 above confirm that the patch is enough for MAAS's needs. Whichever way MAAS decides to check for systemd is up to MAAS, and that is not a reason to block an SRU provided that it does not impact any other piece of software. That said, this patch does not introduce a regression to MAAS nor any other software.

Lastly, this patch is *only* for 1.9 as this code path is only available in Trusty, so upgrades to later Ubuntu releases will result in using a newer version of MAAS that doesn't rely on this code path.

That said, there's no supported way in Ubuntu Trusty to symlink /sbin/init -> /lib/systemd/systemd, given that systemd-sysv is *only* available in Xenial, and again, upgrades to Xenial will result in MAAS not using this code path at all.

On Wed, Dec 13, 2017 at 10:34:25PM -0000, Andres Rodriguez wrote:
> FWIW, this is currently affecting customers who are running MAAS and
> require livepatch.

It's been affecting users since January, no? Why the sudden urgency?
What difference will a week or two make?

> Comments #11 and #12 above confirm that the patch is enough for the MAAS
> needs. Whichever way MAAS decides to check for systemd is up to MAAS and
> that is not a reason to block an SRU provided that it does not impact
> any other piece of software. That said, this patch does not
> introduce a regression to MAAS nor any other software.

I think that's quite a brave claim to make. I'm sure "does not
introduce a regression" was a claim that might have been made in the
systemd SRU that regressed this too, and yet here we are.

> Lastly, this patch is *only* for 1.9 as this code path is only available
> in Trusty, so upgrades to later Ubuntu releases will yield on using a
> newer version of MAAS that doesn't rely on this code path.

If we did decide to SRU an emergency fix as a stopgap for MAAS' use
case, and it's for Trusty only, then why have a test at all? Can we just
return 'upstart' without a test?

To be clear, I'm not demanding or even requesting anything specific
right now. I don't feel that a case for urgency has yet been made, given
the currently known regression timeline. In the meantime I think it's
worth understanding how we want to fix this properly, because requiring
multiple SRUs while we swing back and forth is bad for everyone, and
equally I don't want to see us locked into a suboptimal solution.

If a case is accepted on the basis of urgency, or another SRU team
member disagrees with my assessment, I wouldn't want to block that. Feel
free to land what you think is appropriate.

Steve Langasek (vorlon) wrote :

I would accept a version of this SRU that hard-codes the choice of upstart as the init system on 14.04, because that is the only init system supported in that version of Ubuntu.
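
One way to picture "hard-code upstart on 14.04" as a runtime check is sketched below. This is purely illustrative and is not the uploaded patch, whose contents are not shown in this report. It assumes the release is identified from os-release style text (ID and VERSION_ID keys), and parse_os_release() and the os_release_text parameter are hypothetical names. The fallback to 'systemd' for every other release is likewise an assumption of the sketch.

```python
def parse_os_release(text):
    """Parse os-release style KEY=value lines into a dict, stripping quotes."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key] = value.strip().strip("\"'")
    return values


def get_init_system(os_release_text):
    """Return 'upstart' unconditionally on Ubuntu 14.04, 'systemd' otherwise."""
    info = parse_os_release(os_release_text)
    if info.get("ID") == "ubuntu" and info.get("VERSION_ID") == "14.04":
        # No inspection of /run or /sbin/init: upstart is the only supported
        # init system on this release, so the answer is fixed.
        return "upstart"
    return "systemd"
```

Because a package built for Trusty only ever runs on 14.04, the same effect can be had by returning 'upstart' with no test at all, which is the question raised earlier in the thread.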

We can discuss further whether the deputy-init systemd package in trusty needs further changes to not fall afoul of common init system detection techniques, but I don't think that should block fixing MAAS in this scenario.

Currently, all the proposed methods of detecting the init system have corner cases where they break. Most of these corner cases are negligible for maas, however.

The one option I know that doesn't have corner cases is to invoke '/sbin/initctl version' and check its return code. But I leave it up to the MAAS team whether to implement this vs. a lighter-weight check.
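
A rough sketch of the return-code check described above follows. It is an illustration under stated assumptions rather than a tested MAAS change: it assumes '/sbin/initctl version' exits non-zero when it cannot talk to a running upstart, and upstart_is_running() and its command parameter are hypothetical names, the parameter existing only so the function can be tried with a stand-in command.

```python
import subprocess


def upstart_is_running(command=("/sbin/initctl", "version")):
    """Return True when the given command exits with status 0."""
    try:
        status = subprocess.call(
            list(command),
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
    except OSError:
        # The initctl binary is missing, so upstart cannot be the init system.
        return False
    return status == 0


def get_init_system():
    """Returns 'upstart' or 'systemd'."""
    return "upstart" if upstart_is_running() else "systemd"
```

The cost is one extra process spawn per call, which is presumably what "lighter-weight check" refers to. Caching the result at start-up would be one way to limit that.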

Changed in maas (Ubuntu):
status: New → Won't Fix
Changed in maas (Ubuntu Trusty):
status: New → Triaged
importance: Undecided → Critical
Andres Rodriguez (andreserl) wrote :

On Wed, Dec 13, 2017 at 6:00 PM Robie Basak <email address hidden>
wrote:

> On Wed, Dec 13, 2017 at 10:34:25PM -0000, Andres Rodriguez wrote:
> > FWIW, this is currently affecting customers who are running MAAS and
> > require livepatch.
>
> It's been affecting users since January, no?

This hasn't been affecting users since January. This bug was reported
in November and only affects users running MAAS who one way or another
installed systemd. In this particular case, in November a customer
installed livepatch on a system, hence the issue.

> Why the sudden urgency?
> What difference will a week or two make?
>
> > Comments #11 and #12 above confirm that the patch is enough for the MAAS
> > needs. Whichever way MAAS decides to check for systemd is up to MAAS and
> > that is not a reason to block an SRU provided that it does not impact
> > any other piece of software. That said, this patch does not
> > introduce a regression to MAAS nor any other software.
>
> I think that's quite a brave claim to make. I'm sure "does not
> introduce a regression" was a claim that might have been made in the
> systemd SRU that regressed this too, and yet here we are.

There is no supported way on Ubuntu Trusty (nor package in the archive)
that will create a symlink of /sbin/init to systemd. This is only done by
the systemd-sysv package, which is only available in Xenial. So, since
systemd is not supported as an init system in trusty and this would only
happen if a user did it manually, this doesn't introduce any
regressions in MAAS. So it is not a brave claim to make, it is a claim
based on facts.

>
>
> > Lastly, this patch is *only* for 1.9 as this code path is only available
> > in Trusty, so upgrades to later Ubuntu releases will yield on using a
> > newer version of MAAS that doesn't rely on this code path.
>
> If we did decide to SRU an emergency fix as a stopgap for MAAS' use
> case, and it's for Trusty only, then why have a test at all? Can we just
> return 'upstart' without a test?
>
> To be clear, I'm not demanding or even requesting anything specific
> right now. I don't feel that a case for urgency has yet been made, given
> the currently known regression timeline. In the meantime I think it's
> worth understanding how we want to fix this properly, because requiring
> multiple SRUs while we swing back and forth is bad for everyone, and
> equally I don't want to see us locked into a suboptimal solution.
>
> If a case is accepted on the basis of urgency, or another SRU team
> member disagrees with my assessment, I wouldn't want to block that. Feel
> free to land what you think is appropriate.

Robie Basak (racb) wrote :

> I would accept a version of this SRU that hard-codes the choice of upstart as the init system on 14.04, because that is the only init system supported in that version of Ubuntu.

OK, that sounds like a reasonable way forward. I've rejected the current upload in the queue, and I believe Andres has agreed to upload a revised version shortly.

Robie Basak (racb) wrote :

We concluded that this isn't a snapd bug, except in so far as it depends on systemd.

Changed in snapd (Ubuntu Trusty):
status: New → Invalid
Andres Rodriguez (andreserl) wrote :

I've uploaded the new package. I've tested an upgrade to Xenial to ensure there are no issues and confirm it is good to go.

Hello Victor, or anyone else affected,

Accepted maas into trusty-proposed. The package will build now and be available at https://launchpad.net/ubuntu/+source/maas/1.9.5+bzr4599-0ubuntu1~14.04.3 in a few hours, and then in the -proposed repository.

Please help us by testing this new package. See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Your feedback will aid us getting this update out to other Ubuntu users.

If this package fixes the bug for you, please add a comment to this bug, mentioning the version of the package you tested and change the tag from verification-needed-trusty to verification-done-trusty. If it does not fix the bug for you, please add a comment stating that, and change the tag to verification-failed-trusty. In either case, without details of your testing we will not be able to proceed.

Further information regarding the verification process can be found at https://wiki.ubuntu.com/QATeam/PerformingSRUVerification . Thank you in advance!

Changed in maas (Ubuntu Trusty):
status: Triaged → Fix Committed
tags: added: verification-needed verification-needed-trusty
Ivan Hitos (ivanhitos) wrote :

# VERIFICATION FOR TRUSTY

- Packages
-----------------------------------
# dpkg -l|grep maas
ii maas 1.9.5+bzr4599-0ubuntu1~14.04.3 all MAAS server all-in-one metapackage
ii maas-cli 1.9.5+bzr4599-0ubuntu1~14.04.3 all MAAS command line API tool
ii maas-cluster-controller 1.9.5+bzr4599-0ubuntu1~14.04.3 all MAAS server cluster controller
ii maas-common 1.9.5+bzr4599-0ubuntu1~14.04.3 all MAAS server common files
ii maas-dhcp 1.9.5+bzr4599-0ubuntu1~14.04.3 all MAAS DHCP server
ii maas-dns 1.9.5+bzr4599-0ubuntu1~14.04.3 all MAAS DNS server
ii maas-proxy 1.9.5+bzr4599-0ubuntu1~14.04.3 all MAAS Caching Proxy
ii maas-region-controller 1.9.5+bzr4599-0ubuntu1~14.04.3 all MAAS server complete region controller
ii maas-region-controller-min 1.9.5+bzr4599-0ubuntu1~14.04.3 all MAAS Server minimum region controller
ii python-django-maas 1.9.5+bzr4599-0ubuntu1~14.04.3 all MAAS server Django web framework
ii python-maas-client 1.9.5+bzr4599-0ubuntu1~14.04.3 all MAAS python API client
ii python-maas-provisioningserver 1.9.5+bzr4599-0ubuntu1~14.04.3 all MAAS server provisioning libraries

- Tests
-----------------------------------
Update maas from proposed. Add to /etc/apt/sources.list:
deb [arch=amd64] http://archive.ubuntu.com/ubuntu trusty-proposed multiverse restricted main universe
deb-src [arch=amd64] http://archive.ubuntu.com/ubuntu trusty-proposed multiverse restricted main universe

- DNS+DHCP do work as expected (monitor does not ask systemd)
- Try enlisting -> works
- Try commissioning -> works
- Try deployment -> works
- Try deleting -> works
- Verified that both upgrade and install works:
# apt update + upgrade
# apt install maas

Victor Tapia (vtapia) on 2018-01-16
tags: added: verification-done verification-done-trusty
removed: verification-needed verification-needed-trusty
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package maas - 1.9.5+bzr4599-0ubuntu1~14.04.3

---------------
maas (1.9.5+bzr4599-0ubuntu1~14.04.3) trusty-proposed; urgency=medium

  * Stable Release Update.
    - debian/patches/harcode-upstart-lp1732703.patch: Running snapd or
      livepatch in Trusty installs systemd. Due to a systemd regressions,
      this causes MAAS to incorrectly detect the init system. As such,
      hardcode the init system to upstart (as systemd is not supported
      in Ubuntu) (LP: #1732703).

 -- Andres Rodriguez <email address hidden> Wed, 13 Dec 2017 18:09:11 -0500

Changed in maas (Ubuntu Trusty):
status: Fix Committed → Fix Released

The verification of the Stable Release Update for maas has completed successfully and the package has now been released to -updates. Subsequently, the Ubuntu Stable Release Updates Team is being unsubscribed and will not receive messages about this bug report. In the event that you encounter a regression using the package from -updates please report a new bug using ubuntu-bug and tag the bug report regression-update so we can easily find any regressions.

Launchpad Janitor (janitor) wrote :

Status changed to 'Confirmed' because the bug affects multiple users.

Changed in systemd (Ubuntu Trusty):
status: New → Confirmed
Changed in systemd (Ubuntu):
status: New → Confirmed