jujud does not restart after upgrade-juju on systemd hosts

Bug #1452511 reported by Menno Finlay-Smits on 2015-05-07
This bug affects 1 person
Affects      Importance  Assigned to
juju-core    Critical    Eric Snow
1.23         Critical    Eric Snow
1.24         Critical    Eric Snow

Bug Description

When "juju upgrade-juju" is issued, agents will download and install the new version and then exit in order to restart into the new tools version. Unfortunately, something about juju's systemd configuration means that jujud is never restarted.

Example (with current juju 1.23):

$ juju bootstrap
...
$ juju upgrade-juju --upload-tools
available tools:
    1.23.3.2-vivid-amd64
best version:
    1.23.3.2
# wait for the agent to exit...

$ sudo systemctl list-units 'juju-agent*'
UNIT LOAD ACTIVE SUB DESCRIPTION
juju-agent-menno-local.service loaded active exited juju agent for machine-0
...

# note that the service is marked as "exited". It never comes back.
# pgrep and "juju status" confirm that jujud is not running.

$ sudo systemctl start 'juju-agent*'
$ sudo systemctl list-units 'juju-agent*'
UNIT LOAD ACTIVE SUB DESCRIPTION
juju-agent-menno-local.service loaded active exited juju agent for machine-0
...

# "systemctl start" doesn't help either

$ sudo systemctl restart 'juju-agent*'
$ sudo systemctl list-units 'juju-agent*'
UNIT LOAD ACTIVE SUB DESCRIPTION
juju-agent-menno-local.service loaded active running juju agent for machine-0

# "systemctl restart" does the trick

$ juju status
environment: local
machines:
  "0":
    agent-state: started
    agent-version: 1.23.3.2
    dns-name: localhost
    instance-id: localhost
    series: vivid
    state-server-member-status: has-vote
services: {}

I'm not sure what the problem is but I would start with the "RemainAfterExit=yes" config option.

Note that a similar problem happens if I just kill the agent using the kill command. systemd fails to restart the agent in this case. Under upstart the agent would get restarted if this happened.
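
A quick way to check the two directives mentioned above is to pull them out of the agent's unit file. The helper below is only a sketch: the function name `show_restart_settings` is made up for illustration, and the location of the juju-generated unit file is not given in this report, so the path has to be supplied by whoever runs it.

```shell
# Print the Restart= and RemainAfterExit= settings found in a unit file.
# Hypothetical helper for illustration; it is not part of juju or systemd.
# Usage: show_restart_settings /path/to/unit.service
show_restart_settings() {
    unit_file="$1"
    # Match only uncommented directive lines that start at column 0.
    grep -E '^(Restart|RemainAfterExit)=' "$unit_file"
}
```

On a live host, `systemctl show -p Restart -p RemainAfterExit <unit>` should report the values systemd actually loaded, which avoids guessing where the unit file lives.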

Curtis Hovey (sinzui) on 2015-05-07
Changed in juju-core:
status: New → Triaged
Changed in juju-core:
status: Triaged → In Progress
assignee: nobody → Eric Snow (ericsnowcurrently)
Eric Snow (ericsnowcurrently) wrote :

For upstart we use the "respawn" stanza. This attempts to restart the service (up to 10 times) if the process exited due to an error (non-0 exit code or "error" signal). The closest systemd equivalent is "Restart=on-failure". We use "Restart=always", though it isn't clear why. "always" will restart the service even for a 0 exit code, as well as for SIGHUP, SIGINT, SIGTERM, and SIGPIPE. From what I can tell, there aren't any other directives that affect restart.

I'm going to investigate some more.

For reference:

https://wiki.ubuntu.com/SystemdForUpstartUsers#Job_vs._unit_keywords
http://upstart.ubuntu.com/cookbook/#respawn
http://www.freedesktop.org/software/systemd/man/systemd.service.html#Restart=

Eric Snow (ericsnowcurrently) wrote :

@Menno: I think you're right about RemainAfterExit.

Eric Snow (ericsnowcurrently) wrote :

Yep, RemainAfterExit was the problem. I'm not sure why it was there in the first place. I expect it came from an example I was using as a reference when first working on systemd support in juju. Removing the directive entirely gives us the auto-restart behavior we're expecting.

http://reviews.vapour.ws/r/1616/
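
For readers who want to see the shape of the change, here is a rough sketch of a unit file with the directive removed. It is not the file juju generates: the `ExecStart` path is a placeholder, and only the description string and the `Restart=always` setting come from this report. According to the systemd.service documentation, `RemainAfterExit=yes` makes a service count as active even after all its processes have exited. That would explain both the "active exited" state shown in the description and why "systemctl start" did nothing.

```ini
# Hypothetical unit file for illustration; not the file juju generates.
[Unit]
Description=juju agent for machine-0

[Service]
# Placeholder command; the real jujud invocation is not shown in this report.
ExecStart=/path/to/jujud-start-script
# Restart the agent whenever it exits, including a clean exit after an upgrade.
Restart=always
# RemainAfterExit=yes used to be set here and was removed by the fix.
# With it set, systemd kept treating the unit as active after jujud exited,
# so the restart logic never ran.

[Install]
WantedBy=multi-user.target
```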

Changed in juju-core:
status: In Progress → Fix Committed
Curtis Hovey (sinzui) on 2015-05-20
Changed in juju-core:
status: Fix Committed → Fix Released