jujud does not restart after upgrade-juju on systemd hosts

Bug #1452511 reported by Menno Finlay-Smits
This bug affects 1 person
Affects     Status         Importance   Assigned to   Milestone
juju-core   Fix Released   Critical     Eric Snow
1.23        Fix Released   Critical     Eric Snow
1.24        Fix Released   Critical     Eric Snow

Bug Description

When "juju upgrade-juju" is issued, agents will download and install the new version and then exit in order to restart into the new tools version. Unfortunately, something about juju's systemd configuration means that jujud is never restarted.

Example (with current juju 1.23):

$ juju bootstrap
...
$ juju upgrade-juju --upload-tools
available tools:
    1.23.3.2-vivid-amd64
best version:
    1.23.3.2
# wait for the agent to exit...

$ sudo systemctl list-units 'juju-agent*'
UNIT LOAD ACTIVE SUB DESCRIPTION
juju-agent-menno-local.service loaded active exited juju agent for machine-0
...

# note that the service is marked as "exited". It never comes back.
# pgrep and "juju status" confirm that jujud is not running.

$ sudo systemctl start 'juju-agent*'
$ sudo systemctl list-units 'juju-agent*'
UNIT LOAD ACTIVE SUB DESCRIPTION
juju-agent-menno-local.service loaded active exited juju agent for machine-0
...

# "systemctl start" doesn't help either

$ sudo systemctl restart 'juju-agent*'
$ sudo systemctl list-units 'juju-agent*'
UNIT LOAD ACTIVE SUB DESCRIPTION
juju-agent-menno-local.service loaded active running juju agent for machine-0

# "systemctl restart" does the trick

$ juju status
environment: local
machines:
  "0":
    agent-state: started
    agent-version: 1.23.3.2
    dns-name: localhost
    instance-id: localhost
    series: vivid
    state-server-member-status: has-vote
services: {}

I'm not sure what the problem is, but I would start by looking at the "RemainAfterExit=yes" config option.

Menno Finlay-Smits (menno.smits) wrote :

Note that a similar problem happens if I just kill the agent using the kill command. systemd fails to restart the agent in this case. Under upstart the agent would get restarted if this happened.

Curtis Hovey (sinzui)
Changed in juju-core:
status: New → Triaged
Changed in juju-core:
status: Triaged → In Progress
assignee: nobody → Eric Snow (ericsnowcurrently)
Eric Snow (ericsnowcurrently) wrote :

For upstart we use the "respawn" stanza. This attempts to restart the service (up to 10 times) if the process exited due to an error (non-0 exit code or "error" signal). The systemd equivalent is "Restart=on-failure". We use "Restart=always", though it isn't clear why. "always" will restart the service even for a 0 exit code, as well as for SIGHUP, SIGINT, SIGTERM, and SIGPIPE. From what I can tell there aren't any other directives that affect restart.

I'm going to investigate some more.

For reference:

https://wiki.ubuntu.com/SystemdForUpstartUsers#Job_vs._unit_keywords
http://upstart.ubuntu.com/cookbook/#respawn
http://www.freedesktop.org/software/systemd/man/systemd.service.html#Restart=
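To see which restart-related directives a unit file actually sets, something like the following may help. It is only a sketch: the helper name unit_directive and the unit file path are assumptions made for illustration, and juju may keep its generated unit file somewhere else.

```shell
# Print the value of a directive from a unit file (the last assignment wins).
# $1 = path to the unit file, $2 = directive name, e.g. Restart
unit_directive() {
    sed -n "s/^[[:space:]]*$2[[:space:]]*=[[:space:]]*//p" "$1" | tail -n 1
}

# Assumed location of the agent's unit file; adjust for the host in question.
unit=/etc/systemd/system/juju-agent-menno-local.service

# Only report if the file is readable, so the snippet is harmless elsewhere.
if [ -r "$unit" ]; then
    for directive in Restart RemainAfterExit; do
        printf '%s=%s\n' "$directive" "$(unit_directive "$unit" "$directive")"
    done
fi
```

On a live host, "systemctl show -p Restart -p RemainAfterExit <unit>" should report the same two settings as systemd itself sees them.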

Eric Snow (ericsnowcurrently) wrote :

@Menno: I think you're right about RemainAfterExit.

Eric Snow (ericsnowcurrently) wrote :

Yep, RemainAfterExit was the problem. I'm not sure why it was there in the first place. I expect it came from an example I was using as a reference when first working on systemd support in juju. Removing the directive entirely gives us the auto-restart behavior we're expecting.

http://reviews.vapour.ws/r/1616/
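For illustration, a unit file with the directive removed might look roughly like the one written out below. This is a hypothetical sketch, not the file juju generates: the file name, the ExecStart path, and the [Unit]/[Install] entries are placeholders. Only "Restart=always" and the absence of "RemainAfterExit" reflect what is described in this report.

```shell
# Write a hypothetical example unit to a scratch location (not a real juju unit).
unit_file="${TMPDIR:-/tmp}/juju-agent-example.service"

cat > "$unit_file" <<'EOF'
[Unit]
Description=juju agent for machine-0
After=network.target

[Service]
# Placeholder path; the real agent start command is not shown in this report.
ExecStart=/path/to/jujud-start-script
# Restart the agent whenever it exits, including the clean exit after upgrade-juju.
Restart=always
# Deliberately no RemainAfterExit=yes: with it set, the unit stayed
# "active (exited)" after jujud quit and the agent was never started again.

[Install]
WantedBy=multi-user.target
EOF

# Count uncommented RemainAfterExit assignments; none are expected.
remain_count=$(grep -c '^RemainAfterExit=' "$unit_file" || true)
echo "RemainAfterExit assignments: $remain_count"
```

The comment lines inside the unit are ordinary systemd comments (a full line starting with "#"), so they do not affect how the file is parsed.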

Changed in juju-core:
status: In Progress → Fix Committed
Curtis Hovey (sinzui)
Changed in juju-core:
status: Fix Committed → Fix Released