ovn-host service stop doesn't clean up all ovn-controller pids

Bug #1913736 reported by Corey Bryant
Affects: ovn (Ubuntu)
Status: New
Importance: Undecided
Assigned to: Unassigned
Milestone: (none listed)

Bug Description

I noticed this issue when the ovn-chassis charm's pause/resume tests failed. Pause stops the ovn-host service, and the test then checks whether any ovn-controller pids still exist after the stop. They did.
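For anyone wanting to reproduce the check by hand, here is a minimal sketch of that kind of "are there any ovn-controller pids left" test. It is not the charm's actual test code; the function names are made up for illustration, and it assumes `pgrep` is available on the unit.

```python
import subprocess
from typing import List


def parse_pids(pgrep_output: str) -> List[int]:
    """Turn pgrep's one-PID-per-line output into a list of ints."""
    return [int(token) for token in pgrep_output.split()]


def ovn_controller_pids() -> List[int]:
    """Return PIDs of processes whose name is exactly 'ovn-controller'.

    Hypothetical helper, not taken from the charm. pgrep exits with
    status 1 when nothing matches, so a non-zero return code is not
    treated as an error here.
    """
    result = subprocess.run(
        ["pgrep", "-x", "ovn-controller"],
        capture_output=True,
        text=True,
        check=False,
    )
    return parse_pids(result.stdout)
```

After `systemctl stop ovn-host`, an empty list from `ovn_controller_pids()` would mean the stop cleaned up properly. Note that `-x` matches on the process name, so the `--monitor` parent, which systemd reports under the name `monitor`, would need a separate check.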

Stopping and starting the ovn-host service works fine if openvswitch-switch hasn't been restarted: the ovn-controller pids get cleaned up when ovn-host is stopped. For example:

https://paste.ubuntu.com/p/gq9VXWN25y/
The same details are in the attached stop-start-good.txt file.

However, if openvswitch-switch is restarted before ovn-host is stopped and started, the ovn-controller pids don't seem to get cleaned up. For example:

https://paste.ubuntu.com/p/Tn6wzY7gWH/
The same details are in the attached stop-start-bad.txt file.

I haven't looked into it much, but I wonder whether this is an issue in the upstream utilities/ovn-ctl script.

Corey Bryant (corey.bryant) wrote :
description: updated
Corey Bryant (corey.bryant) wrote :

It seems this will leak more processes every time openvswitch-switch gets restarted. For example (the same details are in https://paste.ubuntu.com/p/jHs5SmZVPS/):

Note the message: Jan 29 13:54:30 juju-42c75d-zaza-ab85a8c30e4d-1 systemd[1]: ovn-host.service: Found left-over process 26159 (ovn-controller) in control group while starting unit. Ignoring.

● ovn-host.service - LSB: OVN host components
   Loaded: loaded (/etc/init.d/ovn-host; generated)
   Active: active (running) since Fri 2021-01-29 13:54:30 UTC; 12s ago
     Docs: man:systemd-sysv-generator(8)
  Process: 28042 ExecStop=/etc/init.d/ovn-host stop (code=exited, status=0/SUCCESS)
  Process: 28057 ExecStart=/etc/init.d/ovn-host start (code=exited, status=0/SUCCESS)
    Tasks: 15 (limit: 2361)
   CGroup: /system.slice/ovn-host.service
           ├─26158 ovn-controller: monitoring pid 26159 (healthy)
           ├─26159 ovn-controller unix:/var/run/openvswitch/db.sock -vconsole:emer -vsyslog:err -vfile:info --no-chdir --log-file=/var/log/openvswitch/ovn-controller.log --pidfile=/var/run/openvswitch/ovn-controller.pid --detach --monitor
           ├─27816 ovn-controller: monitoring pid 27817 (healthy)
           ├─27817 ovn-controller unix:/var/run/openvswitch/db.sock -vconsole:emer -vsyslog:err -vfile:info --no-chdir --log-file=/var/log/openvswitch/ovn-controller.log --pidfile=/var/run/openvswitch/ovn-controller.pid --detach --monitor
           ├─28073 ovn-controller: monitoring pid 28074 (healthy)
           └─28074 ovn-controller unix:/var/run/openvswitch/db.sock -vconsole:emer -vsyslog:err -vfile:info --no-chdir --log-file=/var/log/openvswitch/ovn-controller.log --pidfile=/var/run/openvswitch/ovn-controller.pid --detach --monitor

Jan 29 13:54:30 juju-42c75d-zaza-ab85a8c30e4d-1 systemd[1]: This usually indicates unclean termination of a previous run, or service implementation deficiencies.
Jan 29 13:54:30 juju-42c75d-zaza-ab85a8c30e4d-1 systemd[1]: ovn-host.service: Found left-over process 26159 (ovn-controller) in control group while starting unit. Ignoring.
Jan 29 13:54:30 juju-42c75d-zaza-ab85a8c30e4d-1 systemd[1]: This usually indicates unclean termination of a previous run, or service implementation deficiencies.
Jan 29 13:54:30 juju-42c75d-zaza-ab85a8c30e4d-1 systemd[1]: ovn-host.service: Found left-over process 27816 (monitor) in control group while starting unit. Ignoring.
Jan 29 13:54:30 juju-42c75d-zaza-ab85a8c30e4d-1 systemd[1]: This usually indicates unclean termination of a previous run, or service implementation deficiencies.
Jan 29 13:54:30 juju-42c75d-zaza-ab85a8c30e4d-1 systemd[1]: ovn-host.service: Found left-over process 27817 (ovn-controller) in control group while starting unit. Ignoring.
Jan 29 13:54:30 juju-42c75d-zaza-ab85a8c30e4d-1 systemd[1]: This usually indicates unclean termination of a previous run, or service implementation deficiencies.
Jan 29 13:54:30 juju-42c75d-zaza-ab85a8c30e4d-1 systemd[1]: Starting LSB: OVN host components...
Jan 29 13:54:30 juju-42c75d-zaza-ab85a8c30e4d-1 ovn-host[28057]: * Starting ovn-controller
Jan 29 13:54:30 juju-42c75d-zaza-ab85a8c30e4d-1 systemd[1]: Started LSB: OVN host components.
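The left-over processes can also be pulled out of the journal text mechanically, which may help when comparing runs. This is only a rough sketch: the function name and regex are illustrative, and the pattern is based solely on the systemd message wording shown in the output above.

```python
import re
from typing import List, Tuple

# Matches systemd messages such as:
#   "... Found left-over process 26159 (ovn-controller) in control group ..."
LEFTOVER_RE = re.compile(
    r"Found left-over process (\d+) \(([^)]+)\) in control group"
)


def leftover_processes(journal_text: str) -> List[Tuple[int, str]]:
    """Return (pid, process name) pairs for each left-over process message."""
    return [(int(pid), name) for pid, name in LEFTOVER_RE.findall(journal_text)]
```

Feeding it the journal lines from `systemctl status ovn-host` or `journalctl -u ovn-host` gives one pair per leaked process, including the `monitor` parents.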
