service restart fails due to TIME_WAIT and lack of SO_REUSEADDR

Bug #1975880 reported by Clifford Heath
This bug affects 1 person
Affects             Status  Importance  Assigned to  Milestone
mosquitto (Ubuntu)  New     Undecided   Unassigned

Bug Description

Ubuntu 18.04 LTS, clean install from the Azure VM image.
mosquitto:
  Installed: 1.4.15-2ubuntu0.18.04.3
  Candidate: 1.4.15-2ubuntu0.18.04.3
  Version table:
 *** 1.4.15-2ubuntu0.18.04.3 500
        500 http://azure.archive.ubuntu.com/ubuntu bionic-updates/universe amd64 Packages
        500 http://azure.archive.ubuntu.com/ubuntu bionic-security/universe amd64 Packages
        100 /var/lib/dpkg/status
     1.4.15-2 500
        500 http://azure.archive.ubuntu.com/ubuntu bionic/universe amd64 Packages

$ telnet localhost 1883
Trying 127.0.0.1...
Connected to localhost.
[exit]
$ service mosquitto restart
$ telnet localhost 1883
Trying 127.0.0.1...
telnet: Unable to connect to remote host: Connection refused

[ First problem here: the service has not restarted. The log says "Address in use"]

$ service mosquitto start
$ telnet localhost 1883
Trying 127.0.0.1...
telnet: Unable to connect to remote host: Connection refused

[ Second problem: "service start" does nothing, apparently because it thinks the service was already restarted, even though there is no pid file. No new "Address in use" message appears in the log ]

$ sleep 120
$ service mosquitto start
$ telnet localhost 1883
Trying 127.0.0.1...
telnet: Unable to connect to remote host: Connection refused

[ Third problem: even after the TIME_WAIT interval has passed, "service start" does not bring the service back ]

$ service mosquitto restart
$ telnet localhost 1883
Trying 127.0.0.1...
Connected to localhost.
...

[ Stopping and restarting the service does work, but only after the TIME_WAIT interval ]

systemd seems to be confused into thinking that it doesn't need to start the server, even though the previous attempt to start it failed.

Using "service stop" then start, or "service restart" gets it out of this confusion, even though there was no service to stop.

When the service exits, the port is left in the TIME_WAIT state for the 2-minute timeout.
An attempt to restart the service during that time fails: the new socket is created without setsockopt(SO_REUSEADDR), so the port cannot yet be bound again.
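
For illustration, here is a minimal sketch of a listener that sets SO_REUSEADDR before bind(). This is not mosquitto's actual code: the function name open_listener and its reuse_addr parameter are hypothetical, and it handles IPv4 only. It just shows the option that, as far as I understand, would let bind() succeed while old connections on the port are still in TIME_WAIT.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper, not taken from mosquitto: open a TCP listener on
 * the given port, optionally setting SO_REUSEADDR before bind().
 * Returns the listening descriptor, or -1 on failure. */
int open_listener(uint16_t port, int reuse_addr)
{
    int fd = socket(AF_INET, SOCK_STREAM, 0);
    if (fd < 0)
        return -1;

    if (reuse_addr) {
        int on = 1;
        /* Has to be set before bind(). On Linux it allows bind() to
         * succeed while earlier connections on this port are still in
         * TIME_WAIT, as long as no other socket is listening on it. */
        if (setsockopt(fd, SOL_SOCKET, SO_REUSEADDR, &on, sizeof on) < 0) {
            close(fd);
            return -1;
        }
    }

    struct sockaddr_in addr;
    memset(&addr, 0, sizeof addr);
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_ANY);
    addr.sin_port = htons(port);

    if (bind(fd, (struct sockaddr *)&addr, sizeof addr) < 0 ||
        listen(fd, SOMAXCONN) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}
```

Calling open_listener(1883, 1) corresponds to the behaviour I would expect from the broker; open_listener(1883, 0) corresponds to what the packaged version appears to do.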

A google search says SO_REUSEADDR was removed in 2013 "because it worked differently under Windows".

I don't know if it would have helped, but it appears that mosquitto is using an /etc/init.d/mosquitto script, and not a .service unit file under /etc/systemd/system/

This means it was not built with the configuration WITH_SYSTEMD, contrary to the advice here:
<https://sources.debian.org/src/mosquitto/2.0.11-1.1/service/systemd/README/>

Gianfranco Costamagna (costamagnagianfranco) wrote :

This might no longer be an issue in mantic: 2.0.18 is built with WITH_SYSTEMD=ON, at least on Linux.
