I believe I have figured this out, and it's not a problem with Mailman. spm helped me greatly with examining the running production system on forster, and here is what I believe is happening.

Mailman is actually working exactly as expected. The fact that you get a "Starting Mailman" and "Shutting down Mailman" message with a traceback in between is a red herring, as is the fact that you see "mailman" in the pid file name in the traceback.

What is happening is this. The initscript calls "make start", which calls bin/run, which calls runlaunchpad.py, which examines the config files to determine which services to start. In the production-mailman/launchpad-lazr.conf file, we tell it to start Mailman, which it does perfectly. Then something goes wrong[1] and the normal Launchpad shutdown procedure takes over, which correctly shuts down Mailman. Mailman does exactly the right thing here.

Let's look at that traceback again:

    Traceback (most recent call last):
      File "runlaunchpad.py", line 60, in ?
        run()
      File "runlaunchpad.py", line 56, in run
        start_launchpad(argv)
      File "/srv/lists.launchpad.net/production/launchpad-rev-7667/lib/canonical/launchpad/scripts/runlaunchpad.py", line 237, in start_launchpad
        make_pidfile('launchpad')
      File "/srv/lists.launchpad.net/production/launchpad-rev-7667/utilities/../lib/canonical/lazr/pidfile.py", line 34, in make_pidfile
        raise RuntimeError("PID file %s already exists. Already running?" %
    RuntimeError: PID file /srv/launchpad.net/var/production-mailman-launchpad.pid already exists. Already running?

You think Mailman's involved because you see 'production-mailman-launchpad.pid' there, but it's not! That pid file is named after the service ('launchpad') and the LPCONFIG variable and config directory that's being used, which for forster is... production-mailman. In fact, Mailman's pid file is managed by mailmanctl, not by lazr/pidfile.py, so this cannot be referring to Mailman's pid file. It's referring to a Launchpad instance.
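To make the naming point concrete, here is a minimal sketch of how a pid file named after the config plus the service ends up with "mailman" in its name, and why a leftover file produces exactly this error. This is not the real lazr/pidfile.py; pidfile_path() and its arguments are hypothetical, and the "<config>-<service>.pid" pattern is only inferred from the path in the traceback.

```python
import os
import tempfile


def pidfile_path(var_dir, config_name, service_name):
    # Assumed naming scheme, inferred from the traceback:
    # <var_dir>/<config_name>-<service_name>.pid
    return os.path.join(var_dir, "%s-%s.pid" % (config_name, service_name))


def make_pidfile(path):
    # Refuse to start if a pid file is already there, whether the
    # process that wrote it is still running or crashed long ago.
    if os.path.exists(path):
        raise RuntimeError(
            "PID file %s already exists. Already running?" % path)
    with open(path, "w") as pidfile:
        pidfile.write(str(os.getpid()))


if __name__ == "__main__":
    var_dir = tempfile.mkdtemp()
    path = pidfile_path(var_dir, "production-mailman", "launchpad")
    make_pidfile(path)      # first start: succeeds
    try:
        make_pidfile(path)  # second start with the old file still present
    except RuntimeError as error:
        print(error)
```

With the config name "production-mailman" and the service name "launchpad", the file name contains "mailman" even though Mailman never touches it.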
spm confirmed this by cat'ing two pid files on forster:

  * /srv/lists.launchpad.net/var/mailman/data/master-qrunner.pid pointed to the mailmanctl master qrunner, humming along perfectly.
  * /srv/launchpad.net/var/production-mailman-launchpad.pid pointed to a running bin/run -i process, in other words, a running Zope instance.

So 'make start' appears to start both Mailman and an appserver, and it's the latter that fails because of the pre-existing pid file.

A little spelunking in runlaunchpad.py and Zope seems to indicate that an appserver is unconditionally started by 'make start'. There appears to be no way to prevent that, so if a previous crash left a stale Zope pid file behind, you would see exactly the error we're seeing. Mailman is nicely cleaning up its trash, but Launchpad/Zope isn't :)

To fix this, I think we need to modify runlaunchpad.py, inside start_launchpad(), so that Zope's main() isn't called unconditionally. It should probably consult a config file option before it starts Zope. Then we would update the production-mailman/launchpad-lazr.conf file to disable the appserver.

I'm kicking this over to Francis since it seems like more of a Foundations issue. Francis, if you just want to verify the analysis and have me do the work to fix it, kick it back my way. I don't think it's a lot of work to fix.
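For illustration only, the proposed fix could look roughly like the sketch below. It does not use the real runlaunchpad.py or lazr config API; the [mailman] and [appserver] sections, the "launch" option, and services_to_start() are all hypothetical names, and the standard-library configparser stands in for the real config machinery. The appserver defaults to on, so existing configs would keep their current behaviour.

```python
import configparser


def services_to_start(config_text):
    # Hypothetical config layout: one section per service, each with a
    # boolean "launch" option.
    parser = configparser.ConfigParser()
    parser.read_string(config_text)
    services = []
    # Mailman is only started when the config asks for it.
    if parser.getboolean("mailman", "launch", fallback=False):
        services.append("mailman")
    # The appserver is started unless the config explicitly disables it.
    if parser.getboolean("appserver", "launch", fallback=True):
        services.append("appserver")
    return services


if __name__ == "__main__":
    mailman_only = "[mailman]\nlaunch: true\n\n[appserver]\nlaunch: false\n"
    print(services_to_start(mailman_only))
```

Under these assumptions, a Mailman-only config such as production-mailman would set the appserver option to false, and no Launchpad pid file would be created on that host at all.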