squid3 gets killed at startup with dnsmasq

Bug #978356 reported by Marco Menardi on 2012-04-10
This bug affects 13 people
Affects             Status         Importance   Assigned to   Milestone
Squid               New            Medium
dnsmasq (Ubuntu)    Invalid        Undecided    Unassigned
  Precise           Invalid        Undecided    Unassigned
squid3 (Ubuntu)     Fix Released   Medium       Unassigned
  Precise           Fix Released   Medium       Unassigned

Bug Description

[ Test case ]
This is difficult to test as it is a race condition, however there are multiple users affected.

1. install squid3 on a system which has a system level network connection
2. reboot
3. on bootup, check to see if squid3 is running (service squid3 status). If it is not running, check /var/log/syslog and the system console for a message about squid3 being killed by SIGHUP.
4. Install updated package
5. Repeat step 3. If squid3 is running, this *might* be fixed (but as it is a race, there are no guarantees).
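
The log check in step 3 can be scripted. The following is a minimal sketch, assuming the upstart message has the same format as the dmesg line quoted in this report; `squid_killed_by_hup` is a hypothetical helper name, not part of any package.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the given log file contains upstart's
# message that the squid3 main process was killed by SIGHUP.
squid_killed_by_hup() {
    # $1: log file to scan, for example /var/log/syslog
    grep -q 'init: squid3 main process ([0-9]*) killed by HUP signal' "$1"
}
```

After a reboot it could be used as `squid_killed_by_hup /var/log/syslog && echo "squid3 was killed by HUP"`. Because the bug is a race, a clean result on one boot does not prove the problem is gone.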

[ Regression Potential ]
This update only touches the upstart job, so it will only affect that. Upstart's respawn code is fairly conservative, and the upstart job is fairly straightforward. A few reboots with the update installed should be enough to show that this update is regression-free.

-------------

I'm running 12.04 (32-bit PAE) and have removed NetworkManager (aptitude purge), since I'm using the machine as an LTSP server.
squid3 3.1.19-1ubuntu1, dnsmasq 2.59-4
Linux gnusc 3.2.0-22-generic-pae #35-Ubuntu SMP Tue Apr 3 20:37:36 UTC 2012 i686 athlon i386 GNU/Linux

I installed dnsmasq and squid3, and noticed that squid3 was never running after boot.
dmesg shows:
root@gs1204:~# dmesg | grep squid
[ 20.227964] init: squid3 main process (1310) killed by HUP signal

If I remove dnsmasq (aptitude purge), squid has no problem. I don't know whether this is related to the recent mechanism by which resolv.conf is automatically updated by dnsmasq.
For now I've set up a workaround that starts squid3 from rc.local, but of course that is not a clean solution.

James Page (james-page) on 2012-04-12
Changed in squid3 (Ubuntu):
importance: Undecided → Low
Launchpad Janitor (janitor) wrote :

Status changed to 'Confirmed' because the bug affects multiple users.

Changed in squid3 (Ubuntu):
status: New → Confirmed

I experience the same issue, but have NetworkManager installed - although it does not manage the eth0 interface (I am using guessnet for that).

summary: - squid3 crash at startup with dnsmasq and no networkmanager
+ squid3 gets killed at startup with dnsmasq and no networkmanager
Clint Byrum (clint-fewbar) wrote :

So I think what is happening is that the network interface is coming up before squid has installed a handler for SIGHUP.

*tons* of stuff happens between squid starting, and squid installing its signal handlers. I haven't gone through all of the code but I'm betting at least one or two stall on not having a network yet. It would seem that squid should probably ignore SIGHUP almost immediately on startup to avoid this issue.

I've pushed up a branch that makes that change, and I'll put up packages in a PPA soon that affected users can test.
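
The idea of ignoring SIGHUP as the very first start-up action can be illustrated with a small shell analogue. This is only a sketch of the principle, not the squid patch (squid itself is written in C++); `demo_ignore_hup_early` is a hypothetical name, and the child shell merely stands in for a daemon that is starting up.

```shell
#!/bin/sh
# Shell analogue only: the child shell plays the role of a starting daemon.
demo_ignore_hup_early() {
    sh -c '
        trap "" HUP                 # first action: ignore SIGHUP
        kill -HUP $$                # an early HUP no longer kills the process
        # ... slow start-up work (config parsing, DNS lookups) goes here ...
        trap "echo reloading" HUP   # later: install the real reload handler
        kill -HUP $$                # from now on HUP triggers a reload
        echo "still running"
    '
}
```

Without the first `trap`, the first `kill -HUP` would terminate the child shell, which mirrors what upstart reports for squid3 in this bug.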

Changed in dnsmasq (Ubuntu):
status: New → Invalid
Daniel Hahler (blueyed) wrote :

Where does the HUP signal come from?
Could "squid3 -k reconfigure" get used there instead? (if it's specific to squid3)

(I assume you meant that squid's handler is being installed too late to map the signal to "reload config" instead.)

Clint Byrum (clint-fewbar) wrote :

squid3 includes a resolvconf script to run whenever DNS servers change:

#!/bin/sh

PATH="/usr/sbin:/usr/bin:/sbin:/bin"

# Make squid aware of changes to resolv.conf
if status squid3 | grep "start/running" > /dev/null; then
 reload squid3
fi

reconfigure also sends a HUP:

            if (!strncmp(optarg, "reconfigure", strlen(optarg)))
                /** \li On reconfigure send SIGHUP. */
                opt_send_signal = SIGHUP;

And yes, that's what I mean: there's a lag between when squid might receive the HUP and when it will handle it properly. I'd rather ignore that HUP than die, since most of what it is doing at that exact time is reading its configuration and then attempting DNS lookups. I haven't looked deep enough into the code yet to see if it could handle the full reconfiguration earlier.

Daniel Hahler (blueyed) wrote :

Do you think adding a workaround like "sleep 5" above the "reload" into the resolvconf script makes sense and is good enough for a workaround in Ubuntu Precise?
Would it be possible to add a loop that detects whether the SIGHUP handler is already set up properly (and sends HUP only after that)?
Another idea might be to "restart" instead of "reload".
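
A loop of the kind suggested above could be sketched as follows. It assumes a Linux system, where the `SigCgt` field of `/proc/PID/status` is a hex mask of caught signals (signal N is bit N-1, so SIGHUP is bit 0). All function names are hypothetical, and the check is itself racy, so this is an illustration rather than a recommended fix.

```shell
#!/bin/sh
# Sketch only: helper names are hypothetical and the approach is Linux-specific.

# Succeed if the hex signal mask in $1 (the format of the SigCgt field in
# /proc/PID/status) has the SIGHUP bit set. SIGHUP is signal 1, which is
# bit 0 of the mask, so the last hex digit must be odd.
mask_has_sighup() {
    case "$1" in
        *[13579bdfBDF]) return 0 ;;
        *) return 1 ;;
    esac
}

# Succeed if process $1 currently has a handler installed for SIGHUP.
pid_catches_sighup() {
    mask=$(awk '/^SigCgt:/ { print $2 }' "/proc/$1/status" 2>/dev/null)
    [ -n "$mask" ] && mask_has_sighup "$mask"
}

# Poll up to $2 times (default 10), one second apart, until process $1
# catches SIGHUP. Returns non-zero if it never does.
wait_for_hup_handler() {
    tries=${2:-10}
    while [ "$tries" -gt 0 ]; do
        pid_catches_sighup "$1" && return 0
        sleep 1
        tries=$((tries - 1))
    done
    return 1
}
```

In the resolvconf hook, the reload could then be guarded with something like `wait_for_hup_handler "$pid" && reload squid3`, where finding the squid3 PID is left open here.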

Daniel Hahler (blueyed) wrote :

Sorry, I've only just realized that you already have a patch, so any workarounds might be moot.
Have you uploaded it to a PPA already?

Do you want to propose this patch upstream in Squid's Bugzilla?
This would make it possible to get feedback on it before including it in Ubuntu.

Clint Byrum (clint-fewbar) wrote :

I haven't had a chance to test it locally yet, other than building the package. I was hoping to simulate the problem. Hopefully will test in a few days.

TJ (tj) wrote :

I'm seeing a similar crash on a 12.04 32-bit server in similar circumstances, however the error is different. Before I open a separate bug report I thought I'd check with Marco if he sees the following reports in "/var/log/squid3/access.log.1" ?

11:12:24 | Reconfiguring Squid Cache (version 3.1.19)...
11:12:24 | FD 14 Closing HTTP Connection
11:12:24 | assertion failed: disk.cc377: "fd >= 0"

I see this in dmesg:

init: squid3 main process (3562) killed by ABRT signal

TJ (tj) wrote :

Oops, correction! the error message is in /var/log/squid3/cache.log.1.

I've found an upstream bug that matches the symptoms I see exactly so I've created a new report bug #988802 "squid3 killed by ABRT signal. assertion failed: disk.cc377: "fd >= 0" "

Wladimir Mutel (mwg) wrote :

I have the same behaviour on an x86_64 server freshly upgraded from Oneiric to Precise.
On system reboot, squid3 does not start properly: it gets killed by a HUP signal, as recorded in syslog.
Logging into the system over ssh and manually starting the squid3 service resolves the problem.
But having squid3 starting normally on bootup would be very beneficial.

I strongly suspect this misbehaviour is due to incomplete integration of the upstart, ifupdown and resolvconf packages. I use the system without NetworkManager (it's a server, after all); my interfaces are controlled by ifupdown and by ppp scripts. Probably upstart starts ifupdown too early, as usually happens when resolvconf is in universe instead of main.

Wladimir Mutel (mwg) wrote :

And btw, my DNS server on this host is bind9, not dnsmasq.

Wladimir Mutel (mwg) wrote :

And yes, I reproduced this on i386 as well as on x86_64

Daniel Hahler (blueyed) wrote :

For what it's worth, this does not happen with the squid-deb-proxy package (which uses squid with an optimized config) - at least after several boot processes.

The init/start script is different, for example, it uses
    "start on (local-filesystems and net-device-up IFACE!=lo)"
instead of
    "start on runlevel [2345]".

Excerpts from Daniel Hahler's message of Tue May 15 08:53:06 UTC 2012:
> For what it's worth, this does not happen with the squid-deb-proxy
> package (which uses squid with an optimized config) - at least after
> several boot processes.
>
> The init/start script is different, for example, it uses
> "start on (local-filesystems and net-device-up IFACE!=lo)"

This is basically wrong. A system may have more than one interface and
there is no guarantee that one of them is "the right one".

That said, this makes perfect sense, because with NM, this will delay the
start until *after* the first interface comes up, whereas with runlevel 2,
that event will actually come before the first interface is up, so that
starts the window where squid is starting up, polling for DNS entries
and failing, and then it gets HUP just as the interface comes up.

My patch to ignore HUP early on should solve this issue, but I haven't
pushed it back upstream yet.

I'm running 12.04 Precise 64-bit (upgraded from Oneiric) with squid3 (version 3.1.19), bind9 and dansguardian. It was working fine until an hour ago, when I ran apt-get update and apt-get upgrade. Now squid3 is killed by HUP (according to syslog) and I have to restart squid3 to get everything working.

Clint Byrum (clint-fewbar) wrote :

Here is the patch I have come up with, very simple, but I'm not sure of the full ramifications:

tags: added: patch
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package squid3 - 3.1.19-1ubuntu5

---------------
squid3 (3.1.19-1ubuntu5) quantal; urgency=low

  * d/squid3.upstart: Work around squid not handling SIGHUP by
    adding respawn to upstart job. (LP: #978356)
 -- Clint Byrum <email address hidden> Tue, 19 Jun 2012 15:35:19 -0700

Changed in squid3 (Ubuntu):
status: Confirmed → Fix Released
description: updated
Changed in squid3 (Ubuntu):
importance: Low → Medium
Changed in squid3 (Ubuntu Precise):
status: New → Triaged
importance: Undecided → Medium
Changed in dnsmasq (Ubuntu Precise):
status: New → Invalid
Robert Collins (lifeless) wrote :

the squid patch looks reasonable to me; please forward to the squid-dev list with a [PATCH] subject.

Hello Marco, or anyone else affected,

Accepted squid3 into precise-proposed. The package will build now and be available in a few hours. Please help us by testing this new package. See https://wiki.ubuntu.com/Testing/EnableProposed for documentation how to enable and use -proposed. Your feedback will aid us getting this update out to other Ubuntu users. If this package fixes the bug for you please change the bug tag from verification-needed to verification-done. If it does not, change the tag to verification-failed. In either case details of your testing will help us make a better decision. Further information regarding the verification process can be found at https://wiki.ubuntu.com/QATeam/PerformingSRUVerification . Thank you in advance!

Changed in squid3 (Ubuntu Precise):
status: Triaged → Fix Committed
tags: added: verification-needed

Fixed for me:

Jun 23 00:04:44 server kernel: [ 14.922376] init: squid3 main process (1148) killed by ABRT signal
Jun 23 00:04:44 server kernel: [ 14.922405] init: squid3 main process ended, respawning

tags: added: verification-done
removed: verification-needed
James Page (james-page) on 2012-06-26
Changed in squid3 (Ubuntu Precise):
milestone: none → ubuntu-12.04.1
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package squid3 - 3.1.19-1ubuntu3.12.04.1

---------------
squid3 (3.1.19-1ubuntu3.12.04.1) precise-proposed; urgency=low

  * d/squid3.upstart: Work around squid not handling SIGHUP by
    adding respawn to upstart job. (LP: #978356)
 -- Clint Byrum <email address hidden> Tue, 19 Jun 2012 16:32:10 -0700

Changed in squid3 (Ubuntu Precise):
status: Fix Committed → Fix Released
Marco Menardi (mmenaz) wrote :

Sorry, I've been too busy with other problems until today.
I've upgraded the installation that was affected (3.1.19-1ubuntu3.12.04.1 has been installed), removed the workaround in rc.local, rebooted, and it works fine now!
Thanks a lot for the fix.

Changed in squid:
importance: Unknown → Medium
status: Unknown → New
Thomas Hood (jdthood) on 2014-12-11
summary: - squid3 gets killed at startup with dnsmasq and no networkmanager
+ squid3 gets killed at startup with dnsmasq