squid FATAL: check failed: ios[0].iov_len == sizeof(data)#012 exception location: TypedMsgHdr.cc(70) sync#012

Bug #1815852 reported by Haw Loeung on 2019-02-14
26
This bug affects 3 people
Affects Status Importance Assigned to Milestone
Squid
Unknown
Unknown
squid (Ubuntu)
Undecided
Unassigned
Cosmic
High
Andreas Hasenack

Bug Description

[Impact]
Squid doesn't work if "workers" is set in squid.conf.

[Test Case]
* install squid
sudo apt update && sudo apt install squid -y

* apply workaround for #1816006
sudo mkdir -p /var/run/squid
sudo chown proxy:proxy /var/run/squid

* confirm no FATAL errors in logs so far
sudo grep FATAL /var/log/squid/cache.log
(empty)

* confirm squid working:
http_proxy=http://localhost:3128/ wget http://www.ubuntu.com -O /dev/null
echo $?
0

* add workers config
echo "workers 5" | sudo tee -a /etc/squid/squid.conf

* to speed the test up a bit, reduce the shutdown timeout
echo "shutdown_lifetime 5 seconds" | sudo tee -a /etc/squid/squid.conf

* restart squid
sudo systemctl restart squid

* verify new FATAL error in logs
sudo grep FATAL /var/log/squid/cache.log
2019/02/28 17:05:04 kid2| FATAL: check failed: ios[0].iov_len == sizeof(data)
2019/02/28 17:05:04 kid4| FATAL: check failed: ios[0].iov_len == sizeof(data)
(...)

* verify squid not working
$ http_proxy=http://localhost:3128/ wget http://www.ubuntu.com -O /dev/null
(...)
Connecting to localhost (localhost)|127.0.0.1|:3128... failed: Connection refused.
$ echo $?
4

With the updated packages, there is (new) FATAL error in the logs, and wget works just fine with the workers setting.

[Regression Potential]
The fix applies on top of another fix that was used to fix a FTBFS with gcc-8. The full gcc-8 set of fixes is quite big and some are still open, like https://github.com/squid-cache/squid/pull/270, and triggered some refactoring upstream. That being said, this particular change was committed many months ago (July 2018) and is already included in Disco in the newer upstream release we ship there.

[Other Info]
The fix applies on top of another patch from Debian. I chose to use a new patch instead of changing the existing one in order to simplify things, keep a clear dep3 header, and ease maintenance.

There is another pending SRU for squid in bug #1816006, and the test instructions above include a workaround to not hit that bug, otherwise we wouldn't even hit the condition where this bug here occurs. Both will be fixed with one upload,

[Original Description]
Hi,

It seems squid fails when configured with SMP. Steps I took to reproduce:

  * spin up a fresh cosmic instance

  * add 'workers 5' to /etc/squid/squid.conf

  * sudo service squid restart

Logging shows this:

| Feb 14 03:39:56 juju-be3f45-default-20 squid[10728]: Starting Squid Cache version 4.1 for x86_64-pc-linux-gnu...
| Feb 14 03:39:56 juju-be3f45-default-20 squid[10728]: Service Name: squid
| Feb 14 03:39:56 juju-be3f45-default-20 squid[10728]: Process ID 10728
| Feb 14 03:39:56 juju-be3f45-default-20 squid[10728]: Process Roles: worker
| Feb 14 03:39:56 juju-be3f45-default-20 squid[10728]: With 1024 file descriptors available
| Feb 14 03:39:56 juju-be3f45-default-20 squid[10728]: Initializing IP Cache...
| Feb 14 03:39:56 juju-be3f45-default-20 squid[10728]: DNS Socket created at [::], FD 5
| Feb 14 03:39:56 juju-be3f45-default-20 squid[10728]: DNS Socket created at 0.0.0.0, FD 13
| Feb 14 03:39:56 juju-be3f45-default-20 squid[10728]: Adding nameserver 127.0.0.53 from /etc/resolv.conf
| Feb 14 03:39:56 juju-be3f45-default-20 squid[10728]: Adding domain lcy02.canonistack from /etc/resolv.conf
| Feb 14 03:39:56 juju-be3f45-default-20 squid[10726]: Local cache digest enabled; rebuild/rewrite every 3600/3600 sec
| Feb 14 03:39:56 juju-be3f45-default-20 squid[10726]: Store logging disabled
| Feb 14 03:39:56 juju-be3f45-default-20 squid[10726]: Swap maxSize 0 + 262144 KB, estimated 20164 objects
| Feb 14 03:39:56 juju-be3f45-default-20 squid[10726]: Target number of buckets: 1008
| Feb 14 03:39:56 juju-be3f45-default-20 squid[10726]: Using 8192 Store buckets
| Feb 14 03:39:56 juju-be3f45-default-20 squid[10726]: Max Mem size: 262144 KB [shared]
| Feb 14 03:39:56 juju-be3f45-default-20 squid[10726]: Max Swap size: 0 KB
| Feb 14 03:39:56 juju-be3f45-default-20 squid[10726]: Using Least Load store dir selection
| Feb 14 03:39:56 juju-be3f45-default-20 squid[10726]: Set Current Directory to /var/spool/squid
| Feb 14 03:39:56 juju-be3f45-default-20 squid[10728]: Logfile: opening log daemon:/var/log/squid/access.log
| Feb 14 03:39:56 juju-be3f45-default-20 squid[10728]: Logfile Daemon: opening log /var/log/squid/access.log
| Feb 14 03:39:56 juju-be3f45-default-20 squid[10723]: Finished loading MIME types and icons.
| Feb 14 03:39:56 juju-be3f45-default-20 squid[10723]: FATAL: check failed: ios[0].iov_len == sizeof(data)#012 exception location: TypedMsgHdr.cc(70) sync#012
| Feb 14 03:39:56 juju-be3f45-default-20 squid[10657]: Squid Parent: squid-1 process 10723 exited with status 255
| Feb 14 03:39:56 juju-be3f45-default-20 squid[10657]: Squid Parent: squid-1 process 10723 will not be restarted for 3600 seconds due to repeated, frequent failures
| Feb 14 03:39:56 juju-be3f45-default-20 squid[10728]: Local cache digest enabled; rebuild/rewrite every 3600/3600 sec
| Feb 14 03:39:56 juju-be3f45-default-20 squid[10726]: Finished loading MIME types and icons.
| Feb 14 03:39:56 juju-be3f45-default-20 squid[10728]: Store logging disabled
| Feb 14 03:39:56 juju-be3f45-default-20 squid[10726]: FATAL: check failed: ios[0].iov_len == sizeof(data)#012 exception location: TypedMsgHdr.cc(70) sync#012
| Feb 14 03:39:56 juju-be3f45-default-20 squid[10728]: Swap maxSize 0 + 262144 KB, estimated 20164 objects
| Feb 14 03:39:56 juju-be3f45-default-20 squid[10728]: Target number of buckets: 1008
| Feb 14 03:39:56 juju-be3f45-default-20 squid[10728]: Using 8192 Store buckets
| Feb 14 03:39:56 juju-be3f45-default-20 squid[10728]: Max Mem size: 262144 KB [shared]
| Feb 14 03:39:56 juju-be3f45-default-20 squid[10728]: Max Swap size: 0 KB
| Feb 14 03:39:56 juju-be3f45-default-20 squid[10728]: Using Least Load store dir selection
| Feb 14 03:39:56 juju-be3f45-default-20 squid[10728]: Set Current Directory to /var/spool/squid
| Feb 14 03:39:56 juju-be3f45-default-20 squid[10657]: Squid Parent: squid-4 process 10726 exited with status 255
| Feb 14 03:39:56 juju-be3f45-default-20 squid[10657]: Squid Parent: squid-4 process 10726 will not be restarted for 3600 seconds due to repeated, frequent failures
| Feb 14 03:39:56 juju-be3f45-default-20 squid[10728]: Finished loading MIME types and icons.
| Feb 14 03:39:56 juju-be3f45-default-20 squid[10728]: FATAL: check failed: ios[0].iov_len == sizeof(data)#012 exception location: TypedMsgHdr.cc(70) sync#012
| Feb 14 03:39:56 juju-be3f45-default-20 squid[10657]: Squid Parent: squid-3 process 10728 exited with status 255
| Feb 14 03:39:56 juju-be3f45-default-20 squid[10657]: Squid Parent: squid-3 process 10728 will not be restarted for 3600 seconds due to repeated, frequent failures
| Feb 14 03:39:57 juju-be3f45-default-20 squid[10659]: storeLateRelease: released 0 objects

| ubuntu@juju-be3f45-default-20:~$ apt-cache policy squid
| squid:
| Installed: 4.1-1ubuntu3
| Candidate: 4.1-1ubuntu3
| Version table:
| *** 4.1-1ubuntu3 500
| 500 http://nova.clouds.archive.ubuntu.com/ubuntu cosmic/main amd64 Packages
| 100 /var/lib/dpkg/status

Related branches

Haw Loeung (hloeung) on 2019-02-14
description: updated
Haw Loeung (hloeung) wrote :

FYI, works fine using Disco. Same config.

| ubuntu@juju-be3f45-default-21:~$ apt-cache policy squid
| squid:
| Installed: 4.4-1ubuntu1
| Candidate: 4.4-1ubuntu1
| Version table:
| *** 4.4-1ubuntu1 500
| 500 http://nova.clouds.archive.ubuntu.com/ubuntu disco/main amd64 Packages
| 100 /var/lib/dpkg/status

Launchpad Janitor (janitor) wrote :

Status changed to 'Confirmed' because the bug affects multiple users.

Changed in squid (Ubuntu Cosmic):
status: New → Confirmed
Changed in squid (Ubuntu):
status: New → Confirmed

Issue confirmed - Works in <=Bionic, works in >=Disco - setting tasks

Changed in squid (Ubuntu):
status: Confirmed → Fix Released

Full logs of good&bad case attached

@hoelung - there isn't an obvious change in https://github.com/squid-cache/squid.git nor in http://www.squid-cache.org/Versions/v4/changesets/.
Do you think you could bisect that down to the actual change or would you rely on someone else to do that?

tags: added: server-next

FYI - other workers related bug 1816006 - that one is fixed since Disco and needs an SRU. Those two probably should be handled together.

Changed in squid (Ubuntu Cosmic):
status: Confirmed → Triaged
Haw Loeung (hloeung) wrote :

@paelzer, I'm not able to bisect that down at this stage and would rather rely on someone else to do that.

I don't think LP: #1816006 is related though since that one still applies to Disco and this one is fixed with squid 4.4 shipped in Disco.

On Tue, Feb 19, 2019 at 11:45 PM Haw Loeung <email address hidden> wrote:
>
> @paelzer, I'm not able to bisect that down at this stage and would
> rather rely on someone else to do that.

Ok, thanks for letting us know Haw.

> I don't think LP: #1816006 is related though since that one still
> applies to Disco and this one is fixed with squid 4.4 shipped in Disco.

It is related in a way that the same setup and config triggers it - so
testing and verification could as well be the same.
I agree that the root cause and fix for it is not the same, therefore
they are only related but not a duplicate.

Dean Henrichsmeyer (dean) wrote :

~ubuntu-server can handle the bisection and get this sorted.

Changed in squid (Ubuntu Cosmic):
importance: Undecided → High
Changed in squid (Ubuntu Cosmic):
assignee: nobody → Andreas Hasenack (ahasenack)
Haw Loeung (hloeung) wrote :

Per the linked bug report from upstream squid, it looks like this might be the patchset with the fix:

https://github.com/squid-cache/squid/pull/242/files

Andreas Hasenack (ahasenack) wrote :

Thanks Haw! I'll test this next

description: updated
Changed in squid (Ubuntu Cosmic):
status: Triaged → In Progress
description: updated
Andreas Hasenack (ahasenack) wrote :

Haw, here is a ppa with test packages for this bug:

sudo add-apt-repository ppa:ahasenack/squid-smp-1815852

It doesn't yet have the fix for the /var/run/squid missing directory, I'll bring that back into cosmic next.

description: updated
description: updated
description: updated
Andreas Hasenack (ahasenack) wrote :

https://launchpad.net/~ahasenack/+archive/ubuntu/squid-smp-1815852/+packages has test packages for cosmic with both fixes now (#1816006 and #1815852).

Reviewed the MP, LGTM - thanks for identifying that fix Andreas!

Hello Haw, or anyone else affected,

Accepted squid into cosmic-proposed. The package will build now and be available at https://launchpad.net/ubuntu/+source/squid/4.1-1ubuntu3.1 in a few hours, and then in the -proposed repository.

Please help us by testing this new package. See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Your feedback will aid us getting this update out to other Ubuntu users.

If this package fixes the bug for you, please add a comment to this bug, mentioning the version of the package you tested and change the tag from verification-needed-cosmic to verification-done-cosmic. If it does not fix the bug for you, please add a comment stating that, and change the tag to verification-failed-cosmic. In either case, without details of your testing we will not be able to proceed.

Further information regarding the verification process can be found at https://wiki.ubuntu.com/QATeam/PerformingSRUVerification . Thank you in advance for helping!

N.B. The updated package will be released to -updates after the bug(s) fixed by this package have been verified and the package has been in -proposed for a minimum of 7 days.

Changed in squid (Ubuntu Cosmic):
status: In Progress → Fix Committed
tags: added: verification-needed verification-needed-cosmic
Haw Loeung (hloeung) wrote :

Confirmed fixed in the -proposed package. Thanks!

What I did was what was outlined in the test case section. Also confirmed squid is listening for requests:

| 2019/03/05 22:32:25 kid5| Accepting HTTP Socket connections at local=[::]:3128 remote=[::] FD 15 flags=1
| 2019/03/05 22:32:25 kid1| Accepting HTTP Socket connections at local=[::]:3128 remote=[::] FD 15 flags=1

And of course, wget, per test case, works as well.

tags: added: verification-done verification-done-cosmic
removed: verification-needed verification-needed-cosmic
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package squid - 4.1-1ubuntu3.1

---------------
squid (4.1-1ubuntu3.1) cosmic; urgency=medium

  * d/p/smp-startup-error.patch: fix startup error in SMP mode, when workers
    is set. (LP: #1815852)
  * d/squid.tmpfile: add tmpfiles configuration to handle /var/run/squid
    at boot. Thanks to Luigi Gangitano <email address hidden> (LP: #1816006)

 -- Andreas Hasenack <email address hidden> Thu, 28 Feb 2019 13:27:11 -0300

Changed in squid (Ubuntu Cosmic):
status: Fix Committed → Fix Released

The verification of the Stable Release Update for squid has completed successfully and the package has now been released to -updates. Subsequently, the Ubuntu Stable Release Updates Team is being unsubscribed and will not receive messages about this bug report. In the event that you encounter a regression using the package from -updates please report a new bug using ubuntu-bug and tag the bug report regression-update so we can easily find any regressions.

To post a comment you must log in.
This report contains Public information  Edit
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.