Segmentation fault at large expression lists

Bug #316816 reported by Velnias on 2009-01-13
Affects: squidguard (Ubuntu)
Importance: Undecided
Assigned to: Joachim Wiedorn

Bug Description

Some information on the package concerned:

Package: squidguard
Status: install ok installed
Priority: optional
Section: web
Installed-Size: 448
Maintainer: Ubuntu MOTU Team <email address hidden>
Architecture: amd64
Version: 1.2.0-8.2ubuntu2
Depends: debconf (>= 0.5) | debconf-2.0, libc6 (>= 2.7-1), libdb4.6, liburi-perl, libwww-perl, perl, squid3 | squid
Suggests: chastity-list
Conffiles:
 /etc/squid/squidGuard.conf 1bbe05151e051355c9ef676d7f27c19d

Description: Ubuntu 8.04.2
Release: 8.04
Architecture: 64bit (AMD)

I tried to add the most recent EasyList Adblock filter set, which contains several thousand expressions.
URLs matching a regex were rewritten correctly, but a non-matching URL causes squidGuard to crash with a segmentation fault.

Attached is the filter set I used, and below is the sed script I used to transform it:

/@@.*/d;
/^!.*/d;
/^\[.*\]$/d;
s#http://##g;
s,[.?=&/|],\\&,g;
s#*#.*#g;
s,\$.*$,,g;

Brian (brian2004) wrote :

I just ran into this with SquidGuard 1.4 on another platform. I noticed that trimming the easylist expression list to exactly 1000 lines avoids the segmentation fault.

Brian (brian2004) wrote :

The sed script isn't doing enough. Try this:

/@@.*/d;
/^!.*/d;
/^\[.*\]$/d;
s#http://#^#g;
s,[.?=&/|()[],\\&,g;
s#*#.*#g;
s,\$.*$,,g;

It also escapes parentheses and the opening square bracket, and it replaces http:// with ^.

I don't know if a simple sed script is going to be enough to get the easylist working with SquidGuard.
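
For illustration, here is roughly what the revised rules do to a few made-up Adblock-style lines. This is only a sketch: the sample input is hypothetical and not taken from the attached filter set, and `to_expressions` is just a name chosen for this example.

```sh
#!/bin/sh
# Hypothetical wrapper around the revised sed rules from the comment above.
to_expressions() {
    sed -e '/@@.*/d' \
        -e '/^!.*/d' \
        -e '/^\[.*\]$/d' \
        -e 's#http://#^#g' \
        -e 's,[.?=&/|()[],\\&,g' \
        -e 's#*#.*#g' \
        -e 's,\$.*$,,g'
}

# Made-up sample lines: a comment, a section header, an exception rule,
# a plain URL filter and a filter with a $option suffix.
printf '%s\n' \
    '! comment' \
    '[Adblock Plus 1.0]' \
    '@@|http://example.com/ok' \
    'http://ads.example.com/banner?id=1' \
    '/adserver/*$script' | to_expressions
# Output:
# ^ads\.example\.com\/banner\?id\=1
# \/adserver\/.*
```

The first three sample lines are deleted outright; the last two show the escaping, the `*` to `.*` conversion and the removal of everything from `$` onward.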

Hello,

The current adblock list adds new patterns that make squidGuard segfault again.

Here is the script I use:

#!/bin/sh

temp=`mktemp`
wget --no-check-certificate -O ${temp} http://easylist.adblockplus.org/adblock_rick752.txt

mkdir -p /var/lib/squidguard/db/adblock
rm -f /var/lib/squidguard/db/adblock/expressions

sed -e '/@@.*/d' -e '/^!.*/d' -e '/^\[.*\]$/d' -e 's#http://#^#g' -e 's,[.?=&/|()[],\\&,g' -e 's#*#.*#g' -e 's,\$.*$,,g' ${temp} > /var/lib/squidguard/db/adblock/expressions
sed -i -e 's/^-/\\-/' -e 's/^\+/\\+/' /var/lib/squidguard/db/adblock/expressions

echo 'deezer\.purl\.fr\/php\/zone\/ad' >> /var/lib/squidguard/db/adblock/expressions

update-squidguard

chown -R proxy:proxy /var/lib/squidguard/db/adblock

rm -f ${temp}

The new lines are:
+adverts\/
-ad\.html
-banner-ads\/

The second sed command turns them into:
\+adverts\/
\-ad\.html
\-banner-ads\/

Regards, Adam.

bawitdaba (bawitdaba) wrote :

Hey, I just wanted to say thanks. This helped a lot with running squidGuard 1.5-static, but I found one new problem after half an hour of constant segfaulting.

In the latest easylist there are two lines containing \9. For whatever reason this caused squidGuard to segfault, and I had to remove those two lines for it to work.

To the person above who suggested splitting the list or using only the first 1000 lines: that may work for you, but only because there is a problem in the file you're using. If you find the problem and fix it, squidGuard can handle all 14k+ regex filters just fine.

- Nick
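
One way to track down an offending line is to bisect the expressions file. The sketch below is only an illustration under some assumptions: `find_bad_line` is a hypothetical helper, exactly one line is assumed to be responsible, and the check command (whatever you use to decide that a candidate file does not crash squidGuard) has to be supplied by you. It is called with the candidate file as its last argument and must exit 0 when that file is fine.

```sh
#!/bin/sh
# Hypothetical helper: find_bad_line FILE CHECK_CMD [ARGS...]
# Prints the number of the single line in FILE that makes CHECK_CMD fail.
# Assumes exactly one such line; if there is none, the last line number
# is printed, so confirm the result by hand.
find_bad_line() {
    file=$1
    shift
    lo=1
    hi=$(wc -l < "$file")
    tmp=$(mktemp)
    while [ "$lo" -lt "$hi" ]; do
        mid=$(( (lo + hi) / 2 ))
        # Test only the lower half of the current range.
        sed -n "${lo},${mid}p" "$file" > "$tmp"
        if "$@" "$tmp"; then
            lo=$((mid + 1))   # lower half is fine, the culprit is above it
        else
            hi=$mid           # the culprit is inside the lower half
        fi
    done
    rm -f "$tmp"
    echo "$lo"
}
```

Usage would look like `find_bad_line expressions my_check`, where `my_check` is your own script or shell function that tests the candidate file.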

James T. Kirk (vanquish2) wrote :

Nick (bawitdaba),

I think I'm running into the same issue as you with the \9 lines. Could you provide a little more information on how you found them and removed them from the expressions list?

Much appreciated,

~ James

James,

\9 is perhaps interpreted as a back-reference.

Try adding one of these to your sed script:

A crude removal of the lines:
/.*\\9.*/d;

Or just escape them (a little more future-proof as far as back-references are concerned):
s/\\\([0-9]\)/\\\\\1/g;

Should do the trick.

Happy filtering.

Cheers.
- Fabrice.
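
To see which lines are affected before changing anything, something along these lines may help. It is a sketch only: `list_backrefs` and `escape_backrefs` are names made up for this example, and the sample lines in the usage note are hypothetical. The first helper prints every line containing a backslash followed by a digit, with its line number; the second applies the escaping substitution suggested above.

```sh
#!/bin/sh
# Hypothetical helpers for locating and escaping \<digit> sequences.

# Print "LINE_NUMBER:LINE" for every line containing a backslash
# directly followed by a digit (a possible back-reference).
list_backrefs() {
    grep -n '\\[0-9]' "$1"
}

# Read expressions on stdin and double the backslash in front of any
# digit, so that \9 becomes \\9.
escape_backrefs() {
    sed -e 's/\\\([0-9]\)/\\\\\1/g'
}
```

For example, `list_backrefs /var/lib/squidguard/db/adblock/expressions` lists the candidates, and `escape_backrefs < expressions > expressions.new` writes an escaped copy. On a made-up line such as `ad\9banner`, the second helper yields `ad\\9banner`.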

Joachim Wiedorn (ad-debian) wrote :

Problem is user specific - resolved.

Changed in squidguard (Ubuntu):
assignee: nobody → Joachim Wiedorn (ad-debian)
status: New → Confirmed