Or rather, it's not entirely Apache. Let me expand.
We see a 404 on the files that choose to break. The 404 has bad content-encoding headers, and we should start setting those headers in the librarian, not in Apache. I'll file a separate bug on that.
The failing URLs all have %-escaped characters, and the tokens being stored keep the escaped form.
A stored path that fails:
/35262589/buildlog_ubuntu-karmic-i386.openproj_1.4-2%2Bpx1_FAILEDTOBUILD.txt.gz
Original URL:
https://launchpad.net/~soyuz-team/+archive/ppa/+build/1334368/+files/buildlog_ubuntu-karmic-i386.openproj_1.4-2%2Bpx1_FAILEDTOBUILD.txt.gz
The librarian log shows:
GET /35262589/buildlog_ubuntu-karmic-i386.openproj_1.4-2+px1_FAILEDTOBUILD.txt.gz?token=
(note the %2B is gone; + is present instead)
But the DB table has:
19:42 < spm> /35262589/buildlog_ubuntu-karmic-i386.openproj_1.4-2%2Bpx1_FAILEDTOBUILD.txt.gz | xxxxxxxxxxxxxxx xxxxxxxxxxx | 2010-11-26 06:26:33.243417
So the lookup fails, and *boom*. We shouldn't decode the URL and query, because that's a liability given the various escaping tricks that folk can play.
We've checked whether direct queries to the librarian with the right host, token and the %2B URLs work, and they do. So it's Squid or Apache breaking things.
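
To illustrate the mismatch between the logged path and the stored path, here is a minimal Python sketch. It is not the librarian's actual code: the STORED_PATHS dict is a hypothetical stand-in for the DB table, and lookup() is an assumed exact-match helper. It only shows that once something in front of the librarian percent-decodes the path, an exact string lookup against the escaped stored path can no longer match.

```python
from urllib.parse import unquote

# Hypothetical stand-in for the token table: keys are stored paths,
# kept in their escaped form (as in the DB row quoted above).
STORED_PATHS = {
    "/35262589/buildlog_ubuntu-karmic-i386.openproj_1.4-2%2Bpx1_FAILEDTOBUILD.txt.gz": "token-row",
}


def lookup(path):
    """Assumed exact string match on the path, with no decoding."""
    return STORED_PATHS.get(path)


# The path as the client requested it (escaped form).
escaped = "/35262589/buildlog_ubuntu-karmic-i386.openproj_1.4-2%2Bpx1_FAILEDTOBUILD.txt.gz"

# The path as it would arrive if an intermediary percent-decoded it,
# which matches what the librarian log shows (%2B replaced by +).
decoded = unquote(escaped)

print(lookup(escaped))  # finds the row
print(lookup(decoded))  # finds nothing, hence the 404
```

The sketch deliberately leaves the lookup as an exact match rather than decoding both sides, in line with the point above that decoding the URL and query in the librarian is a liability.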