restricted librarian urls give a 404 if normalised (e.g. by apache, chromium, often shows up on private PPA build logs)

Bug #677270 reported by Alex Chiang on 2010-11-19
This bug affects 17 people
Affects: Launchpad itself
Importance: High
Assigned to: William Grant

Bug Description

The restricted librarian generates URLs in non-canonical form, which canonicalising clients and intermediaries may then rewrite. Once a restricted librarian URL has been rewritten, the token no longer matches and a 404 (file not found) is returned to the client.

Apache without the nocanon config option will canonicalise, and some browsers like Chrome are known to canonicalise too.

Fairly simple file names - 'foo+bar.txt' - will show this problem.
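
To illustrate the failure mode, here is a minimal sketch in Python. The token table, the `/12345/` prefix, the token value and the `normalise` helper are all hypothetical stand-ins; this is not Launchpad's code, only a model of an exact-match lookup being defeated by a rewritten path.

```python
from urllib.parse import quote, unquote

# Hypothetical token table, keyed on the exact path string that was issued.
issued_path = "/12345/" + quote("foo+bar.txt", safe="")  # '+' is escaped as %2B
tokens = {issued_path: "example-token"}


def normalise(path: str) -> str:
    """Stand-in for a canonicalising client or proxy (assumption):
    it decodes an escape that the path does not strictly need."""
    return path.replace("%2B", "+").replace("%2b", "+")


requested_path = normalise(issued_path)

# Both strings decode to the same file name...
same_file = unquote(issued_path) == unquote(requested_path)
# ...but an exact-match lookup only finds the path as issued.
found_as_issued = issued_path in tokens
found_after_rewrite = requested_path in tokens

print(issued_path, found_as_issued)
print(requested_path, found_after_rewrite)
```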

Workarounds
===========

Use Firefox, or run Apache with nocanon on the ProxyPass rules. We are currently doing the latter in the Canonical datacentre.

Proposed solutions
==================

* Change the URL generation in Launchpad to emit canonicalised URLs; canonicalisation will then not change the URL and things will Just Work.
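
One way the proposal above could look, sketched in Python. This is an assumption about what "canonicalised" means here, not the fix that was actually committed: it escapes only characters that RFC 3986 does not allow literally in a path segment, so '+' stays unescaped and a decode-then-re-encode pass has nothing to change. `PATH_SEGMENT_SAFE`, `canonical_segment` and `renormalise` are hypothetical names.

```python
from urllib.parse import quote, unquote

# Assumption: characters RFC 3986 permits unescaped in a path segment
# (the sub-delims plus ':' and '@'). quote() already leaves letters,
# digits and "_.-~" alone.
PATH_SEGMENT_SAFE = "!$&'()*+,;=:@"


def canonical_segment(filename: str) -> str:
    """Percent-encode only what a path segment cannot carry literally."""
    return quote(filename, safe=PATH_SEGMENT_SAFE)


def renormalise(segment: str) -> str:
    """Rough model of a canonicalising intermediary: decode, then re-encode."""
    return canonical_segment(unquote(segment))


name = "buildlog_ubuntu-karmic-i386.openproj_1.4-2+px1_FAILEDTOBUILD.txt.gz"
issued = canonical_segment(name)
print(issued)
print(renormalise(issued) == issued)
```

Under these assumptions the issued segment is a fixed point of the normaliser, so the stored path and the requested path stay byte-for-byte identical.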

Robert Collins (lifeless) wrote :

I see a 404 with your url (note that the i12345 urls are time limited, and will give anyone that can copy them access to the content until the time expires; that is 24 hours at the moment).

Try putting the url into wget and see if it works any better; I suspect that the librarian is sending wonky content-encoding headers for some reason.

affects: launchpad → launchpad-foundations
Changed in launchpad-foundations:
importance: Undecided → High
William Grant (wgrant) wrote :

It's a 404 page labelled as gzip-encoded, when it's in fact not compressed at all.
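
A small Python sketch of why a client chokes on such a response. The body and headers are made up for illustration, and `decode_body` is a hypothetical helper modelling what a browser does with `Content-Encoding: gzip`.

```python
import gzip

# Made-up response: a plain, uncompressed 404 page that is
# nevertheless labelled as gzip-encoded.
body = b"<html><body>404 Not Found</body></html>"
headers = {"Content-Encoding": "gzip"}


def decode_body(headers: dict, body: bytes) -> bytes:
    """Decode the body the way a client honouring Content-Encoding would."""
    if headers.get("Content-Encoding") == "gzip":
        return gzip.decompress(body)
    return body


try:
    decode_body(headers, body)
    decoded_ok = True
except OSError:
    # gzip.BadGzipFile is a subclass of OSError; the body has no gzip magic bytes.
    decoded_ok = False

print(decoded_ok)
```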

Curtis Hovey (sinzui) on 2010-11-19
Changed in launchpad-foundations:
status: New → Triaged
Rick McBride (rmcbride) wrote :

I'm getting the same behavior. Chrome reports:
Error 330 (net::ERR_CONTENT_DECODING_FAILED): Unknown error.

wget returns a 404

Francis J. Lacoste (flacoste) wrote :

We turned the public restricted librarian feature off for now. This is an apache config level problem; I'll file an RT to get it sorted out before we re-enable the feature.

Changed in launchpad-foundations:
status: Triaged → In Progress
assignee: nobody → Canonical LOSAs (canonical-losas)

Thanks for analyzing this. I'm a little surprised that it's
apache... what's causing the issue?

RT #42560 tracks the LOSA side of things.

Robert Collins (lifeless) wrote :

It seems to work on production atm in testing with spm.

Robert Collins (lifeless) wrote :

This isn't apache.

Robert Collins (lifeless) wrote :

Or rather, it's not entirely apache. Let me expand.

We see a 404 on the files that choose to break. The 404 has bad content-encoding headers, and we should start doing those headers in the librarian not in apache. I'll file a separate bug on that.

The failing URLs all have %-escaped characters. The tokens being stored have a stored path that fails to match:
/35262589/buildlog_ubuntu-karmic-i386.openproj_1.4-2%2Bpx1_FAILEDTOBUILD.txt.gz
original url
https://launchpad.net/~soyuz-team/+archive/ppa/+build/1334368/+files/buildlog_ubuntu-karmic-i386.openproj_1.4-2%2Bpx1_FAILEDTOBUILD.txt.gz

the librarian log shows
GET /35262589/buildlog_ubuntu-karmic-i386.openproj_1.4-2+px1_FAILEDTOBUILD.txt.gz?token=
(note the %2B is gone, + is present instead)

But the DB table has:

19:42 < spm> /35262589/buildlog_ubuntu-karmic-i386.openproj_1.4-2%2Bpx1_FAILEDTOBUILD.txt.gz | xxxxxxxxxxxxxxxxxxxxxxxxxx | 2010-11-26 06:26:33.243417

so the lookup fails, and *boom*. We shouldn't decode the url and query, because that's a liability with various escaping tricks that folk can play.

We've checked whether direct queries to the librarian with the right host, token and the %2B urls work, and they do. So it's squid or apache breaking things.

summary: - restricted librarian broken, content decoding error
+ apache/squid breaks restricted librarian on urls with percent encoded
+ characters.
description: updated

I need to leave this - https://issues.apache.org/bugzilla/show_bug.cgi?id=32328#c12 is very relevant.

It seems like apache is known to be broken here, and there have been multiple attempts to fix it, but it's fundamentally not designed as a proxy.

summary: - apache/squid breaks restricted librarian on urls with percent encoded
+ apache breaks restricted librarian on urls with percent encoded
characters.
description: updated

elmo suggests that 'nocanon' in the proxypass rule will do what we need.

summary: - apache breaks restricted librarian on urls with percent encoded
- characters.
+ restricted librarian urls give a 404 if normalised (e.g. by apache,
+ chromium, often shows up on private PPA build logs)
Changed in launchpad:
assignee: Canonical LOSAs (canonical-losas) → nobody
status: In Progress → Triaged
description: updated
Andreas Hasenack (ahasenack) wrote :

Any fix in sight? This is marked "High".

Robert Collins (lifeless) wrote :

We're currently burning down a backlog of criticals; this is unlikely to get any attention before then. If you would like to submit a patch for this we'd be happy to find you a mentor for it.

Robert Collins (lifeless) wrote :

(Note that this really reflects a bug in chromium, but one we can work around, and be more in line with the RFC at the same time.)

tags: added: easy
Andreas Hasenack (ahasenack) wrote :

Do you have a ticket for the chromium bug? They do make new releases frequently and we could perhaps get someone's attention over there.

On Thu, Jan 26, 2012 at 2:35 AM, Andreas Hasenack <email address hidden> wrote:
> Do you have a ticket for the chromium bug? They do make new releases
> frequently and we could perhaps get someone's attention over there.

AIUI it is deliberate on chromium's part, so we haven't opened a
ticket. As noted we can work around it here by issuing canonicalised
urls, though that shouldn't be needed.

Would you like to open the ticket?

-Rob

Andreas Hasenack (ahasenack) wrote :

Sure, can you summarize the problem and say what chrome/chromium is doing differently from firefox? I'll be happy to open a ticket upstream and reference this bug even.

Launchpad QA Bot (lpqabot) wrote :
Changed in launchpad:
assignee: nobody → William Grant (wgrant)
tags: added: qa-needstesting
Changed in launchpad:
status: Triaged → Fix Committed
William Grant (wgrant) on 2015-12-11
tags: added: qa-ok
removed: qa-needstesting
Colin Watson (cjwatson) on 2016-01-08
Changed in launchpad:
status: Fix Committed → Fix Released