mod_deflate with mod_fastcgi gives wrong content-length header

Bug #381384 reported by Martin von Gagern on 2009-05-28
This bug affects 6 people
Affects: libapache-mod-fastcgi (Debian)
status: Fix Released
importance: Unknown
Affects: libapache-mod-fastcgi (Ubuntu)
importance: Medium
assigned to: Unassigned

Bug Description

Binary package hint: libapache-mod-fastcgi

When compressing the content generated by mod_fastcgi, mod_deflate will keep any Content-Length header created by the script, instead of adjusting the header to match the compressed length.

References:
http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=509116
http://article.gmane.org/gmane.comp.web.fastcgi.devel/1167

It seems that in surprisingly many cases this problem will go unnoticed by end users. Using Trac via FastCGI, however, I have encountered this issue repeatedly, and could even capture one instance using Wireshark.
What I did was this:
1. Log into Trac account
2. Re-visit /trac/login
3. Get surprised by a Firefox dialog asking me to save the binary file "trac".

The traffic on the wire, starting at point 2 and reduced to the relevant parts:

> GET /trac/login/ HTTP/1.1
> Keep-Alive: 300
> Connection: keep-alive

< HTTP/1.1 302 Found
< Content-Length: 0
< Location: http://host.name/trac
< Content-Encoding: gzip
< Keep-Alive: timeout=15, max=100
< Connection: Keep-Alive
<
< [gzipped data, 0 bytes uncompressed, 26 bytes compressed]

> GET /trac HTTP/1.1
> Keep-Alive: 300
> Connection: keep-alive

< HTTP/1.1 301 Moved Permanently
< Content-Encoding: gzip
< Content-Length: 239
< Keep-Alive: timeout=15, max=99
< Connection: Keep-Alive
<
< [gzipped data, 297 bytes uncompressed, 239 bytes compressed]

What happens is this: Firefox receives the first response and reads 0 bytes of content. Either because it finds no gzip header in those 0 bytes, or because it gives Content-Length precedence over Content-Encoding in at least this case of a broken HTTP response, it decides the response is complete. The first 16 bytes of the gzipped empty content, which were part of the same TCP frame as the headers, seem to be discarded. However, the remaining 10 bytes are still on the wire, in their own TCP frame.

Next, Firefox sends a request for the next location. In this case, it matches a Redirect directive in my Apache configuration, so the next reply is generated by Apache itself, not through mod_fastcgi. The response is all right, but unfortunately, when reading it, Firefox first receives those 10 bytes left over from the previous response. Seeing binary data instead of headers, it decides this must be a bad server sending binary content without headers, and offers to save this assumed binary file. When looking at the file, it's those 10 bytes, followed by the headers and compressed content of the next reply.
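The sequence described above can be reproduced with a small model. This is only a sketch: `read_response` is a hypothetical helper, not how Firefox is implemented, the stray bytes are a stand-in for the leftover gzip data, and TCP framing is ignored, so all stray bytes stay on the stream rather than just the last 10.

```python
def read_response(stream: bytes):
    """Split one HTTP response off the stream, trusting Content-Length."""
    head, sep, rest = stream.partition(b"\r\n\r\n")
    if not sep or not head.startswith(b"HTTP/"):
        # No status line where one is expected: the client sees raw binary data.
        return None, stream
    length = 0
    for line in head.split(b"\r\n")[1:]:
        name, _, value = line.partition(b":")
        if name.strip().lower() == b"content-length":
            length = int(value.strip().decode("ascii"))
    # Everything past Content-Length bytes is left for the next response.
    return (head, rest[:length]), rest[length:]


# Stand-in for gzip bytes that follow a response claiming Content-Length: 0.
stray = b"\x1f\x8b" + bytes(8)
first = (b"HTTP/1.1 302 Found\r\n"
         b"Content-Length: 0\r\n"
         b"Content-Encoding: gzip\r\n\r\n") + stray
second = b"HTTP/1.1 301 Moved Permanently\r\nContent-Length: 3\r\n\r\nabc"

resp1, leftover = read_response(first + second)
resp2, _ = read_response(leftover)

print(resp1[1])   # body of the first response as the client sees it
print(resp2)      # None: the second response no longer starts with a status line
```

In this model the first response parses with an empty body, and the second parse fails because the stream now begins with the stray bytes instead of "HTTP/", which corresponds to the point where the browser offers to save a binary file.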

I had a look at some non-empty responses generated by Trac, mod_fastcgi and mod_deflate, and they, too, have Content-Length headers that match the uncompressed content. In those cases, however, it seems Firefox uses the gzip end-of-data marker instead of relying on the Content-Length header. So the HTTP protocol is still violated, but the browser is lenient enough that users won't notice.
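To illustrate why a Content-Length computed before compression cannot be reused, and how a client can find the end of a gzip body without it, here is a minimal sketch using Python's standard gzip and zlib modules. The sample body and helper names are made up for illustration, and the compressed sizes produced by Python need not match those produced by mod_deflate in the capture above.

```python
import gzip
import zlib


def lengths(body: bytes):
    """Return (uncompressed length, gzip-compressed length) for a body."""
    return len(body), len(gzip.compress(body))


# An empty body still produces a non-empty gzip stream (header and trailer).
empty_plain, empty_gz = lengths(b"")

# Hypothetical non-empty page body.
page = b"<p>moved</p>" * 30
page_plain, page_gz = lengths(page)


def gunzip_ignoring_length(data: bytes):
    """Decompress until the gzip end-of-stream marker, ignoring any length header."""
    d = zlib.decompressobj(16 + zlib.MAX_WBITS)  # 16 + MAX_WBITS selects gzip framing
    body = d.decompress(data)
    return body, d.eof, d.unused_data


# b"NEXT" stands in for whatever follows the gzip stream on the connection.
body, finished, extra = gunzip_ignoring_length(gzip.compress(page) + b"NEXT")

print(empty_plain, empty_gz)
print(page_plain, page_gz)
print(finished, extra)
```

A client that decodes this way recovers the full body and knows exactly where the next response begins, which would explain why the wrong header goes unnoticed for non-empty responses.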

http://article.gmane.org/gmane.comp.web.fastcgi.devel/1167 gives a good indication of what's causing this problem. It seems that mod_fastcgi copies all headers sent by a script to err_headers_out, with the exception of a few well-known ones like "Location", which it copies to headers_out instead. There is even a patch to do the same special-case handling for Content-Length, but the author argues that other headers might require similar behaviour.

The Debian bug does indicate libapache2-mod-fcgid as an alternative which doesn't exhibit this behaviour. It is not an easy drop-in replacement, though: it doesn't accept extra path components after the file name in a ScriptAlias, and the Apache configuration has to be changed because of a different module file name for IfModule checks and different configuration directives. So simply switching is not always a solution.

Changed in libapache-mod-fastcgi (Debian):
status: Unknown → New
Martin von Gagern (gagern) wrote :

http://thread.gmane.org/gmane.comp.web.fastcgi.devel/2613 indicates another approach towards a solution, and it seems that a variation of it has been committed to some mod_fastcgi repository (though not the git at http://repo.or.cz/w/mod_fastcgi.git). It's also available in a snapshot dated 0811090952, but hasn't been officially released yet. The chunk in question is this one:

                 continue;
             }
+
+            if (strcasecmp(name, "Content-Length") == 0) {
+                ap_table_set(r->headers_out, name, value);
+                continue;
+            }

             /* If the script wants them merged, it can do it */
             ap_table_add(r->err_headers_out, name, value);
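
The effect of this hunk can be modelled outside Apache. The following is a rough sketch, not the module's code: plain dicts stand in for the headers_out and err_headers_out tables, `route_headers` and `deflate_filter` are hypothetical names, and it assumes that the compressing filter removes Content-Length from headers_out only and never looks at err_headers_out.

```python
def route_headers(script_headers, patched):
    """Model how script headers are split between the two header tables."""
    headers_out, err_headers_out = {}, {}
    for name, value in script_headers:
        if name.lower() == "location":
            headers_out[name] = value      # existing special case
            continue
        if patched and name.lower() == "content-length":
            headers_out[name] = value      # special case added by the hunk
            continue
        # Models ap_table_add; duplicate headers are not modelled in this sketch.
        err_headers_out[name] = value
    return headers_out, err_headers_out


def deflate_filter(headers_out, err_headers_out):
    """Assumed behaviour: drop Content-Length from headers_out only, then merge."""
    cleaned = {k: v for k, v in headers_out.items()
               if k.lower() != "content-length"}
    cleaned["Content-Encoding"] = "gzip"
    return {**cleaned, **err_headers_out}


# Hypothetical headers as a script might send them.
script = [("Content-Type", "text/html"),
          ("Content-Length", "297"),
          ("Location", "http://host.name/trac")]

unpatched = deflate_filter(*route_headers(script, patched=False))
patched = deflate_filter(*route_headers(script, patched=True))

print("Content-Length" in unpatched)  # stale header survives compression
print("Content-Length" in patched)    # header was visible to the filter and removed
```

Under that assumption, the stale Content-Length survives only when it was stored in err_headers_out, which is the situation the hunk avoids.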

Daniel Hahler (blueyed) wrote :

Thanks for your investigations and providing a (possible) patch; I'm triaging this bug accordingly.

Changed in libapache-mod-fastcgi (Ubuntu):
importance: Undecided → Medium
status: New → Triaged
tags: added: patch
Martin von Gagern (gagern) wrote :

For those interested in mod_fastcgi development in general, or in the greater context of the upstream fix, I'm attaching a diff between the 2.4.6 release and the SNAP-0811090952 snapshot source tree. I've written to the fastcgi-developers mailing list, pointing at this issue here and asking when a release containing this fix can be expected. As I'm not subscribed to the list, my post is still awaiting approval.

Comparison of the possible fixes:

http://www.yl.is.s.u-tokyo.ac.jp/~oiwa/pub/unix/fastcgi-content-length.patch
Uses headers_out, errors on duplicate Content-Length headers.

http://article.gmane.org/gmane.comp.web.fastcgi.devel/2613
Uses ap_set_content_length instead of headers_out.

SNAP-0811090952 according to attached patch
Uses headers_out, no check for duplicate Content-Length headers.

The snapshot also removes duplicate header checks in other cases, so this is consistent.
For the Content-Type header there is a case distinction, and it uses ap_set_content_type for APACHE2. ap_set_content_length sounds like a counterpart to this, but I guess upstream had reasons to change the patch from the one sent to them, and Content-Type was somewhat different in Apache 1 builds as well, as it had a dedicated field instead of the headers_out table.

On the whole, I'd go with the upstream changes.

Martin von Gagern (gagern) wrote :

This is the hunk I already pasted above, plus its credit, this time as a proper patch attachment so we won't have to worry about whitespace or similar issues.

Changed in libapache-mod-fastcgi (Debian):
status: New → Fix Released