as a reverse proxy, a 100 continue response is sent prematurely when a request contains expects: 100-continue

Bug #1641238 reported by Jay R. Wren on 2016-11-11
12
This bug affects 1 person
Affects Status Importance Assigned to Milestone
Apache2 Web Server
Confirmed
Critical
apache2 (Ubuntu)
Medium
Unassigned

Bug Description

This effects trusty, xenial and current httpd trunk.

https://bz.apache.org/bugzilla/show_bug.cgi?id=60330

As a reverse proxy, a 100 continue response is sent prematurely when a request contains expects: 100-continue. This causes the requesting client to send a body. The apache httpd proxy will then read the body and attempt to send it to the backend, but the backend already sent an error and should be allowed to NOT read the remaining request body, which never should have existed. When the backend does not read the request body mod_proxy_pass errors and returns a 500 error to the client. The client never receives the correct error message.

Reverse proxy of 100-continue aware backend, sends 100 continue even when backend does not. This causes a client to think it should write a request body, while the backend may still respond with a 400 and not read the request body. mod_proxy_http then responds with 502 as a result of AH01097: pass request body failed

The backend is doing the right thing: it did not send a 100 continue so it should not be required to read a request body, regardless of transfer encoding or content-length.

Expected:

mod_proxy_http reverse proxy should not send 100-continue to a client unless the backend does.

Created attachment 34438
in the reverse proxy case, if request had Expects: 100-continue, delay writing 100 continue response until backend has sent 100-continue response

Created attachment 34451
Forward 100-continue (and minimize race when reusing backend connections)

I proposed this patch a while ago on the dev@ list ([1]), this is an update for latest trunk, with more (though incomplete) testing.

Could you please give it a try?

[1]. https://lists.apache.org/thread.html/4e541e032b8a77ebec8248534637b47cdcd4f38af79baa5259845db0@1430360070@%3Cdev.httpd.apache.org%3E

*** Bug 55433 has been marked as a duplicate of this bug. ***

Robie Basak (racb) wrote :

Thank you for taking the time to report this bug and helping to make Ubuntu better.

I think we should wait until upstream commits a fix for this before we do anything in Ubuntu. Or is there a specific need that requires this to fixed sooner in Ubuntu? Under what circumstances are Ubuntu users impacted by this problem?

Jay R. Wren (evarlast) wrote :

Hi Robie,

The specific need is that www.jujucharms.com uses apache2 as a reverse proxy to charmstore. The upload of resources and charms is suboptimal without 100-continue support. While investigating adding 100-continue support, we ran into this bug. As for a general need for this to be fixed in Ubuntu, I know of none, other than correctly supporting 100-continue in http reverse proxies.

Only Ubuntu users using mod_proxy_http to reverse proxy a service which implements 100-continue are impacted by this problem. There must not be any. I'm honestly surprised that this bug has existed for so long.

I am working with upstream to get this or a similar patch applied. I'll continue doing so.

Thanks,
--
Jay

Yann,

I tried that patch, but I still get 503 error when expecting a 100 Continue response.

Any chance that this will be fixed? Have the very same problem from a backend Tomcat. I guess I need to drop mod_proxy and try mod_ajp or drop Apache HTTPd altogether for this.

*** Bug 57853 has been marked as a duplicate of this bug. ***

Sorry it's been a long time, I think we need more informations here as to the exact issue.

What exactly isn't working with the proposed patch?
Where are 100-continue or request bodies lacking or sent inappropriately, on which side (client/backend)?
IOW, can we please have a description/scenario of what is supposed to work and how, possibly with the expected request/response on both sides?
What is the configuration being tested?

Changed in apache2:
importance: Unknown → Medium
status: Unknown → Confirmed

(In reply to Yann Ylavic from comment #7)
> Sorry it's been a long time, I think we need more informations here as to
> the exact issue.
>
> What exactly isn't working with the proposed patch?
> Where are 100-continue or request bodies lacking or sent inappropriately, on
> which side (client/backend)?
> IOW, can we please have a description/scenario of what is supposed to work
> and how, possibly with the expected request/response on both sides?
> What is the configuration being tested?

Hi Yann,

I can provide a full verbose log of curl(1) for Tomcat behind HTTPd with faulty behavior and direct Tomcat communication. Moreover, I can expore the httpd.conf for that offending behavior.

Yes please do, along with the httpd error_log with LogLevel trace7.
Thanks!

Created attachment 36015
curl(1) to Tomcat directly

Created attachment 36016
curl(1) to Tomcat via HTTPd

The error log has been sent privately due to sensitive data. Looking forward to an analysis.

FYI, I have tried mod_proxy_{http,ajp} and mod_jk.

Thanks Michael, at first glance the error_log is with mod_proxy_ajp, while attachment 34451 is about mod_proxy_http (and I'd like to keep the scope there for now).
I agree that unpatched mod_proxy_http sends "100 continue" too soon (actually independently on the client and backend side).
The patch is precisely to avoid that (hop by hop 100-continue handling), did you give it a try? If yes, could I have the error_log with mod_proxy_http?

(In reply to Yann Ylavic from comment #14)
> Thanks Michael, at first glance the error_log is with mod_proxy_ajp, while
> attachment 34451 [details] is about mod_proxy_http (and I'd like to keep the
> scope there for now).
> I agree that unpatched mod_proxy_http sends "100 continue" too soon
> (actually independently on the client and backend side).
> The patch is precisely to avoid that (hop by hop 100-continue handling), did
> you give it a try? If yes, could I have the error_log with mod_proxy_http?

I agree, I have tried all possible modules with the same negative result. I will redo for you. Moreover, I will compile from trunk along with your patch and try to reproduce. Does it still apply cleanly to trunk?

Please be patient, I won't be able to test anything before 2018-07-23.

Thank you.

Created attachment 36046
web.xml (to put in webapps/ROOT/WEB-INF/)

Created attachment 36047
jsp to demonstrate the issue (put it in webapps/ROOT)

when using the web.xml and red.jsp and having
<tomcat-users xmlns="http://tomcat.apache.org/xml"
              xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
              xsi:schemaLocation="http://tomcat.apache.org/xml tomcat-users.xsd"
              version="1.0">
  <role rolename="manager"/>
  <user username="tomcat" password="tomcat" roles="manager"/>
</tomcat-users>
in conf/tomcat-users.xml in tomcat you can get the problem with basic authentication:
curl --verbose --anyauth -u tomcat:tomcat http://localhost:8000/read.jsp -X POST -T file -H "Content-Type: text/plain"

file needs to be big enough ~8.5M

you will get:
[jfclere@dhcp-144-173 build]$ curl --verbose --anyauth -u tomcat:tomcat http://localhost:8000/read.jsp -X POST -T toto -H "Content-Type: text/plain"
* Trying ::1...
* TCP_NODELAY set
* Connected to localhost (::1) port 8000 (#0)
> POST /read.jsp HTTP/1.1
> Host: localhost:8000
> User-Agent: curl/7.59.0
> Accept: */*
> Content-Type: text/plain
> Content-Length: 8849904
> Expect: 100-continue
>
< HTTP/1.1 100 Continue
* We are completely uploaded and fine
< HTTP/1.1 504 Gateway Timeout
< Date: Mon, 23 Jul 2018 20:38:21 GMT
< Server: Apache/2.5.1-dev (Unix) OpenSSL/1.1.0h-fips
< Content-Length: 247
< Content-Type: text/html; charset=iso-8859-1
<
<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN">
<html><head>
<title>504 Gateway Timeout</title>
</head><body>
<h1>Gateway Timeout</h1>
<p>The gateway did not receive a timely response
from the upstream server or application.</p>
</body></html>
* Connection #0 to host localhost left intact

Created attachment 36048
Patch for trunkthat works for me (based on Yann one)

Thanks Jean-Frédéric, applied in r1836588.

This is the same patch (plus a req != NULL check) rebased on latest trunk, right?

Commit on trunk does not work for me. I will send logs privately to Yann.

Thanks for testing Michael.

Fixed in r1836648 (hopefully).

Changed in apache2 (Ubuntu):
status: New → Triaged
importance: Undecided → Medium

Fixed according to my tests.

Can we please backport this to 2.4.x branch? It is not present in 2.4.35 :-(

Folks, can we please backport to 2.4.38 finally?

This also blocks CONNECTORS-1564.

Changed in apache2:
importance: Medium → Critical

Proposed for backport to 2.4.x (r1853409), let's see what other committers think, notably if it's too much of a change for the stable branch. If so, this will have to wait for next 2.6/3.0...

Also, testing the tomcat CONNECTORS-1564 case with the final patch ([1]) would be very welcome ;)

[1] http://people.apache.org/~ylavic/patches/httpd-2.4.x-forward_100_continue.patch

(In reply to Yann Ylavic from comment #28)
> tomcat CONNECTORS-1564

Looks like it's Solr actually, but all tests are welcome.

(In reply to Yann Ylavic from comment #27)
> Proposed for backport to 2.4.x (r1853409), let's see what other committers
> think, notably if it's too much of a change for the stable branch. If so,
> this will have to wait for next 2.6/3.0...

Yann, I will happily try this patch against 2.4.x next week. It would be quite disappointing if it won't make into 2.4.x because I this isn't an improvement, but a serious bug.

Download full text (12.6 KiB)

Dropped patch v2 to the files directory of the FreeBSD ports, patch applied cleanly, reinstalled 2.4.38 and ran my test:

> $ curl --verbose -X POST --upload $HOME/sitex-document.3305025974265279540.zip https://blnn719x.ad001.siemens.net/ld-docgen/rest/documents --negotiate -u : -H 'Content-Type: application/zip'
> * Expire in 0 ms for 6 (transfer 0x28845000)
> * Expire in 1 ms for 1 (transfer 0x28845000)
> * Expire in 0 ms for 1 (transfer 0x28845000)
> * Expire in 1 ms for 1 (transfer 0x28845000)
> * Expire in 0 ms for 1 (transfer 0x28845000)
> * Expire in 0 ms for 1 (transfer 0x28845000)
> * Expire in 1 ms for 1 (transfer 0x28845000)
> * Expire in 0 ms for 1 (transfer 0x28845000)
> * Expire in 0 ms for 1 (transfer 0x28845000)
> * Expire in 1 ms for 1 (transfer 0x28845000)
> * Expire in 0 ms for 1 (transfer 0x28845000)
> * Expire in 0 ms for 1 (transfer 0x28845000)
> * Expire in 1 ms for 1 (transfer 0x28845000)
> * Expire in 0 ms for 1 (transfer 0x28845000)
> * Expire in 0 ms for 1 (transfer 0x28845000)
> * Expire in 1 ms for 1 (transfer 0x28845000)
> * Expire in 0 ms for 1 (transfer 0x28845000)
> * Expire in 0 ms for 1 (transfer 0x28845000)
> * Expire in 1 ms for 1 (transfer 0x28845000)
> * Expire in 0 ms for 1 (transfer 0x28845000)
> * Expire in 0 ms for 1 (transfer 0x28845000)
> * Expire in 1 ms for 1 (transfer 0x28845000)
> * Expire in 0 ms for 1 (transfer 0x28845000)
> * Expire in 0 ms for 1 (transfer 0x28845000)
> * Expire in 1 ms for 1 (transfer 0x28845000)
> * Expire in 0 ms for 1 (transfer 0x28845000)
> * Expire in 0 ms for 1 (transfer 0x28845000)
> * Expire in 1 ms for 1 (transfer 0x28845000)
> * Expire in 0 ms for 1 (transfer 0x28845000)
> * Expire in 0 ms for 1 (transfer 0x28845000)
> * Expire in 1 ms for 1 (transfer 0x28845000)
> * Expire in 0 ms for 1 (transfer 0x28845000)
> * Expire in 0 ms for 1 (transfer 0x28845000)
> * Expire in 1 ms for 1 (transfer 0x28845000)
> * Expire in 0 ms for 1 (transfer 0x28845000)
> * Expire in 0 ms for 1 (transfer 0x28845000)
> * Expire in 1 ms for 1 (transfer 0x28845000)
> * Expire in 0 ms for 1 (transfer 0x28845000)
> * Expire in 0 ms for 1 (transfer 0x28845000)
> * Expire in 1 ms for 1 (transfer 0x28845000)
> * Expire in 0 ms for 1 (transfer 0x28845000)
> * Expire in 0 ms for 1 (transfer 0x28845000)
> * Expire in 1 ms for 1 (transfer 0x28845000)
> * Expire in 0 ms for 1 (transfer 0x28845000)
> * Expire in 0 ms for 1 (transfer 0x28845000)
> * Expire in 1 ms for 1 (transfer 0x28845000)
> * Expire in 0 ms for 1 (transfer 0x28845000)
> * Expire in 0 ms for 1 (transfer 0x28845000)
> * Expire in 1 ms for 1 (transfer 0x28845000)
> * Expire in 0 ms for 1 (transfer 0x28845000)
> * Expire in 0 ms for 1 (transfer 0x28845000)
> * Expire in 1 ms for 1 (transfer 0x28845000)
> * Expire in 0 ms for 1 (transfer 0x28845000)
> * Expire in 0 ms for 1 (transfer 0x28845000)
> * Expire in 1 ms for 1 (transfer 0x28845000)
> * Expire in 0 ms for 1 (transfer 0x28845000)
> * Expire in 0 ms for 1 (transfer 0x28845000)
> * Expire in 1 ms for 1 (transfer 0x28845000)
> * Expire in 0 ms for 1 (transfer 0x28845000)
> * Expire in 0 ms for 1 (transfer 0x28845000)
> * Expire in 1 ms for 1 (transfer 0x28845000)
> * Expire in 0 ms...

Thanks Michael for the very appropriate test: first 401 which avoids the "100 continue" danse and thus gets forwarded with "connection: close", second 201 with the full thing.

Great!

(In reply to Yann Ylavic from comment #32)
> Thanks Michael for the very appropriate test: first 401 which avoids the
> "100 continue" danse and thus gets forwarded with "connection: close",
> second 201 with the full thing.
>
> Great!

Thank your for the extensive patch. Looking forward to see this on 2.4.x and mod_proxy_ajp to be fixed afterwards.

I have just noticed that v4 is available of the patch. I have removed v2 from ${FILESDIR} and added v4, recompiled and reinstalled apach24 port. It works flawlessly as before with v2.

What did change? Does it affect the user?

I do not know about v3 exactly, but v4 addresses a side effect the change had for the HTTP/2 protocol implementation.

We also happen to stumble accross this bug. Using Apache 2.4.38 and Tomcat 7.0.90.

Applying the patch http://people.apache.org/~ylavic/patches/httpd-2.4.x-forward_100_continue.patch did not fix the problem in our case. Is this the correct patch?

Thanks, I tried with v4 patch as well, still recieving a 502 bad gateway after sending a file via POST.

curl -v 'http://localhost:8080/action/forms/sprint-12/contact-page-85176' -X POST -T /tmp/test.dd
* Trying 127.0.0.1...
* TCP_NODELAY set
* Connected to localhost (127.0.0.1) port 8080 (#0)
> POST /action/forms/sprint-12/contact-page-85176 HTTP/1.1
> Host: localhost:8080
> User-Agent: curl/7.58.0
> Accept: */*
> Content-Length: 104857600
> Expect: 100-continue
>
< HTTP/1.1 100 Continue
< Via: 1.1 xyz.com:8080
* We are completely uploaded and fine
< HTTP/1.1 502 Bad Gateway
< Date: Fri, 08 Mar 2019 14:34:52 GMT
< Server: Apache
< Content-Length: 293
< Content-Type: text/html; charset=iso-8859-1
<
<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN">
<html><head>
<title>502 Bad Gateway</title>
</head><body>
<h1>Bad Gateway</h1>
<p>The proxy server received an invalid
response from an upstream server.<br />
</p>
<hr>
<address>Apache Server at localhost Port 8080</address>
</body></html>
* Connection #0 to host localhost left intact

Apache logs this directly after sending the request:

[Fri Mar 08 14:32:32.808488 2019] [proxy:error] [pid 15:tid 140341990573824] (104)Connection reset by peer: [client 10.244.13.181:53536] AH01084: pass request body failed to 127.0.0.1:42180 (localhost), referer: https://xyz.com/sprint-12/contact-page
[Fri Mar 08 14:32:32.808571 2019] [proxy_http:error] [pid 15:tid 140341990573824] [client 10.244.13.181:53536] AH01097: pass request body failed to 127.0.0.1:42180 (localhost) from 10.244.13.181 (), referer: https://xyz.com/sprint-12/contact-page

Looks like a backend error, could you please provide the error log with LogLevel trace7 and mod_dumpio configured (DumpIoInput on, DumpIoOutput on)?

Created attachment 36478
apache-trace7-dumpio logfile

I added the log as attachment.
If I'm really mistaking something else for this bug I'm sorry.

Everything looks fine until:
[Fri Mar 08 15:23:06.165463 2019] [proxy:error] [pid 869:tid 140678264567552] (104)Connection reset by peer: [client 127.0.0.1:53920] AH01084: pass request body failed to 127.0.0.1:42180 (localhost)
[Fri Mar 08 15:23:06.165530 2019] [proxy_http:error] [pid 869:tid 140678264567552] [client 127.0.0.1:53920] AH10154: pass request body failed to 127.0.0.1:42180 (localhost) from 127.0.0.1 () with status 502

After that httpd eats the rest of the request body (more than 100MB overall), and responds with 502. The backend really looks faulty here.

You're right, the backend was faulty here, we could fix the issue there and as soon as we did that it acutally worked with the unpatched version again. Sorry for the noise.

To post a comment you must log in.
This report contains Public information  Edit
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.