autopkgtest failures due to squid "(111) Connection refused" on armhf

Bug #1939086 reported by Bryce Harrington
8
This bug affects 1 person
Affects Status Importance Assigned to Milestone
thin (Ubuntu)
Fix Released
Undecided
Unassigned

Bug Description

https://autopkgtest.ubuntu.com/results/autopkgtest-impish/impish/armhf/t/thin/20210701_004508_69a07@/log.gz

Five test cases fail on armhf for thin/1.8.0-1 (thin/1.7.2-1build1 is fine):

  1) Thin::Server on TCP socket should POST from Net::HTTP
     Failure/Error: expect(post('/', :arg => 'pirate')).to include('arg=pirate')

       expected "<!DOCTYPE html PUBLIC \"-//W3C//DTD HTML 4.01//EN\" \"http://www.w3.org/TR/html4/strict.dtd\">\n<htm...021 00:44:47 GMT by cremona (squid/3.5.27)</p>\n<!-- ERR_CONNECT_FAIL -->\n</div>\n</body></html>\n" to include "arg=pirate"
       ...
       +<h2>The requested URL could not be retrieved</h2>
       +<p>The following error was encountered while trying to retrieve the URL: <a href="http://0.0.0.0:3333/">http://0.0.0.0:3333/</a></p>
       +<p><b>Connection to 0.0.0.0 failed.</b></p>
       +<p id="sysmsg">The system returned: <i>(111) Connection refused</i></p>

  2) Thin::Server on TCP socket should handle big POST
     Failure/Error: expect(post('/', :big => big)).to include(big)

       expected "<!DOCTYPE html PUBLIC \"-//W3C//DTD HTML 4.01//EN\" \"http://www.w3.org/TR/html4/strict.dtd\">\n<htm...021 00:44:47 GMT by cremona (squid/3.5.27)</p>\n<!-- ERR_CONNECT_FAIL -->\n</div>\n</body></html>\n" to include "XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX...XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX"
       ...
       +<h2>The requested URL could not be retrieved</h2>
       +<p>The following error was encountered while trying to retrieve the URL: <a href="http://0.0.0.0:3333/">http://0.0.0.0:3333/</a></p>
       +<p><b>Connection to 0.0.0.0 failed.</b></p>
       +<p id="sysmsg">The system returned: <i>(111) Connection refused</i></p>

  3) Thin::Server on TCP socket should retreive remote address
     Failure/Error: expect(get('/')).to include('"REMOTE_ADDR"=>"127.0.0.1"')

       expected "<!DOCTYPE html PUBLIC \"-//W3C//DTD HTML 4.01//EN\" \"http://www.w3.org/TR/html4/strict.dtd\">\n<htm...021 00:44:47 GMT by cremona (squid/3.5.27)</p>\n<!-- ERR_CONNECT_FAIL -->\n</div>\n</body></html>\n" to include "\"REMOTE_ADDR\" => \"127.0.0.1\""
       ...
       +<h2>The requested URL could not be retrieved</h2>
       +<p>The following error was encountered while trying to retrieve the URL: <a href="http://0.0.0.0:3333/">http://0.0.0.0:3333/</a></p>
       +<p><b>Connection to 0.0.0.0 failed.</b></p>
       +<p id="sysmsg">The system returned: <i>(111) Connection refused</i></p>

  4) Thin::Server with threads should process request
     Failure/Error: raise "Reactor still running, wtf?" if EventMachine.reactor_running?

     RuntimeError:
       Reactor still running, wtf?
     # ./spec/spec_helper.rb:209:in `stop_server'
     # ./spec/server/threaded_spec.rb:25:in `block (2 levels) in <top (required)>'

  5) Thin::Server with threads should process requests when blocked
     Failure/Error: expect(@requests).to eq(1)

       expected: 1
            got: 0

       (compared using ==)
     # ./spec/server/threaded_spec.rb:20:in `block (2 levels) in <top (required)>'

Debian does not appear to be seeing these failures on armhf:
https://buildd.debian.org/status/fetch.php?pkg=thin&arch=armhf&ver=1.8.0-1&stamp=1611145895&raw=0

The first recorded autopkgtest for 1.8.0-1 was April 2021, soon after impish opened.

The "ERR_CONNECT_FAIL" in the first three cases show Squid is hitting a Connection refused on armhf for some reason. I don't know if failures #4 and #5 are related but probably not a bad guess to assume they are.

Perhaps setting up a squid proxy environment would allow reproduction?
https://wiki.ubuntu.com/ProposedMigration#I.27m_seeing_a_squid_proxy.3F__My_tests_behave_differently_there.21

Related branches

Revision history for this message
Simon Chopin (schopin) wrote :

For reference, the Debian autopkgtest run (successful) can be found here : https://ci.debian.net/packages/t/thin/testing/armhf/

Revision history for this message
Steve Langasek (vorlon) wrote :

I believe the problem here is that ruby honors http_proxy, but does not honor no_proxy; so because we're in an environment requiring a proxy, requests that are supposed to go to localhost are sent to the squid proxy instead.

What I'm not clear on is why this works on runners *except* for armhf, since all of these runners set http_proxy.

Unfortunately, gem2deb-test-runner does not have support for overriding the environment the way that pkg-perl-autopkgtest does (via debian/tests/pkg-perl/smoke-env).

Someone who knows ruby could modify the tests to unset http_proxy from the environment, which ought to address the problem.

Revision history for this message
Sergio Durigan Junior (sergiodj) wrote :

I spent some time investigating this bug, and my conclusion is that the problem lies elsewhere (i.e., it's not related to http_proxy being set).

In a nutshell, what I found is that the delay used in the "stop_server" procedure is too low for armhf machines. Here's the code in question:

200 def stop_server
201 @server.stop!
202 @thread.kill
203
204 100.times do
205 break unless EM.reactor_running?
206 sleep 0.01
207 end
208
209 raise "Reactor still running, wtf?" if EventMachine.reactor_running?
210 end

As can be seen, there's a 1 second delay to let the server actually stop. This seems to be enough in all architectures but armhf.

I don't know why the same problem doesn't manifest on Debian. Maybe because they have bare metal armhf servers, whereas we use lxd instances inside arm64 machines?

Anyway, I was able to verify that if we extend this delay to, say, 10 seconds, the tests pass on armhf. I also think that's something OK to do, especially because the loop is actively checking if the server has stopped and will break the loop if that's the case.

I'll put an MP up soon.

Revision history for this message
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package thin - 1.8.0-1ubuntu1

---------------
thin (1.8.0-1ubuntu1) impish; urgency=medium

  * Fix autopkgtest failure on armhf (LP: #1939086).
    - d/ruby-tests.rake: Unset http_proxy env variable.
    - d/p/increase-wait-time-for-server-stop.patch: Increase timeout
      when waiting for server to stop.

 -- Sergio Durigan Junior <email address hidden> Tue, 31 Aug 2021 15:40:13 -0400

Changed in thin (Ubuntu):
status: New → Fix Released
To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.