intermittent test failure in upgrader.SetUpTest when using GOMAXPROCS>1

Bug #1199437 reported by Dimiter Naydenov
6
This bug affects 1 person
Affects Status Importance Assigned to Milestone
juju-core
Fix Released
Critical
John A Meinel

Bug Description

I got this now (r1413 on trunk), but I've seen it before, so decided to report it. Occurs very rarely and it appears to be some timeout issue.

----------------------------------------------------------------------
FAIL: upgrader_test.go:45: upgraderSuite.SetUpTest

[LOG] 32.56385 INFO juju environs/testing: uploading FAKE tools 1.11.2-raring-amd64
[LOG] 32.56395 INFO juju environs/testing: uploading FAKE tools 1.11.2-precise-amd64
[LOG] 32.56405 INFO juju environs: reading tools with major version 1
[LOG] 32.56409 DEBUG juju environs/tools: reading v1.* tools
[LOG] 32.56413 INFO juju environs: falling back to public bucket
[LOG] 32.56416 DEBUG juju environs/tools: reading v1.* tools
[LOG] 32.56423 DEBUG juju environs/tools: found 1.11.2-precise-amd64
[LOG] 32.56429 DEBUG juju environs/tools: found 1.11.2-raring-amd64
[LOG] 32.56433 INFO juju environs: filtering tools by series: precise
[LOG] 32.56438 INFO juju environs: filtering tools by version: 1.11.2
[LOG] 32.56443 INFO juju environs/dummy: would pick tools from 1.11.2-precise-amd64
[LOG] 32.62548 INFO juju state: opening state; mongo addresses: ["localhost:35021"]; entity ""
[LOG] 32.63101 INFO juju state: connection established
[LOG] 32.68732 INFO juju state: initializing environment
[LOG] 32.71977 INFO juju state/api: listening on "127.0.0.1:45160"
[LOG] 32.77023 INFO juju state: opening state; mongo addresses: ["localhost:35021"]; entity ""
[LOG] 32.77572 INFO juju state: connection established
[LOG] 32.77647 INFO juju juju: authorization error while connecting to state server; retrying
[LOG] 32.77658 INFO juju state: opening state; mongo addresses: ["localhost:35021"]; entity ""
[LOG] 32.78523 INFO juju state: connection established
[LOG] 32.89387 INFO juju state/api: dialing "wss://127.0.0.1:45160/"
[LOG] 32.89990 INFO juju state/api: connection established
[LOG] 32.90015 DEBUG juju rpc/jsoncodec: <- {"RequestId":1,"Type":"Admin","Request":"Login","Params":{"AuthTag":"user-admin","Password":"dummy-secret"}}
[LOG] 32.90107 DEBUG juju rpc/jsoncodec: -> {"RequestId":1,"Response":{}}
[LOG] 32.90194 DEBUG juju rpc/jsoncodec: <- {"RequestId":2,"Type":"Pinger","Request":"Ping","Params":{}}
[LOG] 32.90210 DEBUG juju rpc/jsoncodec: -> {"RequestId":2,"Response":{}}
upgrader_test.go:62:
    c.Assert(err, IsNil)
... value *net.OpError = &net.OpError{Op:"listen", Net:"tcp", Addr:(*net.TCPAddr)(0xf84015cd20), Err:0x62} ("listen tcp 127.0.0.1:12345: address already in use")

----------------------------------------------------------------------
FAIL: upgrader_test.go:74: upgraderSuite.TearDownTest

upgrader_test.go:77:
    c.Check(err, IsNil)
... value *errors.errorString = &errors.errorString{s:"already closed"} ("already closed")

----------------------------------------------------------------------
PANIC: upgrader_test.go:93: upgraderSuite.TestSetToolsWrongMachine

[LOG] 32.94387 DEBUG juju rpc/jsoncodec: <- error: EOF (closing false)
[LOG] 32.94402 INFO juju environs/dummy: reset environment
... Panic: Fixture has panicked (see related PANIC)
OOPS: 2 passed, 2 FAILED, 3 MISSED
--- FAIL: TestAll (2.74 seconds)
FAIL
FAIL launchpad.net/juju-core/state/api/upgrader 3.194s

Related branches

Revision history for this message
Dimiter Naydenov (dimitern) wrote :
Download full text (5.8 KiB)

I just got this in another module - it might be related to the API server actually (also intermittent).

----------------------------------------------------------------------
FAIL: watcher_test.go:46: watcherSuite.SetUpTest

[LOG] 66.97901 INFO juju environs/testing: uploading FAKE tools 1.11.2-raring-amd64
[LOG] 66.97911 INFO juju environs/testing: uploading FAKE tools 1.11.2-precise-amd64
[LOG] 66.97923 INFO juju environs: reading tools with major version 1
[LOG] 66.97927 DEBUG juju environs/tools: reading v1.* tools
[LOG] 66.97931 INFO juju environs: falling back to public bucket
[LOG] 66.97934 DEBUG juju environs/tools: reading v1.* tools
[LOG] 66.97943 DEBUG juju environs/tools: found 1.11.2-precise-amd64
[LOG] 66.97950 DEBUG juju environs/tools: found 1.11.2-raring-amd64
[LOG] 66.97954 INFO juju environs: filtering tools by series: precise
[LOG] 66.97959 INFO juju environs: filtering tools by version: 1.11.2
[LOG] 66.97964 INFO juju environs/dummy: would pick tools from 1.11.2-precise-amd64
[LOG] 67.02753 INFO juju state: opening state; mongo addresses: ["localhost:52824"]; entity ""
[LOG] 67.04133 INFO juju state: connection established
[LOG] 67.09910 INFO juju state: initializing environment
[LOG] 67.14371 INFO juju state/api: listening on "127.0.0.1:50388"
[LOG] 67.20279 INFO juju state: opening state; mongo addresses: ["localhost:52824"]; entity ""
[LOG] 67.20862 INFO juju state: connection established
[LOG] 67.20930 INFO juju juju: authorization error while connecting to state server; retrying
[LOG] 67.20941 INFO juju state: opening state; mongo addresses: ["localhost:52824"]; entity ""
[LOG] 67.21293 INFO juju state: connection established
[LOG] 67.32584 INFO juju state/api: dialing "wss://127.0.0.1:50388/"
[LOG] 67.33283 INFO juju rpc: discarding action method reflect.Method{Name:"apiRootForEntity", PkgPath:"launchpad.net/juju-core/state/apiserver", Type:(*reflect.commonType)(0x7ea1a0), Func:reflect.Value{typ:(*reflect.commonType)(0x7ea1a0), val:(unsafe.Pointer)(0x4e231c), flag:0x131}, Index:1}
[LOG] 67.33302 INFO juju state/api: connection established
[LOG] 67.33371 DEBUG juju rpc/jsoncodec: <- {"RequestId":1,"Type":"Admin","Request":"Login","Params":{"AuthTag":"user-admin","Password":"dummy-secret"}}
[LOG] 67.33457 INFO juju rpc: discarding obtainer method reflect.Method{Name:"AuthClient", PkgPath:"", Type:(*reflect.commonType)(0x732788), Func:reflect.Value{typ:(*reflect.commonType)(0x732788), val:(unsafe.Pointer)(0x4e3772), flag:0x130}, Index:1}
[LOG] 67.33464 INFO juju rpc: discarding obtainer method reflect.Method{Name:"AuthEnvironManager", PkgPath:"", Type:(*reflect.commonType)(0x732788), Func:reflect.Value{typ:(*reflect.commonType)(0x732788), val:(unsafe.Pointer)(0x4e372c), flag:0x130}, Index:2}
[LOG] 67.33471 INFO juju rpc: discarding obtainer method reflect.Method{Name:"AuthMachineAgent", PkgPath:"", Type:(*reflect.commonType)(0x732788), Func:reflect.Value{typ:(*reflect.commonType)(0x732788), val:(unsafe.Pointer)(0x4e364f), flag:0x130}, Index:3}
[LOG] 67.33476 INFO juju rpc: discarding obtainer method reflect.Method{Name:"AuthOwner", PkgPath:"", Type:(*reflect.commonType)(0x78dc38), Func:reflect.Value{typ:(*reflect.comm...

Read more...

Revision history for this message
John A Meinel (jameinel) wrote :

A few failures:

1) discarding obtainer method should probably be DEBUG level rather than INFO, we will always have a few of them, it is a bit noisy to always show them. :)

2) The actual failure is ... value *net.OpError = &net.OpError{Op:"listen", Net:"tcp", Addr:(*net.TCPAddr)(0xf8403397c0), Err:0x62} ("listen tcp 127.0.0.1:12345: address already in use")

We are trying to start a server, but something is already listening.

What surprises me a lot is the port that is chosen :12345. That looks very suspiciously like we are forcing the port number for the test server rather than using :0 and then propagating the port it starts on to the client side code.

3) I'm guessing the timing related bit is that we've stopped a service, and then start another service on the same port, but unless we've set SO_REUSE_ADDR we aren't guaranteed that the previous kernel data has been cleaned out enough to actually let us use it.

I think (2) is a far better solution for the test suite. Don't hard-code the port, just use :0 and figure out what port it actually connects to.

Revision history for this message
John A Meinel (jameinel) wrote :

It looks like I copied state/api/machiner/machiner_test.go which had 'localhost:12345' hardcoded.

We want to just use "localhost:0". The internals already know how to get the real port and hand the information off to the client side.

Changed in juju-core:
assignee: nobody → John A Meinel (jameinel)
importance: Undecided → Critical
status: New → In Progress
Go Bot (go-bot)
Changed in juju-core:
status: In Progress → Fix Committed
John A Meinel (jameinel)
summary: - intermittent test failure in upgrader.SetUpTest
+ intermittent test failure in upgrader.SetUpTest when using GOMAXPROCS>1
John A Meinel (jameinel)
Changed in juju-core:
status: Fix Committed → Fix Released
To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.