apicaller worker waits forever

Bug #1732587 reported by Tim Penhey
This bug affects 1 person
Affects         Status        Importance  Assigned to  Milestone
Canonical Juju  Fix Released  High        Tim Penhey
2.2             Fix Released  High        Tim Penhey
2.3             Fix Released  High        Tim Penhey
2.4             Fix Released  High        Tim Penhey

Bug Description

Under enough load on the apiserver and enough TCP traffic, the API client can believe that the TCP connection has been established and the login request sent, yet sit waiting forever for the login response.

Meanwhile the apiserver has restarted, probably due to an upgrade. The login request may still be in the TCP buffer, not yet read by the apiserver, so the apiserver hasn't yet incremented the waitgroup for pending API requests.

This leaves the client, the agent in this case, stuck waiting forever.

This bug is also observable through 'juju-goroutines', and the stuck goroutine is in api.Open.

Tim Penhey (thumper) wrote :
Tim Penhey (thumper)
Changed in juju:
status: In Progress → Fix Committed
Tim Penhey (thumper)
description: updated
Tim Penhey (thumper) wrote :
Changed in juju:
status: Fix Committed → Fix Released
Paul Gear (paulgear) wrote :

It appears this bug is still affecting 2.3.7. Here's a representative goroutine dump: https://pastebin.canonical.com/p/BSpsTVzK7n/

Tim Penhey (thumper) wrote :
Changed in juju:
milestone: 2.3-rc1 → 2.5-beta1
status: Fix Released → In Progress
Tim Penhey (thumper)
Changed in juju:
status: In Progress → Fix Committed
Changed in juju:
status: Fix Committed → Fix Released