panic: Session already closed in provisioner tests

Bug #1394223 reported by Aaron Bentley
This bug affects 2 people
Affects | Status | Importance | Assigned to | Milestone
Canonical Juju | Fix Released | Medium | John A Meinel |
2.3 | Fix Released | Medium | John A Meinel |

Bug Description

This may be related to 1305014. We have seen it in two recent test runs:
http://reports.vapour.ws/releases/issue/55565034749a5650fa59f34c

Aaron Bentley (abentley) wrote :
John George (jog)
tags: added: unit-tests
Curtis Hovey (sinzui)
description: updated
Cheryl Jennings (cherylj) wrote :

The latest matches in CI are in state/presence and are covered in bug #1588574

Changed in juju-core:
importance: Medium → Critical
tags: added: blocker
Changed in juju-core:
milestone: none → 2.0-beta9
Cheryl Jennings (cherylj) wrote :

The latest matches are for a different failure (see comment #2). Can the regex for the outcome analyzer be updated?

tags: removed: blocker
Changed in juju-core:
importance: Critical → Medium
Aaron Bentley (abentley) wrote :

It's not clear to me that there's a useful difference in the output. Looking at two examples:

http://reports.vapour.ws/releases/2506/job/run-unit-tests-trusty-ppc64el/attempt/2701#highlight
http://reports.vapour.ws/releases/4031/job/run-unit-tests-centos7-amd64/attempt/952#highlight

Both say "panic: Session already closed", and seem to trace back to NewWatcher. How would a regex tell them apart?

Cheryl Jennings (cherylj) wrote :

Both of the examples you gave are for the state/presence failure. The title of this bug implies the original failures were seen in worker/provisioner, and I see that in this run, the failure was in worker/provisioner: http://reports.vapour.ws/releases/2601/job/run-unit-tests-win2012-amd64/attempt/338

TestProvisionerRetriesTransientErrors is implicated in the traceback on that run.

Maybe also specifying the failing package would help differentiate?
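
One way to read the suggestion above, as a minimal Go sketch rather than the actual outcome analyzer (whose rules are not shown in this report): find the panic text, then take the package from the next `FAIL <package> <time>` summary line that `go test` prints. The function name `panicPackage` and the sample log are hypothetical, and the sketch assumes the panic output appears before the summary line of the package that produced it.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// panicText is the message quoted in this bug report.
const panicText = "panic: Session already closed"

// failLine matches the per-package summary line printed by `go test`
// for a failing package, e.g. "FAIL\tgithub.com/juju/juju/state/presence\t0.5s".
var failLine = regexp.MustCompile(`(?m)^FAIL\s+(\S+)\s+[0-9.]+s$`)

// panicPackage (hypothetical helper) returns the package named on the first
// FAIL summary line that follows the panic text. It reports false when the
// log has no such panic or no FAIL line after it.
func panicPackage(log string) (string, bool) {
	idx := strings.Index(log, panicText)
	if idx < 0 {
		return "", false
	}
	m := failLine.FindStringSubmatch(log[idx:])
	if m == nil {
		return "", false
	}
	return m[1], true
}

func main() {
	// Made-up log excerpt, for illustration only.
	log := "ok  \tgithub.com/juju/juju/state\t1.2s\n" +
		"panic: Session already closed\n\n" +
		"goroutine 42 [running]:\n" +
		"FAIL\tgithub.com/juju/juju/worker/provisioner\t3.4s\n"

	if pkg, ok := panicPackage(log); ok {
		fmt.Println("panic attributed to:", pkg)
	}
}
```

With the package extracted, a rule could match this bug only when the package ends in worker/provisioner and leave state/presence matches to bug #1588574.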

Curtis Hovey (sinzui)
Changed in juju-core:
milestone: 2.0-beta9 → 2.0-beta10
Curtis Hovey (sinzui)
Changed in juju-core:
milestone: 2.0-beta10 → 2.0-beta11
Curtis Hovey (sinzui)
Changed in juju-core:
milestone: 2.0-beta11 → 2.0-beta12
Changed in juju-core:
milestone: 2.0-beta12 → 2.0-beta13
Curtis Hovey (sinzui)
Changed in juju-core:
milestone: 2.0-beta13 → 2.0-beta14
Curtis Hovey (sinzui)
Changed in juju-core:
milestone: 2.0-beta14 → none
Changed in juju:
status: New → Triaged
importance: Undecided → Medium
no longer affects: juju-core
John A Meinel (jameinel) wrote :

I looked at the failure from:
http://10.125.0.203:8080/job/RunUnittests-s390x/277/console

Which appears to have the fix that was landed for:
https://bugs.launchpad.net/juju/+bug/1754021

namely, we wait for 'hackyGoroutineDone' before we check m4.InstanceStatus which should mean that we won't tear down the connection for m3 until that goroutine is done.
However, a panic() could be obscuring the original error. Imagine that checkStartInstance failed with an error, which would then raise an exception, which would then never call close(thatsAllFolks) and never wait for hackyGoroutineDone.
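
The hazard described above can be sketched in Go. This is a generic illustration, not the juju test code and not necessarily what the eventual patch does: the `session` type, `runTest`, and `errCheckFailed` are hypothetical stand-ins, and only the channel names `thatsAllFolks` and `hackyGoroutineDone` are taken from the comment. The point is that deferring the stop-and-wait step makes it run even when the check fails early, so the session is never closed under a goroutine that is still using it.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
	"time"
)

// session is a hypothetical stand-in for a database session that
// panics when it is used after Close.
type session struct {
	mu     sync.Mutex
	closed bool
}

func (s *session) Ping() {
	s.mu.Lock()
	defer s.mu.Unlock()
	if s.closed {
		panic("Session already closed")
	}
}

func (s *session) Close() {
	s.mu.Lock()
	s.closed = true
	s.mu.Unlock()
}

// errCheckFailed models the original test failure (hypothetical).
var errCheckFailed = errors.New("checkStartInstance failed")

// runTest models a test body with a background goroutine that keeps
// using the session until it is told to stop.
func runTest(failCheck bool) error {
	s := &session{}
	thatsAllFolks := make(chan struct{})
	hackyGoroutineDone := make(chan struct{})

	go func() {
		defer close(hackyGoroutineDone)
		for {
			select {
			case <-thatsAllFolks:
				return
			default:
				s.Ping()
				time.Sleep(time.Millisecond)
			}
		}
	}()

	// Deferred calls run last-in, first-out: the goroutine is stopped
	// and awaited first, and only then is the session closed.
	defer s.Close()
	defer func() {
		close(thatsAllFolks)
		<-hackyGoroutineDone
	}()

	if failCheck {
		// Early exit: without the defers above, the stop and wait
		// steps would be skipped and teardown could panic instead.
		return errCheckFailed
	}
	return nil
}

func main() {
	fmt.Println(runTest(true))
	fmt.Println(runTest(false))
}
```

In the failing case the caller sees the original error rather than a teardown panic, which is the outcome the comment is after.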

John A Meinel (jameinel) wrote :

https://github.com/juju/juju/pull/8474

see also https://bugs.launchpad.net/juju/+bug/1754021

The above patch should make it so that we get to see the actual failure, rather than seeing a panic() during teardown.

John A Meinel (jameinel) wrote :

We should no longer get the panic(), though the underlying test failure is probably still there.

Changed in juju:
status: Triaged → In Progress
assignee: nobody → John A Meinel (jameinel)
milestone: none → 2.4-beta1
John A Meinel (jameinel) wrote :
Changed in juju:
milestone: 2.4-beta1 → none
Anastasia (anastasia-macmood) wrote :

Since this code landed in 2.3.5, I am pretty sure it is also present in 2.4 and subsequent branches. As this is also a test fix, I'll mark this as Fix Released.

Changed in juju:
status: In Progress → Fix Released