Bug #1613992 “1.25.6 “ERROR juju.worker.uniter.filter filter.go:...” : Series 1.25 : Bugs : juju-core

Anastasia (anastasia-macmood) on 2016-08-17

Changed in juju-core:
status:	New → Triaged
importance:	Undecided → Critical
milestone:	none → 1.25.7
milestone:	1.25.7 → 2.0-beta17
importance:	Critical → High

Canonical Juju QA Bot (juju-qa-bot) on 2016-08-23

affects:	juju-core → juju
Changed in juju:
milestone:	2.0-beta17 → none
milestone:	none → 2.0-beta17

Canonical Juju QA Bot (juju-qa-bot) on 2016-08-23

Changed in juju-core:
importance:	Undecided → Critical
status:	New → Triaged

Charles Butler (lazypower) wrote on 2016-08-29:

#1

Additional supporting logs:

http://paste.ubuntu.com/23107655/
http://paste.ubuntu.com/23107656/

Anastasia (anastasia-macmood) on 2016-09-01

Changed in juju:
milestone:	2.0-beta17 → 2.0-beta18

Tim Penhey (thumper) on 2016-09-01

no longer affects:	juju

Björn Tillenius (bjornt) on 2016-09-05

tags:	added: landscape
tags:	added: kanban-cross-team

🤖 Landscape Builder (landscape-builder) on 2016-09-05

tags:	removed: kanban-cross-team

Anastasia (anastasia-macmood) on 2016-09-05

Changed in juju-core:
status:	Triaged → Won't Fix
importance:	Critical → Undecided

Chris Gregan (cgregan) on 2016-09-06

tags:	added: cdo-qa-blocker

Free Ekanayaka (free.ekanayaka) wrote on 2016-09-06:

#2

I experienced the same issue.

Junien F (axino) wrote on 2016-09-09:

#3

So did I

Paul Larson (pwlars) wrote on 2016-09-09:

#4

After several attempts at redeploying to work around this, I destroyed my environment, rebootstrapped, ran juju upgrade-juju --version=1.25.5, and redeployed. I'm not sure whether I just got lucky, but this got it going again.

Yangzheng Bai (zoy) wrote on 2016-09-16:

#5

I experienced the same problem after removing an old service and redeploying a newer version. Short of destroying the whole environment, nothing resolves this "tomb: dying" problem. The juju version is 1.25.0-trusty-amd64.

unit-wrk9-0[14647]: 2016-09-16 19:49:01 ERROR juju.worker.uniter.filter filter.go:137 tomb: dying

Yangzheng Bai (zoy) wrote on 2016-09-16:

#6

juju agent-version: 1.25.6, service status message: Waiting for agent initialization to finish

This problem is not related to any specific charm; we hit it with 3 different charms. It arises when we remove a successfully deployed service and redeploy a modified one. Our environment is the latest xenial and the latest juju. It happens on both ARM aarch64 and Intel Xeon.

The juju server version, 1.25.0-trusty-amd64, is different from the juju agent-version, 1.25.6.

We will try bootstrapping the environment again with --no-auto-upgrade.
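The version mismatch noted above can be checked mechanically. The following is only a minimal Python sketch: the helper names `parse_release` and `versions_differ` are hypothetical, and it assumes purely numeric `major.minor.patch` prefixes like the two strings quoted in this comment (pre-release strings such as 2.0-beta17 are deliberately rejected).

```python
import re


def parse_release(version):
    """Return (major, minor, patch) from a string such as '1.25.0-trusty-amd64'.

    Hypothetical helper: only a leading numeric X.Y.Z is understood; any
    series/arch suffix after it is ignored.
    """
    m = re.match(r"(\d+)\.(\d+)\.(\d+)", version)
    if m is None:
        raise ValueError(f"unsupported version string: {version!r}")
    return tuple(int(part) for part in m.groups())


def versions_differ(reported_version, agent_version):
    """True when the two version strings name different X.Y.Z releases."""
    return parse_release(reported_version) != parse_release(agent_version)


if __name__ == "__main__":
    # The two values reported in this comment.
    print(versions_differ("1.25.0-trusty-amd64", "1.25.6"))
```

Comparing parsed tuples rather than raw strings keeps the series and architecture suffix from producing a false mismatch.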

logs:
unit-ngi4-0[13689]: 2016-09-16 21:32:31 WARNING juju.worker.uniter.operation leader.go:115 we should run a leader-deposed hook here, but we can't yet
unit-ngi4-0[13689]: 2016-09-16 21:32:33 WARNING juju.worker.uniter.operation metrics.go:50 failed to create a metric reader: failed to open spool directory "/var/lib/juju/agents/unit-ngi4-0/state/spool/metrics": stat /var/lib/juju/agents/unit-ngi4-0/state/spool/metrics: no such file or directory
unit-ngi4-0[13689]: 2016-09-16 21:32:33 ERROR juju.worker.uniter.filter filter.go:137 tomb: dying
unit-ngi4-0[13689]: 2016-09-16 21:32:33 ERROR juju.api.watcher watcher.go:84 error trying to stop watcher: crypto/tls: use of closed connection
unit-ngi4-0[13689]: 2016-09-16 21:32:33 ERROR juju.api.watcher watcher.go:84 error trying to stop watcher: connection is shut down
unit-ngi4-0[13689]: 2016-09-16 21:32:33 ERROR juju.worker runner.go:212 fatal "api": agent should be terminated
unit-ngi4-0[13689]: message repeated 5 times: [ 2016-09-16 21:32:33 ERROR juju.api.watcher watcher.go:84 error trying to stop watcher: connection is shut down]
machine-0: message repeated 5 times: [2016-09-16 21:30:38 ERROR juju.rpc server.go:573 error writing response: write tcp 10.118.zzz.xxx:17070->10.118.zzz.aaa:49670: write: connection reset by peer]
machine-0: 2016-09-16 21:32:33 ERROR juju.rpc server.go:573 error writing response: write tcp 10.118.zzz.xxx:17070->10.118.zzz.bbb:36298: write: broken pipe
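For anyone checking whether their own units are affected, a log excerpt like the one above can be scanned for the filter error. This is a rough Python sketch, not a juju tool: the function name `find_tomb_dying` and the regular expression are assumptions based only on the line layout visible in the lines quoted here (`<agent>[<pid>]: <date> <time> <LEVEL> <logger> <file>:<line> <message>`).

```python
import re

# Layout assumed from the quoted unit-agent lines:
#   unit-ngi4-0[13689]: 2016-09-16 21:32:33 ERROR juju.worker.uniter.filter filter.go:137 tomb: dying
LINE_RE = re.compile(
    r"^(?P<agent>[\w-]+)\[(?P<pid>\d+)\]: "
    r"(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) "
    r"(?P<level>[A-Z]+) (?P<logger>\S+) (?P<src>\S+:\d+) (?P<msg>.*)$"
)


def find_tomb_dying(lines):
    """Return (agent, timestamp) pairs for uniter filter 'tomb: dying' errors.

    Lines that do not match the assumed layout (for example syslog
    'message repeated' summaries or machine-0 lines without a pid) are skipped.
    """
    hits = []
    for line in lines:
        m = LINE_RE.match(line.strip())
        if m is None:
            continue
        if (
            m["level"] == "ERROR"
            and m["logger"] == "juju.worker.uniter.filter"
            and m["msg"] == "tomb: dying"
        ):
            hits.append((m["agent"], m["ts"]))
    return hits


if __name__ == "__main__":
    import sys

    # Usage: pipe a log file in on stdin.
    for agent, ts in find_tomb_dying(sys.stdin):
        print(agent, ts)
```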

Barry Price (barryprice) on 2016-09-21

tags:	added: caonical-is
tags:	added: canonical-is removed: caonical-is

Anastasia (anastasia-macmood) on 2016-09-21

Changed in juju-core:
status:	Won't Fix → Triaged
importance:	Undecided → Critical

Anastasia (anastasia-macmood) wrote on 2016-09-23:

#7

According to wgrant, the description of this failure matches bug #1626304, as well as the symptoms that blr and thomi saw. This may be a duplicate \o/

Brad Marshall (brad-marshall) wrote on 2016-11-22:

#8

I've seen this on an OpenStack deployment with 1.25.6: a small subset of our agents would enter a failed agent state with the "tomb: dying" error message. Since upgrading this deployment to 1.25.8, I haven't seen it again.

Anastasia (anastasia-macmood) wrote on 2016-11-22:

#9

Based on Brad's comment, this may have been fixed in our highly anticipated next 1.25.x \o/

Changed in juju-core:
status:	Triaged → Fix Committed

Anastasia (anastasia-macmood) on 2017-01-10

Changed in juju-core:
status:	Fix Committed → Fix Released

Affects		Status	Importance	Assigned to	Milestone
	juju-core	Fix Released	Critical	Unassigned
	1.25	Fix Released	Critical	Unassigned	juju-core 1.25.8

juju-core

1.25.6 "ERROR juju.worker.uniter.filter filter.go:137 tomb: dying"