i/o timeout from mongodb

Bug #1556961 reported by Andreas Hasenack
Affects: juju-core
Status: Incomplete
Importance: Undecided
Assigned to: Unassigned
Milestone: none

Bug Description

A Landscape-driven cloud deployment failed, and we noticed this in our juju client logs:

Mar 14 04:55:55 juju-sync-1 INFO Handling failure RequestError: read tcp 10.96.15.100:37017: i/o timeout (code: '')

We didn't retry that, and filed bug #1556937 about it.

10.96.15.100 is the state server, and 37017 is mongo's port. We don't talk to mongo directly, so that was an internal juju connection.
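
For context on the error text itself: "i/o timeout" is the wording Go's net package uses when a read deadline expires before the peer responds, which is presumably what the state connection saw from a stalled or overloaded mongod on 37017. Below is a minimal, self-contained sketch, not juju or mgo code; the local listener that accepts but never replies just stands in for an unresponsive server:

package main

import (
	"fmt"
	"net"
	"time"
)

func main() {
	// Listener that accepts connections but never writes back,
	// standing in for a mongod that has stopped responding.
	ln, err := net.Listen("tcp", "127.0.0.1:0")
	if err != nil {
		panic(err)
	}
	defer ln.Close()
	go func() {
		c, err := ln.Accept()
		if err == nil {
			defer c.Close()
			time.Sleep(2 * time.Second) // hold the connection open without replying
		}
	}()

	conn, err := net.Dial("tcp", ln.Addr().String())
	if err != nil {
		panic(err)
	}
	defer conn.Close()

	// With a read deadline set, a read that gets no data before the deadline
	// fails with a net error whose text is "... i/o timeout", the same wording
	// seen in the machine-0 and client logs above.
	conn.SetReadDeadline(time.Now().Add(500 * time.Millisecond))
	buf := make([]byte, 1)
	_, err = conn.Read(buf)
	fmt.Println(err) // e.g. "read tcp 127.0.0.1:...: i/o timeout"
}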

machine-0.log ends with these three lines:
2016-03-14 04:19:51 ERROR juju.worker.firewaller firewaller.go:439 failed to lookup "machine-3-lxc-5", skipping port change
2016-03-14 04:55:55 ERROR juju.state status.go:216 failed to write status history: read tcp 10.96.15.100:37017: i/o timeout
2016-03-14 04:55:56 ERROR juju.state.leadership manager.go:72 stopping leadership manager with error: read tcp 10.96.15.100:37017: i/o timeout

After that, all other units log warnings like this one:
unit-neutron-gateway-0[10502]: 2016-03-14 04:56:03 WARNING juju.worker.dependency engine.go:304 failed to start "uniter" manifold worker: dependency not available

Of note is that all-machines.log didn't get logs from all units, just one (!). I also spotted an rsyslog restart in /var/log/syslog:
Mar 14 04:55:59 albany rsyslogd: [origin software="rsyslogd" swVersion="7.4.4" x-pid="660130" x-info="http://www.rsyslog.com"] start

/var/log/syslog got quite big (over 300MB).

I'm attaching the relevant log files from the bootstrap node. This is from a CI job, so the environment is no longer up, but I do have logs from all units if you want them (https://ci.lscape.net/job/landscape-system-tests/1362/ for our reference).

Tags: landscape
tags: added: kanban-cross-team
tags: removed: kanban-cross-team
Revision history for this message
Cheryl Jennings (cherylj) wrote :

I think this is caused by bug #1539656 (which was fixed in 1.25.4). I could verify by checking the unit logs, but I don't have access to the CI job. Can you add me?

Revision history for this message
Andreas Hasenack (ahasenack) wrote :

That tarball has the juju logs from the units. The remaining logs in the CI job are files outside of /var/log/juju.

Revision history for this message
Cheryl Jennings (cherylj) wrote :

Going through the unit files, I cannot be 100% certain that bug #1539656 is the only issue happening here, although it is certainly one of them.

I'm going to mark this as Incomplete, pending a recreate on 1.25.4+. I'll also add some additional logging in 1.25.5 that should help with debugging this type of problem.

Revision history for this message
Cheryl Jennings (cherylj) wrote :

Restarting jujud on the state server *should* help in this case.

Changed in juju-core:
status: New → Incomplete
milestone: none → 1.25.5
Curtis Hovey (sinzui)
Changed in juju-core:
milestone: 1.25.5 → 1.25.6
Curtis Hovey (sinzui)
Changed in juju-core:
milestone: 1.25.6 → 1.25.7
Changed in juju-core:
milestone: 1.25.7 → none
Revision history for this message
Anastasia (anastasia-macmood) wrote :

I believe the root cause of the i/o timeout is fixed by https://bugs.launchpad.net/juju-core/+bug/1597601
