Comment 3 for bug 1669834

Casey Marshall (cmars) wrote :

Today I received a nagios alert that seems to have been caused by nrpe being restarted while it was being polled (that's the best explanation I can come up with).

The nrpe unit agent log showed:

2017-10-23 12:05:46 ERROR juju.api monitor.go:59 health ping timed out after 30s
2017-10-23 12:05:46 ERROR juju.worker.dependency engine.go:546 "api-caller" manifold worker returned unexpected error: api connection broken unexpectedly
2017-10-23 12:06:01 ERROR juju.worker.dependency engine.go:546 "api-caller" manifold worker returned unexpected error: cannot open api: try again (try again)
2017-10-23 12:06:15 ERROR juju.worker.dependency engine.go:546 "api-caller" manifold worker returned unexpected error: cannot open api: try again (try again)
2017-10-23 12:06:47 ERROR juju.worker.dependency engine.go:546 "api-caller" manifold worker returned unexpected error: cannot open api: try again (try again)
2017-10-23 12:07:13 ERROR juju.worker.dependency engine.go:546 "api-caller" manifold worker returned unexpected error: cannot open api: unknown model: "49b0ef05-5460-4ede-8210-824563552d39" (model not found)
2017-10-23 12:07:25 ERROR juju.worker.dependency engine.go:546 "api-caller" manifold worker returned unexpected error: cannot open api: login for "unit-nrpe-103" blocked because upgrade in progress
2017-10-23 12:07:30 ERROR juju.worker.dependency engine.go:546 "api-caller" manifold worker returned unexpected error: cannot open api: login for "unit-nrpe-103" blocked because upgrade in progress
2017-10-23 12:07:36 ERROR juju.worker.dependency engine.go:546 "api-caller" manifold worker returned unexpected error: cannot open api: login for "unit-nrpe-103" blocked because upgrade in progress
2017-10-23 12:07:41 ERROR juju.worker.dependency engine.go:546 "api-caller" manifold worker returned unexpected error: cannot open api: login for "unit-nrpe-103" blocked because upgrade in progress
2017-10-23 12:07:53 ERROR juju.worker.dependency engine.go:546 "api-caller" manifold worker returned unexpected error: cannot open api: try again (try again)
2017-10-23 12:08:02 ERROR juju.worker.dependency engine.go:546 "api-caller" manifold worker returned unexpected error: cannot open api: try again (try again)
2017-10-23 12:08:11 ERROR juju.worker.dependency engine.go:546 "api-caller" manifold worker returned unexpected error: cannot open api: try again (try again)
2017-10-23 12:08:20 ERROR juju.worker.dependency engine.go:546 "api-caller" manifold worker returned unexpected error: cannot open api: try again (try again)
2017-10-23 12:08:29 ERROR juju.worker.dependency engine.go:546 "api-caller" manifold worker returned unexpected error: cannot open api: try again (try again)
2017-10-23 12:08:37 ERROR juju.worker.dependency engine.go:546 "api-caller" manifold worker returned unexpected error: cannot open api: try again (try again)
2017-10-23 12:09:05 DEBUG config-changed Hit:1 http://prodstack-zone-2.clouds.archive.ubuntu.com/ubuntu xenial InRelease
2017-10-23 12:09:05 DEBUG config-changed Get:2 http://prodstack-zone-2.clouds.archive.ubuntu.com/ubuntu xenial-updates InRelease [102 kB]
2017-10-23 12:09:05 DEBUG config-changed Get:3 http://security.ubuntu.com/ubuntu xenial-security InRelease [102 kB]
2017-10-23 12:09:05 DEBUG config-changed Ign:4 http://archive.admin.canonical.com/ubuntu xenial-cat InRelease
2017-10-23 12:09:05 DEBUG config-changed Hit:5 http://ppa.launchpad.net/telegraf-devs/ppa/ubuntu xenial InRelease
2017-10-23 12:09:05 DEBUG config-changed Get:6 http://prodstack-zone-2.clouds.archive.ubuntu.com/ubuntu xenial-backports InRelease [102 kB]
2017-10-23 12:09:05 DEBUG config-changed Hit:7 http://archive.admin.canonical.com/ubuntu xenial-cat Release
2017-10-23 12:09:07 DEBUG config-changed Fetched 306 kB in 0s (725 kB/s)
2017-10-23 12:09:08 DEBUG config-changed Reading package lists...
2017-10-23 12:09:08 INFO juju-log Installing ['nagios-nrpe-server', 'nagios-plugins-basic', 'nagios-plugins-standard', 'rsync'] with options: ['--option=Dpkg::Options::=--force-confold']
2017-10-23 12:09:08 DEBUG config-changed Reading package lists...
2017-10-23 12:09:08 DEBUG config-changed Building dependency tree...
2017-10-23 12:09:08 DEBUG config-changed Reading state information...
2017-10-23 12:09:08 DEBUG config-changed nagios-plugins-basic is already the newest version (2.1.2-2ubuntu2).
2017-10-23 12:09:08 DEBUG config-changed nagios-plugins-standard is already the newest version (2.1.2-2ubuntu2).
2017-10-23 12:09:08 DEBUG config-changed rsync is already the newest version (3.1.1-3ubuntu1).
2017-10-23 12:09:08 DEBUG config-changed nagios-nrpe-server is already the newest version (2.15-1ubuntu1.1).
2017-10-23 12:09:08 DEBUG config-changed 0 upgraded, 0 newly installed, 0 to remove and 38 not upgraded.
2017-10-23 12:09:08 INFO juju-log /usr/bin/rsync -r --executability /var/lib/juju/agents/unit-nrpe-103/charm/files/plugins /usr/local/lib/nagios/
2017-10-23 12:09:08 DEBUG config-changed inactive
2017-10-23 12:09:08 DEBUG config-changed Failed to start nrpe-install.service: Unit nrpe-install.service not found.
2017-10-23 12:09:08 DEBUG config-changed inactive
2017-10-23 12:09:08 DEBUG config-changed inactive

The machine agent log showed:

2017-10-23 12:06:05 ERROR juju.api monitor.go:59 health ping timed out after 30s
2017-10-23 12:06:05 ERROR juju.worker.dependency engine.go:546 "api-caller" manifold worker returned unexpected error: api connection broken unexpectedly
2017-10-23 12:06:05 ERROR juju.worker runner.go:381 fatal "12-container-watcher": worker "12-container-watcher" exited: connection is shut down
2017-10-23 12:06:05 ERROR juju.worker runner.go:381 fatal "stateconverter": connection is shut down
2017-10-23 12:06:16 ERROR juju.worker.dependency engine.go:546 "api-caller" manifold worker returned unexpected error: cannot open api: try again (try again)
2017-10-23 12:06:47 ERROR juju.worker.dependency engine.go:546 "api-caller" manifold worker returned unexpected error: cannot open api: try again (try again)
2017-10-23 12:07:12 ERROR juju.worker.dependency engine.go:546 "api-caller" manifold worker returned unexpected error: cannot open api: unknown model: "49b0ef05-5460-4ede-8210-824563552d39" (model not found)
2017-10-23 12:07:18 ERROR juju.worker.dependency engine.go:546 "api-caller" manifold worker returned unexpected error: cannot open api: login for "machine-12" blocked because upgrade in progress
2017-10-23 12:07:21 ERROR juju.worker.dependency engine.go:546 "api-caller" manifold worker returned unexpected error: cannot open api: login for "machine-12" blocked because upgrade in progress
2017-10-23 12:07:24 ERROR juju.worker.dependency engine.go:546 "api-caller" manifold worker returned unexpected error: cannot open api: login for "machine-12" blocked because upgrade in progress
2017-10-23 12:07:28 ERROR juju.worker.dependency engine.go:546 "api-caller" manifold worker returned unexpected error: cannot open api: login for "machine-12" blocked because upgrade in progress
2017-10-23 12:07:33 ERROR juju.worker.dependency engine.go:546 "api-caller" manifold worker returned unexpected error: cannot open api: login for "machine-12" blocked because upgrade in progress
2017-10-23 12:07:38 ERROR juju.worker.dependency engine.go:546 "api-caller" manifold worker returned unexpected error: cannot open api: login for "machine-12" blocked because upgrade in progress
2017-10-23 12:07:42 ERROR juju.worker.dependency engine.go:546 "api-caller" manifold worker returned unexpected error: cannot open api: login for "machine-12" blocked because upgrade in progress

Discussed with webops: they confirm that the controller machines were restarted around the time of the health ping timeout and API connection errors. However, they have also confirmed that there was no upgrade in progress; Juju was already at 2.2.4 on both the controllers and the agents.
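
To make the order of failures in the logs above easier to compare between the unit and machine agents, here is a minimal sketch that collapses consecutive ERROR lines into runs by error type. It is not part of Juju; the names `classify`, `summarize` and `SAMPLE` are made up for illustration, and the line format is assumed to match the excerpts pasted above. The sample lines are copied from the nrpe unit agent log.

```python
import re

# Assumed line shape, based on the excerpts above:
#   <date> <time> ERROR <logger> <file:line> <message>
LINE_RE = re.compile(
    r'^(?P<date>\d{4}-\d{2}-\d{2}) (?P<time>\d{2}:\d{2}:\d{2}) '
    r'ERROR \S+ \S+ (?P<msg>.*)$'
)


def classify(msg):
    """Map an error message to a short label (labels are hypothetical)."""
    if "health ping timed out" in msg:
        return "ping-timeout"
    if "api connection broken" in msg:
        return "connection-broken"
    if "upgrade in progress" in msg:
        return "upgrade-in-progress"
    if "model not found" in msg:
        return "model-not-found"
    if "try again" in msg:
        return "try-again"
    return "other"


def summarize(lines):
    """Collapse consecutive ERROR lines of the same type.

    Returns a list of (label, first_time, last_time, count) tuples.
    Non-ERROR lines (for example the DEBUG hook output) are skipped.
    """
    runs = []
    for line in lines:
        m = LINE_RE.match(line.strip())
        if not m:
            continue
        label = classify(m.group("msg"))
        t = m.group("time")
        if runs and runs[-1][0] == label:
            runs[-1][2] = t
            runs[-1][3] += 1
        else:
            runs.append([label, t, t, 1])
    return [tuple(r) for r in runs]


# A few lines copied verbatim from the unit agent log above.
SAMPLE = [
    '2017-10-23 12:05:46 ERROR juju.api monitor.go:59 health ping timed out after 30s',
    '2017-10-23 12:05:46 ERROR juju.worker.dependency engine.go:546 "api-caller" manifold worker returned unexpected error: api connection broken unexpectedly',
    '2017-10-23 12:06:01 ERROR juju.worker.dependency engine.go:546 "api-caller" manifold worker returned unexpected error: cannot open api: try again (try again)',
    '2017-10-23 12:06:15 ERROR juju.worker.dependency engine.go:546 "api-caller" manifold worker returned unexpected error: cannot open api: try again (try again)',
    '2017-10-23 12:07:13 ERROR juju.worker.dependency engine.go:546 "api-caller" manifold worker returned unexpected error: cannot open api: unknown model: "49b0ef05-5460-4ede-8210-824563552d39" (model not found)',
    '2017-10-23 12:07:25 ERROR juju.worker.dependency engine.go:546 "api-caller" manifold worker returned unexpected error: cannot open api: login for "unit-nrpe-103" blocked because upgrade in progress',
    '2017-10-23 12:07:30 ERROR juju.worker.dependency engine.go:546 "api-caller" manifold worker returned unexpected error: cannot open api: login for "unit-nrpe-103" blocked because upgrade in progress',
    '2017-10-23 12:07:53 ERROR juju.worker.dependency engine.go:546 "api-caller" manifold worker returned unexpected error: cannot open api: try again (try again)',
    '2017-10-23 12:09:05 DEBUG config-changed Hit:1 http://prodstack-zone-2.clouds.archive.ubuntu.com/ubuntu xenial InRelease',
]

if __name__ == "__main__":
    for label, first, last, count in summarize(SAMPLE):
        print(f"{first} .. {last}  {label:<20} x{count}")
```

Running the same function over the full unit and machine agent logs should make it easy to line up when each agent first saw "upgrade in progress" relative to the controller restart.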