Change zk logging configuration

Bug #958312 reported by Kapil Thangavelu
This bug affects 5 people
Affects | Status | Importance | Assigned to | Milestone
pyjuju | Fix Released | Critical | Kapil Thangavelu |
juju (Ubuntu) | Fix Released | Critical | Unassigned |
Precise | Fix Released | Critical | Unassigned |

Bug Description

== SRU Justification ==

=== Impact ===
This bug can cause juju's local provider to fill up the local disk very rapidly after a reboot.

=== Test Case ===
1. Create a local environment in ~/.juju/environments.yaml, something like this:

environments:
  local:
    type: local
    data-dir: /home/myuser/.juju/data
    default-series: precise

2. juju bootstrap
3. reboot the machine
4. check /home/myuser/.juju/data/myuser-local/machine-agent.log; if it is streaming errors, the system is affected.
5. juju destroy-environment
6. update the package
7. repeat steps 1 - 4. If the logs are no longer streaming, the bug is fixed.
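
Step 4 can be checked by hand with tail, but a small script makes the before/after comparison less subjective. The following is only a sketch: the helper name, the 5 second interval and the choice of "ZOO_ERROR" as the marker are assumptions (the marker is taken from the libzookeeper lines quoted later in this report), not part of the packaged fix.

```python
import os
import time


def is_streaming_errors(log_path, interval=5.0, marker=b"ZOO_ERROR", sleep=time.sleep):
    """Return True if data containing `marker` was appended during `interval`.

    Hypothetical helper for the test case; `sleep` is injectable so the
    check can be exercised without actually waiting.
    """
    start = os.path.getsize(log_path)  # log size before waiting
    sleep(interval)                    # give the agent time to write more
    with open(log_path, "rb") as fh:
        fh.seek(start)                 # read only what was appended meanwhile
        appended = fh.read()
    return marker in appended
```

Calling is_streaming_errors("/home/myuser/.juju/data/myuser-local/machine-agent.log") after the reboot should return True on an affected system and False once the updated package is installed.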

=== Dev Fix ===

This is fixed in upstream commit 533 by suppressing the logs from libzookeeper.

=== Regression Potential ===
Very low. This simply suppresses the logs that stream out of libzookeeper, which are not useful in most cases. It does mean that in the rare cases where client error logs would be useful, they are suppressed.

== Original Bug Description ==

The verbosity and frequency of the zookeeper log is considerable; use a pipe and filter on level and rate. Lacking this, it's possible for the log to fill up the disks on the machine for a disconnected agent. This happened to me on the local provider, ugh. Just to be clear, this is about the client connection logging, not the server.
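
For illustration, the "pipe and filter on level and rate" idea could look roughly like the sketch below. This is not what upstream commit 533 does (that commit suppresses the libzookeeper logs). The class name, defaults and window-based rate limit are all assumptions. ZOO_ERROR and ZOO_WARN appear in the log lines quoted in this report; the ZOO_INFO and ZOO_DEBUG names are assumed.

```python
import re
import time

# Severity order for libzookeeper client log lines.
# ZOO_INFO and ZOO_DEBUG are assumed names; only ERROR/WARN appear in this report.
LEVELS = {"ZOO_DEBUG": 0, "ZOO_INFO": 1, "ZOO_WARN": 2, "ZOO_ERROR": 3}
LEVEL_RE = re.compile(r":(ZOO_[A-Z]+)@")


class LogFilter:
    """Hypothetical filter: drop low-severity lines, cap the rest per time window."""

    def __init__(self, min_level="ZOO_ERROR", max_per_window=5, window=60.0,
                 clock=time.monotonic):
        self.min_level = LEVELS[min_level]
        self.max_per_window = max_per_window
        self.window = window
        self.clock = clock
        self.window_start = clock()
        self.count = 0

    def accept(self, line):
        match = LEVEL_RE.search(line)
        # Lines without a recognisable level are treated as highest severity.
        level = LEVELS.get(match.group(1), 3) if match else 3
        if level < self.min_level:
            return False
        now = self.clock()
        if now - self.window_start >= self.window:
            # A new window begins: reset the counter.
            self.window_start = now
            self.count = 0
        self.count += 1
        return self.count <= self.max_per_window


def filter_stream(src, dst, log_filter):
    """Copy lines from src to dst, keeping only those the filter accepts."""
    for line in src:
        if log_filter.accept(line):
            dst.write(line)
```

With the read end of a pipe as src and the agent log as dst, a disconnected client would write at most max_per_window lines per window instead of one line every few seconds indefinitely.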

description: updated
Changed in juju:
milestone: none → florence
importance: Undecided → High
description: updated
Kapil Thangavelu (hazmat) wrote :

The switch to a managed zk client mitigates this significantly, as typically it was the error stream from an expired session that caused the problem.

Changed in juju:
importance: High → Medium
milestone: florence → galapagos
Changed in juju:
assignee: nobody → Kapil Thangavelu (hazmat)
status: New → In Progress
Changed in juju:
status: In Progress → Fix Released
Clint Byrum (clint-fewbar) wrote :

Kapil, did you forget to push your branch?

Changed in juju:
importance: Medium → Critical
Clint Byrum (clint-fewbar) wrote :

Might be worth landing in precise before the full galapagos release.

Changed in juju (Ubuntu Precise):
status: New → Triaged
Changed in juju (Ubuntu):
status: New → Triaged
importance: Undecided → Critical
Changed in juju (Ubuntu Precise):
importance: Undecided → High
importance: High → Critical
description: updated
Changed in juju (Ubuntu):
status: Triaged → Fix Released
Chris Halse Rogers (raof) wrote : Please test proposed package

Hello Kapil, or anyone else affected,

Accepted juju into precise-proposed. The package will build now and be available in a few hours. Please test and give feedback here. See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Thank you in advance!

Changed in juju (Ubuntu Precise):
status: Triaged → Fix Committed
tags: added: verification-needed
James Page (james-page) wrote :

Without -proposed:

2012-06-26 07:26:14,259:855(0x7f4104cef700):ZOO_ERROR@handle_socket_error_msg@1579: Socket [192.168.122.1:54725] zk retcode=-4, errno=111(Connection refused): server refused to accept the client
2012-06-26 07:26:17,690:855(0x7f4104cef700):ZOO_WARN@zookeeper_interest@1461: Exceeded deadline by 97ms
2012-06-26 07:26:17,690:855(0x7f4104cef700):ZOO_ERROR@handle_socket_error_msg@1579: Socket [192.168.122.1:54725] zk retcode=-4, errno=111(Connection refused): server refused to accept the client
2012-06-26 07:26:21,027:855(0x7f4104cef700):ZOO_ERROR@handle_socket_error_msg@1579: Socket [192.168.122.1:54725] zk retcode=-4, errno=111(Connection refused): server refused to accept the client

with -proposed:

2012-06-26 07:26:52,349: juju.agents.machine@INFO: Machine agent started id:0
2012-06-26 07:27:42,103: twisted@ERROR: Unhandled error in Deferred:
2012-06-26 07:27:42,104: twisted@ERROR: Unhandled Error
Traceback (most recent call last):
Failure: txzookeeper.client.ConnectionTimeoutException: could not connect before timeout

Looks good to me

tags: added: verification-done
removed: verification-needed
James Page (james-page)
Changed in juju (Ubuntu Precise):
milestone: none → ubuntu-12.04.1
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package juju - 0.5+bzr531-0ubuntu1.2

---------------
juju (0.5+bzr531-0ubuntu1.2) precise-proposed; urgency=low

  * d/p/upstream-541-542.patch: fix shell script to make proposed work
    for local provider. (LP: #1007657)

juju (0.5+bzr531-0ubuntu1.1) precise-proposed; urgency=low

  * d/p/upstream-532.patch: use proper whitespace separation in
    relation-ids. (LP: #988065)
  * d/p/upstream-533.patch: workaround to suppress Zookeeper logging
    which goes out of control in local provider on Reboot.(LP: #958312)
  * d/p/upstream-534.patch: fix to stop juju-log and relation-set commands
    outputting "{}" to stdout, which can break some programs and makes it
    difficult to use them in a debug-hooks session. (LP: #915506)
  * d/p/upstream-537.patch: make HTTPS apt sources usable inside local
    provider. (LP: #993034)
  * d/p/upstream-538.patch: Add support to install juju from proposed
    pocket to test proposed SRU's. (LP: #926550)
 -- Clint Byrum <email address hidden> Mon, 18 Jun 2012 14:00:56 -0700

Changed in juju (Ubuntu Precise):
status: Fix Committed → Fix Released