debug-log showing old messages
Affects | Status | Importance | Assigned to | Milestone | |
---|---|---|---|---|---|
pyjuju |
Triaged
|
Low
|
Unassigned |
Bug Description
After running a charm which accidentally used up all the disk space on a virtual instance the debug-log was spammed with 'No space left on device' error messages.
Four hours have gone by and the debug-log is still reporting these errors reducing the usefulness of this log file. Unfortunately, errors will happen and juju should deal with this instances spamming the debug-log more gracefully.
What I did:
I deployed a Ceph charm on an Openstack m1.tiny instance which then tried to provision an example mon daemon in the instance's /tmp directory. Once I found out what the problem was, I destroyed the service and terminated the machine.
What I expect to see:
Recent debug log messages
What I actually see:
Debug log messages from four hours ago.
Example output (notice the timestamps):
agy@agy-laptop:~$ juju debug-log
2012-09-17 20:22:44,871 WARNING S3 API calls not using secure transport
2012-09-17 20:22:44,871 INFO Connecting to environment...
2012-09-17 20:22:48,569 INFO Connected to environment.
2012-09-17 20:22:48,570 INFO Enabling distributed debug log.
2012-09-17 20:22:48,663 INFO Tailing logs - Ctrl-C to stop.
2012-09-17 16:13:38,221 unit:ceph/1: twisted ERROR: Logged from file log.py, line 532
2012-09-17 16:13:38,211 unit:ceph/1: twisted ERROR: Logged from file log.py, line 532
2012-09-17 16:13:38,143 unit:ceph/1: twisted ERROR: self.stream.flush()
2012-09-17 16:13:38,233 unit:ceph/1: twisted ERROR: Logged from file log.py, line 532
2012-09-17 16:13:38,223 unit:ceph/1: twisted ERROR: Traceback (most recent call last):
2012-09-17 16:13:38,235 unit:ceph/1: twisted ERROR: File "/usr/lib/
2012-09-17 16:13:38,246 unit:ceph/1: twisted ERROR: self.flush()
2012-09-17 16:13:38,258 unit:ceph/1: twisted ERROR: File "/usr/lib/
2012-09-17 16:13:38,269 unit:ceph/1: twisted ERROR: self.stream.flush()
2012-09-17 16:13:38,290 unit:ceph/1: twisted ERROR: Logged from file log.py, line 532
2012-09-17 16:13:38,280 unit:ceph/1: twisted ERROR: IOError: [Errno 28] No space left on device
2012-09-17 16:13:38,292 unit:ceph/1: twisted ERROR: Traceback (most recent call last):
2012-09-17 16:13:38,300 unit:ceph/1: twisted ERROR: IOError: [Errno 28] No space left on device
2012-09-17 16:13:38,302 unit:ceph/1: twisted ERROR: Logged from file log.py, line 532
2012-09-17 16:13:38,292 unit:ceph/1: twisted ERROR: Logged from file log.py, line 532
[...]
description: | updated |
Changed in juju: | |
importance: | Undecided → High |
Changed in juju: | |
milestone: | none → 0.8 |
Changed in juju: | |
importance: | High → Low |
This happens because the way debug-log works it is just a buffer with a 'last-read' pointer maintained by the client.. so until the client gets through them all, they'll keep spamming.
Beacuse of the distributed nature of juju, I think this approach will not work at all. A more real-time approach where agents just send logs to a log-rebroadcaster daemon which keeps a fixed ring buffer which clients can follow/tail makes a lot more sense.