console.log grows indefinitely

Bug #832507 reported by Michael Chapman on 2011-08-24
This bug affects 25 people
Affects                        Importance   Assigned to
OpenStack Compute (nova)       High         Markus Zoeller (markus_z)
OpenStack Security Advisory    Undecided    Unassigned
libvirt (Ubuntu)               High         Jamie Strandboge
nova (Ubuntu)                  High         Robie Basak
qemu-kvm (Ubuntu)              High         Tony Breeds

Bug Description

KVM takes everything from stdout and prints it to console.log. This does not appear to have a size limit, so if a user (mistakenly or otherwise) sends a lot of data to stdout, the console.log file can fill the entire disk of the compute node quite quickly.

Thierry Carrez (ttx) on 2011-08-24
Changed in nova:
importance: Undecided → Low
status: New → Confirmed
security vulnerability: no → yes
Dave Walker (davewalker) wrote :

It seems to me we need to use a ring buffer for the log rather than a standard file. I don't think the hypervisors support this natively, meaning it probably needs to be an opt-in option.

Dave Walker (davewalker) on 2011-08-29
Changed in nova (Ubuntu):
importance: Undecided → High
status: New → Confirmed
Dave Walker (davewalker) wrote :

* Using named pipes / fifo would seem to re-introduce the same issue differently.
* Logging to unix socket would mean there needs to be a listener within nova to suck in the log to file, but means it's handled in userspace (good).
* Using a non-standard kernel module called emlog is interesting, but would need to be opt-in due to its non-mainline nature.

tags: added: server-o-rs
Thierry Carrez (ttx) on 2011-09-01
Changed in nova (Ubuntu):
assignee: nobody → Dave Walker (davewalker)
status: Confirmed → In Progress
Changed in nova:
assignee: nobody → Dave Walker (davewalker)
status: Confirmed → In Progress
Dave Walker (davewalker) on 2011-09-03
Changed in nova (Ubuntu):
milestone: none → ubuntu-11.10-beta-2
Soren Hansen (soren) wrote :

It turns out that kvm gracefully handles it if a listener on a named pipe close()s its connection and opens it again (and buffers whatever output would have been read in the meantime). This should make this a much simpler fix.

Robie Basak (racb) wrote :

The plan is:

1) Create a FIFO, open it persistently and use this as the console log destination
2) Write a handler that will write to a ring buffer on disk
3) Periodically read data out of the FIFO and give it to the handler
4) Also do step 3 before processing get_console_output
5) get_console_output now needs to read through the ring buffer implementation
6) Reopen the FIFO when Nova is restarted
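
As a rough illustration of steps 1, 3 and 4 of this plan (not the code from the attached patch), the FIFO side could look something like the sketch below. The names open_console_fifo and drain_fifo, the 4096-byte chunk size and the handler callback are all assumptions made for the example; the handler stands in for whatever writes to the ring buffer.

```python
import os


def open_console_fifo(path):
    """Create the FIFO if needed and open it for non-blocking reads."""
    if not os.path.exists(path):
        os.mkfifo(path, 0o600)
    # O_NONBLOCK lets the open succeed even when no writer is attached yet.
    return os.open(path, os.O_RDONLY | os.O_NONBLOCK)


def drain_fifo(fd, handler, chunk_size=4096):
    """Read whatever is currently queued in the FIFO and pass it to handler.

    Returns the number of bytes handed over. Intended to be called
    periodically, and once more just before serving get_console_output.
    """
    total = 0
    while True:
        try:
            data = os.read(fd, chunk_size)
        except BlockingIOError:
            break  # a writer is attached but nothing is queued right now
        if not data:
            break  # no writer currently has the FIFO open
        handler(data)
        total += len(data)
    return total
```

One caveat with this approach: if nothing calls drain_fifo for a while, the kernel's pipe buffer fills up and the writer blocks until the reader catches up.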

tags: added: rls-mgr-o-tracking
Dave Walker (davewalker) on 2011-09-23
Changed in nova (Ubuntu):
milestone: ubuntu-11.10-beta-2 → ubuntu-11.10
Robie Basak (racb) wrote :

What I have so far: everything except point 6 in the plan above.

Problems:

1) libvirt creates an AppArmor profile for console.fifo and not console.fifo.{in,out}.

Other things to check:

1) Console logging gets resumed correctly on restart.
2) Correct management of live migration.
3) Correct management of rescue mode.

Patch-in-progress attached, along with AppArmor details to get the dynamic profile fixed. I can't see how to attach multiple files so I hope a tarball is OK.

Changed in nova (Ubuntu):
assignee: Dave Walker (davewalker) → Jamie Strandboge (jdstrand)
tags: added: apparmor
Changed in nova (Ubuntu):
assignee: Jamie Strandboge (jdstrand) → Dave Walker (davewalker)
Changed in libvirt (Ubuntu):
status: New → Triaged
importance: Undecided → High
assignee: nobody → Jamie Strandboge (jdstrand)
milestone: none → ubuntu-11.10
Dave Walker (davewalker) on 2011-09-23
Changed in nova (Ubuntu):
assignee: Dave Walker (davewalker) → Robie Basak (racb)
Changed in nova:
assignee: Dave Walker (davewalker) → nobody
Dave Walker (davewalker) on 2011-09-26
Changed in nova:
assignee: nobody → Dave Walker (davewalker)
Thierry Carrez (ttx) on 2011-09-26
Changed in nova:
assignee: Dave Walker (davewalker) → Robie Basak (racb)
Changed in libvirt (Ubuntu):
status: Triaged → In Progress
Jamie Strandboge (jdstrand) wrote :

I now have an upstreamable patch that I will be uploading to Ubuntu shortly. It adds tests to the build suite and passes QRT (with the newly ..._console_pipe() test):
Description: fix AppArmor driver for pipe character devices
 The AppArmor security driver adds only the path specified in the domain XML
 for character devices of type 'pipe'. It should be using <path>.in and
 <path>.out. We do this by creating a new vah_add_file_chardev() and use
 it for char devices instead of vah_add_file(). Also adjust valid_path() to
 accept S_FIFO (since qemu chardevs of type 'pipe' use fifos).
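
To illustrate the naming this patch deals with: for a character device of type 'pipe' configured with <path>, the FIFOs actually used are <path>.in and <path>.out. A minimal sketch of creating that pair follows; the helper name create_pipe_chardev_fifos and the 0600 mode are assumptions for the example, not anything taken from libvirt or nova.

```python
import os


def create_pipe_chardev_fifos(base_path):
    """Create <base_path>.in and <base_path>.out and return both paths."""
    paths = (base_path + ".in", base_path + ".out")
    for p in paths:
        if not os.path.exists(p):
            os.mkfifo(p, 0o600)
    return paths
```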

Changed in libvirt (Ubuntu):
status: In Progress → Fix Committed
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package libvirt - 0.9.2-4ubuntu14

---------------
libvirt (0.9.2-4ubuntu14) oneiric; urgency=low

  * debian/patches/lp832507.patch: update virt-aa-helper to use the correct
    paths for character devices that are pipes. This can be removed in
    0.9.7. (LP: #832507)
 -- Jamie Strandboge <email address hidden> Tue, 27 Sep 2011 13:18:28 -0500

Changed in libvirt (Ubuntu):
status: Fix Committed → Fix Released
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package nova - 2011.3-0ubuntu4

---------------
nova (2011.3-0ubuntu4) oneiric; urgency=low

  [James Page]
  * debian/nova-common.postinst:
    - Exclude mounted LXC rootfs filesystems within /var/lib/nova from
      user/group ownership changes (LP: #861260).
    - Ensure that primary group for 'nova' user is 'nova' so that files
      created by this user have the correct group ownership.

  [Adam Gandelman]
  * debian/nova-common.postinst: Restrict permissions of /var/log/nova
    (LP: #862816)

  [Ante Karamatic]
  * Add /usr/sbin/ietadm to sudoers (LP: #861547)
  * debian/control: Fix typo in Vcs-Bzr

  [Chuck Short]
  * debian/patches/backport-libvirt-console-pipe.patch:
    Move console.log to a ringbuffer so that the console.log
    no longer keeps filling up. (LP: #832507)
  * debian/patches/backport-lxc-container-console-fix.patch:
    Make euca-get-console-output usable for LXC containers.
    (LP: #832159)
  * debian/patches/backport-snapshot-cleanup.patch:
    Enforce snapshot cleanup. (LP: #861582).
  * debian/patches/fix-lp863305-images-permission.patch:
    Fix image access control. (LP: #863305)
 -- Chuck Short <email address hidden> Fri, 30 Sep 2011 15:21:56 -0400

Changed in nova (Ubuntu):
status: In Progress → Fix Released
Thierry Carrez (ttx) wrote :

The review has gone stale... What's the status on this? You have the fix in Ubuntu but it was refused upstream? Or is that a different fix that you have in Ubuntu?

Thierry Carrez (ttx) on 2012-01-19
Changed in nova:
importance: Low → Medium
Robie Basak (racb) wrote :

Fixed in Ubuntu, refused upstream.

Changed in nova:
assignee: Robie Basak (racb) → nobody
status: In Progress → Confirmed
Vish Ishaya (vishvananda) wrote :

review was here:

https://review.openstack.org/#change,706

The last request was whether the ringbuffer handling could be done by a separate helper binary, so that nova-compute could go down for a bit without locking up instances.

Fix proposed to branch: master
Review: https://review.openstack.org/4932

Changed in nova:
assignee: nobody → Chuck Short (zulcss)
status: Confirmed → In Progress
Changed in qemu-kvm (Ubuntu):
status: New → Confirmed
importance: Undecided → High

Fix proposed to branch: master
Review: https://review.openstack.org/5873

Fix proposed to branch: master
Review: https://review.openstack.org/5964

Serge Hallyn (serge-hallyn) wrote :

Marking invalid in qemu-kvm assuming there is nothing to do there. Please switch back and rebuke me if I misunderstood.

Changed in qemu-kvm (Ubuntu):
status: Confirmed → Invalid
Scott Moser (smoser) wrote :

Serge,
  The reason for the qemu-kvm task is that we think qemu-kvm is really the ultimate right place to add a '-serial ringbuffer:640k,file=/path/to/file' flag.
  All the other attempts are more hacky, but if upstream kvm had this , libvirt could expose it, and openstack could use it.
  I do not know whether or not it would be accepted upstream.

Changed in qemu-kvm (Ubuntu):
status: Invalid → Triaged

> Serge,
> The reason for the qemu-kvm task is that we think qemu-kvm is really the ultimate right place to add a '-serial ringbuffer:640k,file=/path/to/file' flag.
> All the other attempts are more hacky, but if upstream kvm had this , libvirt could expose it, and openstack could use it.
> I do not know whether or not it would be accepted upstream.

Thanks, Scott.

It seems a reasonable thing for upstream to accept. We have workarounds for
precise, right? (If not, then I should be giving this a shot right now)
If so, I'll take a stab at this after release.

Robie Basak (racb) wrote :

I think this would need a libringbuffer, as nova would need to read the same file in get_console_output. When I first looked at this in September, I could not find such a thing, or any accepted on-disk format. The python code in my patch could be a good starting point, but there are a couple of things I think need to be added. First, a magic to identify the file; and second, some kind of mutex or locking mechanism to cover the head and tail pointers and prevent slow readers from reading past the join.

Scott Moser (smoser) wrote :

On Wed, 11 Apr 2012, Serge Hallyn wrote:

> > Serge,
> > The reason for the qemu-kvm task is that we think qemu-kvm is really the ultimate right place to add a '-serial ringbuffer:640k,file=/path/to/file' flag.
> > All the other attempts are more hacky, but if upstream kvm had this , libvirt could expose it, and openstack could use it.
> > I do not know whether or not it would be accepted upstream.
>
> Thanks, Scott.
>
> It seems a reasonable thing for upstream to accept. We have workarounds for
> precise, right? (If not, then I should be giving this a shot right now)
> If so, I'll take a stab at this after release.

We don't really have a reasonable workaround.
In 11.10 we had one that seemed to work.
In 12.04 that solution ate 100% CPU (I think), so we backed it out, I think.

A "real" solution would be nice.

Scott

Thierry Carrez (ttx) wrote :

I would really like us to find a "real" (and upstreamable) solution for this, but I lack the KVM/libvirt expertise to make it happen.
Subscribing Daniel Berrange to see if he has another idea.

Daniel Berrange (berrange) wrote :

> > The reason for the qemu-kvm task is that we think qemu-kvm is really the ultimate right place to add a '-serial ringbuffer:640k,file=/path/to/file' flag.
> > All the other attempts are more hacky, but if upstream kvm had this , libvirt could expose it, and openstack could use it.
> > I do not know whether or not it would be accepted upstream.

This is an interesting idea & worth proposing to QEMU upstream to see what their feelings are on this - with this kind of concept, their reaction can be quite unpredictable, so I can't say more than a 50/50 chance they'll go for it. The reason I think they might not go for it, is that it implements just one out of many potential different policies. eg, a viable alternative would be to rotate log files periodically instead of using a ring buffer.

If KVM doesn't care to do this, from a libvirt POV, I have long imagined the need for a "libvirt_vmlogd" daemon which would run independently of libvirtd or QEMU. The QEMU guests would be configured with either a PTY or more likely a UNIX socket (eg '-serial unix:/var/lib/libvirt/qemu/serial0.socket'). The libvirt_vmlogd would automatically connect to the sockets as each guest was launched, and log the data according to some policy it is configured with, and handle log rotation / expiration etc.
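
To make the idea a bit more concrete, here is a rough sketch of what the per-guest core of such a daemon might do, assuming a simple size-based rotation policy. Nothing here is existing libvirt code: the function names, the 64 KiB limit and the number of backups are all made up for illustration, and the caller is assumed to have already connected an AF_UNIX socket to the guest's serial socket.

```python
import os


def rotate(log_path, backups):
    """Shift log_path -> log_path.1 -> ..., keeping at most `backups` old files."""
    for i in range(backups - 1, 0, -1):
        src = "%s.%d" % (log_path, i)
        if os.path.exists(src):
            os.replace(src, "%s.%d" % (log_path, i + 1))
    if backups > 0:
        os.replace(log_path, log_path + ".1")
    else:
        os.remove(log_path)


def log_socket_output(sock, log_path, max_bytes=65536, backups=2):
    """Copy data from a connected socket to log_path until EOF,
    rotating whenever the current file would grow past max_bytes."""
    size = os.path.getsize(log_path) if os.path.exists(log_path) else 0
    out = open(log_path, "ab")
    try:
        while True:
            data = sock.recv(4096)
            if not data:
                break  # the guest side closed the connection
            if size and size + len(data) > max_bytes:
                out.close()
                rotate(log_path, backups)
                out = open(log_path, "ab")
                size = 0
            out.write(data)
            size += len(data)
    finally:
        out.close()
```

With this policy the disk usage per guest stays at roughly (backups + 1) * max_bytes, no matter how much the guest writes.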

For the sake of the Nova security issue, I think it'd be wise to implement a fix in Nova regardless, since both the upstream approaches could take some time.

I don't understand what the file argument would mean. Once you write() to a file, QEMU no longer can implement a ring buffer (it cannot discard written data). You would need to do something like the following:

qemu -chardev memchr,max-capacity=640k,id=foo -serial chardev:foo

And then introduce QMP commands like:

{ 'command': 'memchr-read', 'arguments': { 'chardev': 'str', 'size': 'int' }, 'returns': 'str' }

We already have a memory character device that we don't expose externally yet. It's just a matter of implementing a ring queue behavior and plumbing things up. I think it's entirely reasonable and perhaps would even be something that libvirt would prefer to use in the long term over ptys.

Would also be good for unit testing.

Robie Basak (racb) wrote :

Anthony,

The file would be a disk-based ringbuffer. There would need to be a well-known disk-based ringbuffer format, which currently doesn't exist. Perhaps a "libringbuffer" to encapsulate it. The format would need head and tail indexes and then the data, together with some thought for concurrent access (eg. a mutex which would require mmapping to use).

I imagine a libringbuffer which would expose methods to open a disk-based ringbuffer file, add data to the end of the buffer and read data out of it.
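
A very small sketch of what such a file format and API might look like is below. To be clear, this is not an existing format or library: the "RBUF" magic, the header layout and the DiskRingBuffer class are invented for the example. It also simplifies the head and tail indexes into a single running count of bytes written (both can be derived from it), and it does no locking at all, so it does not address concurrent access.

```python
import os
import struct

MAGIC = b"RBUF"                  # hypothetical file identifier
HEADER = struct.Struct("<4sQQ")  # magic, capacity, total bytes ever written


class DiskRingBuffer:
    def __init__(self, path, capacity=65536):
        # For an existing file, the capacity stored in its header wins.
        if not os.path.exists(path):
            with open(path, "wb") as f:
                f.write(HEADER.pack(MAGIC, capacity, 0))
                f.write(b"\0" * capacity)
        self._f = open(path, "r+b")
        self._load_header()

    def _load_header(self):
        self._f.seek(0)
        magic, self.capacity, self.written = HEADER.unpack(
            self._f.read(HEADER.size))
        if magic != MAGIC:
            raise ValueError("not a ring buffer file")

    def append(self, data):
        total = len(data)
        data = data[-self.capacity:]  # only the newest bytes can survive
        pos = (self.written + total - len(data)) % self.capacity
        first = min(len(data), self.capacity - pos)
        self._f.seek(HEADER.size + pos)
        self._f.write(data[:first])
        if first < len(data):         # wrap around to the start of the data area
            self._f.seek(HEADER.size)
            self._f.write(data[first:])
        self.written += total
        self._f.seek(0)
        self._f.write(HEADER.pack(MAGIC, self.capacity, self.written))
        self._f.flush()

    def read(self):
        """Return the buffered bytes, oldest first."""
        self._load_header()
        self._f.seek(HEADER.size)
        data = self._f.read(self.capacity)
        if self.written <= self.capacity:
            return data[:self.written]
        tail = self.written % self.capacity
        return data[tail:] + data[:tail]

    def close(self):
        self._f.close()
```

The file never grows beyond the header plus the fixed capacity, which is the property that matters for this bug.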

Daniel Berrange (berrange) wrote :

Having examined the idea of the libvirt_consoled a bit more, I think it is not actually required. It is possible to get good support for console logging, max bounded size, rollover, & secure remote access, simply by dropping in the standard 'conserver' daemon with a suitable configuration file. There'd be no need for any new features in either libvirt or QEMU for this to work. All nova would need to do would be to update the conserver.cf file whenever a VM is started or stopped. Reusing existing mature projects like conserver is preferable to reinventing the wheel with our own half-baked solutions.

Robie Basak (racb) wrote :

conserver is in Debian non-free, and thus unsuitable.

conserver would still need to receive the log output from qemu via a FIFO or similar, and this introduces the problem of what qemu should do when it is blocked on writing to conserver, which is where I think my previous patch failed (and I did warn about this in advance!). And conserver only supports log rotation, rather than a ringbuffer. This would make get_console_output useless if it was requested just after the log rotated, without extra complex glue. The code to work around these issues would be more complex than just not using conserver, IMHO.

In any case, conserver is massive overkill for the need here, which is to have a simple ringbuffered console log.

Daniel Berrange (berrange) wrote :

IMHO having fixed size rotated logs per VM, with a max number of files, is a better solution than a ringbuffer. It really doesn't complicate the code that much to have to potentially just read a few lines from a second rotated logfile.
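
For illustration, reading the most recent output across rotated files could be as small as the sketch below. It assumes a logrotate-style naming scheme (console.log, console.log.1, console.log.2, ...); that scheme and the function names are assumptions for the example, not something nova does today.

```python
import os


def read_tail(path, nbytes):
    """Return up to the last nbytes of a file."""
    with open(path, "rb") as f:
        f.seek(0, os.SEEK_END)
        size = f.tell()
        f.seek(max(0, size - nbytes))
        return f.read()


def read_recent_console(log_path, nbytes, max_backups=2):
    """Return up to nbytes of the newest output, reaching back into
    log_path.1, log_path.2, ... only when the live file is too short."""
    candidates = [log_path] + [
        "%s.%d" % (log_path, i) for i in range(1, max_backups + 1)]
    parts = []
    remaining = nbytes
    for path in candidates:  # newest file first
        if remaining <= 0:
            break
        if not os.path.exists(path):
            continue
        chunk = read_tail(path, remaining)
        parts.append(chunk)
        remaining -= len(chunk)
    # Chunks were collected newest-first; put them back in time order.
    return b"".join(reversed(parts))
```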

While I agree that conserver is overkill if satisfying the requirements of the get_console_output() API contract is all that's required, I am thinking of the bigger picture, improving the console functionality available for the libvirt Nova driver in general.

Sam Morrison (sorrison) wrote :

We recently had a console log grow to 5.5GB.

When a user tries to get the console via the API, nova packs this into a message and sends it off to rabbitmq.
For us this completely killed our rabbitmq cluster. The user kept trying to get the console, knocking out a rabbitmq server each time (each rabbit has 4GB of RAM).

It seems to me that nova could do something here, like not trying to send a 5.5GB message through rabbit.

Michael Still (mikal) wrote :

It certainly seems like we should only send the last N lines of the console to the user (although that might be computationally expensive to generate on such a large file). That's a separate bug though I suspect. I've filed bug 1081436 for that.
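
For what it's worth, getting the last N lines does not have to mean reading the whole file. A sketch that reads backwards from the end in blocks is below; tail_lines and the 8192-byte block size are just names and values picked for the example.

```python
import os


def tail_lines(path, num_lines, block_size=8192):
    """Return the last num_lines lines of a file as bytes, reading
    backwards from the end in blocks instead of scanning the whole file."""
    if num_lines <= 0:
        return b""
    with open(path, "rb") as f:
        f.seek(0, os.SEEK_END)
        pos = f.tell()
        data = b""
        # num_lines + 1 newlines guarantee the first kept line is complete.
        while pos > 0 and data.count(b"\n") <= num_lines:
            step = min(block_size, pos)
            pos -= step
            f.seek(pos)
            data = f.read(step) + data
    lines = data.splitlines(keepends=True)
    return b"".join(lines[-num_lines:])
```

This only helps when the lines are reasonably short. A guest that writes gigabytes without a newline would still need a byte cap on top of the line count.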

Thierry Carrez (ttx) wrote :

A more permanent solution needs to be discussed for this. Mikal wants to have a session about it at the next Summit.

Changed in nova:
assignee: Chuck Short (zulcss) → nobody
importance: Medium → High
status: In Progress → Confirmed

Can we please move this to wish list?

Joshua Harlow (harlowja) wrote :

Just to brainstorm:

Are any of the following possible??

1. Correctly have libvirtd configure and manage the console log file size. Some new XML configuration for the domain.xml format could be provided to alter the behavior? How much of the libvirt code would have to change for this?
2. If #1 isn't possible, then I presume we could use libvirt's 'named pipe' console capability and attach our own daemon to do the same as #1 (where we would use said daemon to read on said pipe and restrict the output file's size). Of course, how does this handle restarts or daemon failures? Basically this daemon would be the ring buffer 'maintainer' while libvirt would just feed it info.
3. Use a pseudo-tty to do something similar to #2?

@Michael, I'd be interested to hear what you think. Personally I think the libvirtd project is where this belongs (via #1) since it already has a daemon (libvirtd) and knowledge about which instances are active, console config, and such. I assume this hasn't been fixed in libvirt (or filed as a feature request). I'm not an expert on that code-base, but if I had pointers perhaps that is the correct way to go about this (or it's a dual approach).

Yaguang Tang (heut2008) wrote :

Should we first fix it in nova, before kvm and libvirt have a better fix for this?

Michael Still (mikal) on 2013-07-29
Changed in nova:
assignee: nobody → Michael Still (mikalstill)

Fix proposed to branch: master
Review: https://review.openstack.org/39048

Changed in nova:
status: Confirmed → In Progress
Thierry Carrez (ttx) on 2013-09-05
Changed in nova:
milestone: none → havana-rc1
Michael Still (mikal) on 2013-09-20
Changed in nova:
status: In Progress → Triaged
assignee: Michael Still (mikalstill) → nobody
Changed in nova:
milestone: havana-rc1 → icehouse-1

Fix proposed to branch: master
Review: https://review.openstack.org/47634

Changed in nova:
assignee: nobody → Michael Still (mikalstill)
status: Triaged → In Progress
Michael Still (mikal) on 2013-09-20
Changed in nova:
assignee: Michael Still (mikalstill) → nobody
status: In Progress → Triaged
Thierry Carrez (ttx) wrote : target

 affects nova
 milestone icehouse-2

Changed in nova:
milestone: icehouse-1 → icehouse-2
Changed in nova:
milestone: icehouse-2 → icehouse-3
Changed in nova:
milestone: icehouse-3 → none
Thierry Carrez (ttx) wrote :

Putting back in OSSA scope so that we discuss what to do with this

Changed in ossa:
status: New → Incomplete
Robert Clark (robert-clark) wrote :

I've just read through this thread with a view to whether we should release a related OSSN.

From what I can tell, no fix was ever agreed on. Is this likely to change?

Dave Walker (davewalker) wrote :

@robert-clark, the fixes to date have been band-aids. This is currently targeted to be fixed properly in Juno with https://blueprints.launchpad.net/nova/+spec/fix-libvirt-console-logging

Thierry Carrez (ttx) wrote :

@Rob: if we can document relatively-efficient workarounds, yes, that would make a good OSSN. The "fix" has been delayed for quite some releases now, so I'd not hold my breath for juno :)

hzxiongwenwu (xwwzzy) on 2014-06-18
Changed in ossa:
assignee: nobody → hzxiongwenwu (xwwzzy)
John Haller (john-haller) wrote :

See the following blueprint, the associated code has passed the gate for Juno:
https://blueprints.launchpad.net/nova/+spec/serial-ports

This only addresses KVM hosts, which support console access via SPICE, which is the solution adopted in the above blueprint.

@John Haller: Great news!

Do you think the associated code could be proposed as a backport for Havana and Icehouse too?

Yaguang Tang (heut2008) wrote :

I think the blueprint https://blueprints.launchpad.net/nova/+spec/serial-ports is just a workaround for this bug, and currently we have no way to disable console.log.

Kaya LIU (kayaliu) on 2014-11-07
tags: added: cts openstack
Tom Fifield (fifieldt) on 2014-11-24
tags: added: ops
Tony Breeds (o-tony) on 2014-12-04
Changed in nova:
assignee: nobody → Tony Breeds (o-tony)
Jeremy Stanley (fungi) on 2014-12-06
Changed in ossa:
assignee: hzxiongwenwu (xwwzzy) → nobody
Tony Breeds (o-tony) on 2014-12-06
Changed in qemu-kvm (Ubuntu):
assignee: nobody → Tony Breeds (o-tony)
Thierry Carrez (ttx) wrote :

I think it's legitimate to consider that the flaw is in qemu or libvirt, for copying console data from guest to host without much possibility of controlling it. My recommendation would be to fix it there so that all cases of hostile VMs are covered, rather than just the Nova use case. If we agree that is the right way to go, I would close the OSSA as wontfix.

Jeremy Stanley (fungi) wrote :

Agreed, this is class C2 (a vulnerability in some dependency, not in OpenStack code, and so nothing we're going to fix with a patch to OpenStack security supported projects nor anything for which we should issue a security advisory). If there are no disagreements, I'll switch this to a regular public bug and mark the security advisory task "won't fix" on Thursday.

Jeremy Stanley (fungi) wrote :

It's now (UTC) Thursday.

Changed in ossa:
status: Incomplete → Won't Fix
tags: added: security
information type: Public Security → Public
Lei Li (matrixs-zero) wrote :

Re comment #44:

A ring buffer char device named ringbuf is now available upstream in QEMU, as Anthony suggested in comment #23. It is used like the following:

qemu -chardev ringbuf,size=640k,id=foo -serial chardev:foo

And the QMP commands have already been exposed by QEMU like:

{ "execute": "ringbuf-read", "arguments": { "device": "foo", "size": 1000, "format": "utf8" } }

Actually it was just the implementation of this request:

http://comments.gmane.org/gmane.comp.emulators.qemu/190843
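
As a small illustration, the ringbuf-read message above could be built from Python like this. The helper name ringbuf_read_command is made up for the example. Actually sending it means writing the JSON to the guest's QMP socket after the usual qmp_capabilities handshake, which is left out here.

```python
import json


def ringbuf_read_command(device, size, fmt="utf8"):
    """Build the QMP wire message for ringbuf-read as a JSON string."""
    return json.dumps({
        "execute": "ringbuf-read",
        "arguments": {"device": device, "size": size, "format": fmt},
    })
```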

Sean Dague (sdague) wrote :

Long time bug, it's confirmed, not triaged, as the path forward remains unclear.

Changed in nova:
status: Triaged → Confirmed
sean mooney (sean-k-mooney) wrote :

this has been around a really long time now

is the approach suggested here a suitable solution?
https://bugs.launchpad.net/charms/+source/nova-compute/+bug/1460197

really it is an installation tool change, but perhaps we can do something from the nova side also.
perhaps just document how to configure logrotate to rotate the logs in the install guide?

Daniel Berrange (berrange) wrote :

Patches are ready to solve this entirely in the libvirt layer once & for all. It'll be fixed with libvirt 1.3.3

https://www.redhat.com/archives/libvir-list/2016-February/msg01449.html

sean mooney (sean-k-mooney) wrote :

ah that is good to hear.
i assume this will be fixed then before the newton release.

what is the time frame of libvirt 1.3.3 and qemu 2.6?

Daniel Berrange (berrange) wrote :

Libvirt releases once a month, and QEMU is in feature freeze for its next release. So this will easily be ready before Newton

Blueprint "libvirt-virtlogd" [1] intends to make use of the libvirt feature Daniel mentioned in comment #51.

[1] https://blueprints.launchpad.net/nova/+spec/libvirt-virtlogd

Changed in nova:
assignee: Tony Breeds (o-tony) → nobody
zkou (finishman1) on 2016-04-25
information type: Public → Public Security
information type: Public Security → Private Security
information type: Private Security → Public

CONFIRMED FOR: MITAKA

Changed in nova:
assignee: nobody → Markus Zoeller (markus_z) (mzoeller)
status: Confirmed → In Progress
Nazeema Begum (nazeema) on 2016-12-05
Changed in nova:
assignee: Markus Zoeller (markus_z) (mzoeller) → nazeema (nazeema)
Nazeema Begum (nazeema) on 2016-12-05
Changed in nova:
assignee: nazeema (nazeema) → nobody

Fix proposed to branch: master
Review: https://review.openstack.org/407450

Changed in nova:
assignee: nobody → Markus Zoeller (markus_z) (mzoeller)

Change abandoned by Markus Zoeller (markus_z) (<email address hidden>) on branch: master
Review: https://review.openstack.org/323765
Reason: Got superseded by https://review.openstack.org/#/c/407450/

Reviewed: https://review.openstack.org/407450
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=1f659251c7509cab045024044a6b8d642ad85aef
Submitter: Jenkins
Branch: master

commit 1f659251c7509cab045024044a6b8d642ad85aef
Author: Markus Zoeller <email address hidden>
Date: Tue Dec 6 11:40:25 2016 +0100

    libvirt: virtlogd: use virtlogd for char devices

    This change makes actual usage of the "logd" sub-element for char devices.
    The two REST APIs ``os-getConsoleOutput`` and ``os-getSerialConsole`` can
    now be satisfied at the same time. This is valid for any combination of:
    * char device element: "console", "serial"
    * char device type: "tcp", "pty"
    There is also no need to create multiple different device types anymore.
    If we have a tcp device, we don't need the pty device anymore. The logging
    will be done in the tcp device.

    Implements blueprint libvirt-virtlogd
    Closes-Bug: 832507
    Change-Id: Ia412f55bd988f6e11cd78c4c5a50a86389e648b0
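
For readers unfamiliar with the libvirt side: in the domain XML the logging sub-element of a char device is spelled <log>, with a file attribute and an on/off append attribute. The sketch below builds such a <console> element with Python's ElementTree. It is only an illustration and not the code from this commit; the function name and the example target attributes are assumptions.

```python
import xml.etree.ElementTree as ET


def console_with_log(log_path, dev_type="pty"):
    """Build a <console> element whose output is also written to log_path."""
    console = ET.Element("console", type=dev_type)
    # virtlogd, not QEMU, owns this file and can enforce a size limit on it.
    ET.SubElement(console, "log", file=log_path, append="off")
    ET.SubElement(console, "target", type="serial", port="0")
    return ET.tostring(console, encoding="unicode")
```

The size limit and number of rotated files are then a matter of virtlogd's own configuration rather than anything nova has to manage.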

Changed in nova:
status: In Progress → Fix Released

This issue was fixed in the openstack/nova 15.0.0.0b2 development milestone.
