new default qemu TCG sizes exceed common CI setups

Bug #1887763 reported by Christian Ehrhardt 
Affects | Status | Importance | Assigned to | Milestone
qemu (Ubuntu) | Fix Released | Undecided | Christian Ehrhardt | (none)

Bug Description

As I reported on
  https://lists.gnu.org/archive/html/qemu-devel/2020-07/msg05001.html

TL;DR:
Common host sizes in CI setups are 1-2G, for example 1.5G for Ubuntu autopkgtests.
Some of them run guests of 0.5-1G in TCG mode (as they often can't rely on having KVM available).

Due to https://git.qemu.org/?p=qemu.git;a=commit;h=600e17b261555c56a048781b8dd5ba3985650013, in qemu 5.0 the default TB buffer size was bumped from 32M to 1G.

The 1G TB buffer plus a 0.5G guest, with no dynamic downsizing under memory pressure (there never was any), makes these systems OOM-kill the qemu process.
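
To make the arithmetic concrete, here is a minimal sketch of the memory budget using the numbers from this report (1.5G host, 0.5G guest, TB buffer of 32M versus 1G). The function name is made up for illustration, and the model deliberately ignores everything else running on the host (kernel, test harness, qemu's own overhead), so real headroom is smaller than shown.

```python
def headroom_mib(host_mib, guest_mib, tb_mib):
    """Memory left on the host once guest RAM and the TB buffer are fully used.

    Simplified model: assumes the guest touches all of its RAM and the
    TB buffer fills up, and ignores all other memory consumers.
    """
    return host_mib - (guest_mib + tb_mib)


# Old default (32M TB buffer) on a 1.5G autopkgtest host with a 0.5G guest
print(headroom_mib(1536, 512, 32))    # 992

# New default (1G TB buffer) on the same host
print(headroom_mib(1536, 512, 1024))  # 0
```

With zero headroom before counting the host OS itself, the OOM killer is the expected outcome once the buffer fills.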

Let us try to tune back the default size a bit again while the upstream discussion is ongoing.

Currently affected in the Ubuntu Groovy autopkgtest environment:
- casper
- open-iscsi
- systemd
- ubuntu-image

Changed in qemu (Ubuntu):
status: New → In Progress
assignee: nobody → Christian Ehrhardt  (paelzer)
tags: added: update-excuse
description: updated
Rafael David Tinoco (rafaeldtinoco) wrote :

I'm currently following this because the ongoing nettle transitions are blocked by the current v5 (removed from the archive so migrations could happen).

Latest upstream status:

https://lists.nongnu.org/archive/html/qemu-devel/2020-07/msg05547.html

There are ongoing discussions about whether monitoring available physical memory is a good metric for deciding how much memory TCG may use, given possible capping by containers/cgroups.
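
To illustrate the container concern, here is a rough sketch of how a cgroup limit can be lower than physical memory. It assumes the cgroup v2 memory.max file format (either the word "max" or a byte count); the helper names are hypothetical and this is not what the upstream patches do.

```python
def parse_cgroup_limit(text):
    """Parse the contents of a cgroup v2 memory.max file.

    Returns the limit in bytes, or None when there is no limit ("max")
    or the content cannot be parsed.
    """
    value = text.strip()
    if not value or value == "max":
        return None
    try:
        return int(value)
    except ValueError:
        return None


def effective_memory(phys_bytes, cgroup_limit):
    """Memory a process can really count on: the smaller of both limits."""
    if cgroup_limit is None:
        return phys_bytes
    return min(phys_bytes, cgroup_limit)


GIB = 1024 ** 3
# A 64G host whose container is capped at 1.5G
limit = parse_cgroup_limit("1610612736\n")
print(effective_memory(64 * GIB, limit))  # 1610612736
```

A heuristic that only looks at physical memory would see 64G here and keep the 1G default, even though the process is capped at 1.5G.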

I'm retrying the systemd tests (autopkgtests failed for the previous qemu version) to help finish the migrations, and will wait a bit longer to see if they reach a final conclusion.

I could tackle this with a temporary patch, using the patch from Alex or the similar approach proposed by Christian, sysconf(_SC_PHYS_PAGES), but that would only keep v5 from blocking other packages in migration, without being a final solution...
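
As a rough illustration of the sysconf(_SC_PHYS_PAGES) idea, the sketch below caps the default TB buffer at a fraction of physical memory. It is written in Python rather than qemu's C, and the divisor of 8, the 1M floor, and the function names are assumptions chosen for the example, not values taken from either proposed patch.

```python
import os

MIB = 1024 * 1024
DEFAULT_TB_BYTES = 1024 * MIB  # the 1G default discussed in this bug
MIN_TB_BYTES = 1 * MIB         # hypothetical floor, for illustration only


def host_physmem_bytes():
    """Physical memory via sysconf, or 0 if it cannot be determined."""
    try:
        return os.sysconf("SC_PHYS_PAGES") * os.sysconf("SC_PAGE_SIZE")
    except (ValueError, OSError, AttributeError):
        return 0


def pick_tb_size(phys_bytes, divisor=8):
    """Cap the TB buffer at phys_bytes / divisor (divisor is an assumption)."""
    if phys_bytes <= 0:
        # Unknown host size: fall back to the static default.
        return DEFAULT_TB_BYTES
    return max(MIN_TB_BYTES, min(DEFAULT_TB_BYTES, phys_bytes // divisor))


# A 1.5G CI host would get a 192M buffer instead of 1G
print(pick_tb_size(1536 * MIB) // MIB)  # 192
# On the machine running this sketch
print(pick_tb_size(host_physmem_bytes()) // MIB)
```

Large hosts keep the full 1G default, while small CI hosts get a buffer that leaves room for the guest.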

If needed I can do it, but let's wait a bit longer for upstream for now...

Christian Ehrhardt  (paelzer) wrote :

The discussion on the TCG fixes has completed; the pull request hit the ML yesterday.
My earlier feedback is included. I don't want to wait any longer and will re-prep a 5.0 upload.

This will also include fixes for:
CVE-2020-12829 CVE-2020-13253 CVE-2020-13361 CVE-2020-13362 CVE-2020-13659 CVE-2020-13754 CVE-2020-10761 CVE-2020-13791 CVE-2020-13800 CVE-2020-15863 CVE-2020-144

Launchpad Janitor (janitor) wrote :

This bug was fixed in the package qemu - 1:5.0-5ubuntu3

---------------
qemu (1:5.0-5ubuntu3) groovy; urgency=medium

  * d/p/ubuntu/lp-1887763-*: fix TCG sizing that OOMed many small CI
    environments (LP: #1887763)
  * Pick further changes for groovy from debian/master since 5.0-5
    - ati-vga-check-mm_index-before-recursive-call-CVE-2020-13800.patch
      Closes: CVE-2020-13800, ati-vga allows guest OS users to trigger
      infinite recursion via a crafted mm_index value during
      ati_mm_read or ati_mm_write call.
    - revert-memory-accept-mismatching-sizes-in-memory_region_access_valid...patch
      Closes: CVE-2020-13754, possible OOB memory accesses in a bunch of qemu
      devices which uses min_access_size and max_access_size Memory API fields.
      Also closes: CVE-2020-13791
    - exec-set-map-length-to-zero-when-returning-NULL-CVE-2020-13659.patch
      CVE-2020-13659: address_space_map in exec.c can trigger
      a NULL pointer dereference related to BounceBuffer
    - megasas-use-unsigned-type-for-reply_queue_head-and-check-index...patch
      Closes: #961887, CVE-2020-13362, megasas_lookup_frame in hw/scsi/megasas.c
      has an OOB read via a crafted reply_queue_head field from a guest OS user
    - megasas-use-unsigned-type-for-positive-numeric-fields.patch
      fix other possible cases like in CVE-2020-13362 (#961887)
    - megasas-fix-possible-out-of-bounds-array-access.patch
      Some tracepoints use a guest-controlled value as an index into the
      mfi_frame_desc[] array. Thus a malicious guest could cause a very low
      impact OOB errors here
    - nbd-server-avoid-long-error-message-assertions-CVE-2020-10761.patch
      Closes: CVE-2020-10761, An assertion failure issue in the QEMU NBD Server.
      This flaw occurs when an nbd-client sends a spec-compliant request that is
      near the boundary of maximum permitted request length. A remote nbd-client
      could use this flaw to crash the qemu-nbd server resulting in a DoS.
    - es1370-check-total-frame-count-against-current-frame-CVE-2020-13361.patch
      Closes: CVE-2020-13361, es1370_transfer_audio in hw/audio/es1370.c does not
      properly validate the frame count, which allows guest OS users to trigger
      an out-of-bounds access during an es1370_write() operation
    - a few patches from the stable series:
      - fix-tulip-breakage.patch
        The tulip network driver in a qemu-system-hppa emulation is broken in
        the sense that bigger network packages aren't received any longer and
        thus even running e.g. "apt update" inside the VM fails. Fix this.
      - 9p-lock-directory-streams-with-a-CoMutex.patch
        Prevent deadlocks in 9pfs readdir code
      - net-do-not-include-a-newline-in-the-id-of-nic-device.patch
        Fix newline accidentally sneaked into id string of a nic
      - qemu-nbd-close-inherited-stderr.patch
      - virtio-balloon-fix-free-page-hinting-check-on-unreal.patch
      - virtio-balloon-fix-free-page-hinting-without-an-iothread.patch
      - virtio-balloon-unref-the-iothread-when-unrealizing.patch
    - acpi-tmr-allow-2-byte-reads.patch (Closes: #964247)
    - reapply CVE-2020-13253 fixed from upstre...

Changed in qemu (Ubuntu):
status: In Progress → Fix Released