uvt-simplestreams-libvirt in ubuntu_kvm_smoke_test will eat up all disk free space on Jammy AWS baremetal

Bug #1970896 reported by Po-Hsu Lin
This bug affects 1 person
Affects               Status      Importance  Assigned to  Milestone
ubuntu-kernel-tests   New         Undecided   Unassigned
uvtool (Ubuntu)       Incomplete  Wishlist    Unassigned

Bug Description

Issue found with Jammy 5.15.0-28.29 with AWS bare-metal instances:
  * a1.metal
  * i3.metal
  * r5.metal

Test failed while trying to fetch the image:
  uvt-simplestreams-libvirt sync --source http://cloud-images.ubuntu.com/daily release=jammy arch=arm64

Error message:
  libvirt.libvirtError: Unable to write /var/lib/uvtool/libvirt/images/x-uvt-b64-Y29tLnVidW50dS5jbG91ZC5kYWlseTpzZXJ2ZXI6MjIuMDQ6YXJtNjQgMjAyMjA0MjM=: No space left on device

Monitoring `df -h` while the command runs shows the free disk space on / dropping from:
  /dev/nvme0n1p1 7.6G 4.6G 3.0G 61% /
To:
  /dev/nvme0n1p1 7.6G 7.0G 626M 92% /
As this task was not completed, `uvt-simplestreams-libvirt query` returns nothing.
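
The `df -h` observation above can also be captured from a test script. The following is only a minimal sketch using Python's standard `shutil.disk_usage`; the helper names `free_space_summary` and `has_room` are hypothetical and are not part of uvtool or ubuntu_kvm_smoke_test.

```python
import shutil


def free_space_summary(path="/"):
    """Return total/used/free space for the filesystem holding `path`, in GiB."""
    usage = shutil.disk_usage(path)
    return {
        "total_gib": usage.total / 2**30,
        "used_gib": usage.used / 2**30,
        "free_gib": usage.free / 2**30,
    }


def has_room(path, required_bytes):
    """True if the filesystem holding `path` has at least `required_bytes` free."""
    return shutil.disk_usage(path).free >= required_bytes


if __name__ == "__main__":
    # Hypothetical pre-flight check before starting the image sync.
    print(free_space_summary("/"))
    print("room for 4 GiB:", has_room("/", 4 * 2**30))
```

A check like this could be run before the sync so that the test reports a shortage of disk space up front rather than failing part-way through the import.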

This does not seem to be an issue on the bare-metal nodes in our lab (maybe because they have bigger disks). Looking back through the history, I can see this failure with:
  * 5.15.0-27.28 - a1.metal only
  * 5.15.0-25.25 - a1.metal, r5.metal (i5.metal failed with provision issue)
  * 5.15.0-23.23 - a1.metal, i3.metal, r5.metal

The amd64 and arm64 images on http://cloud-images.ubuntu.com/daily/server/jammy/current/ are only around 600M each, so I don't know why the sync takes up so much disk space.

Po-Hsu Lin (cypressyew)
tags: added: sru-20220418 ubuntu-kvm-smoke
tags: added: 5.15 aws jammy ubuntu-kvm-smoke-test
removed: ubuntu-kvm-smoke

Po-Hsu Lin (cypressyew) wrote:

When running this `uvt-simplestreams-libvirt sync` command, I noticed that two files are created in /tmp:
-rw------- 1 root root 575M Apr 29 09:23 tmpf17tglw6
-rw------- 1 root root 1.4G Apr 29 09:23 tmpi4lq8p24
After that, it starts writing the image file to /var/lib/uvtool/libvirt/images/. The file reaches about 1.1G, and then the disk runs out of space:
<email address hidden>:/var/lib/uvtool/libvirt/images# ls -alh
total 1.1G
-rw------- 1 root root 1.1G Apr 29 09:14 'x-uvt-b64-Y29tLnVidW50dS5jbG91ZC5kYWlseTpzZXJ2ZXI6MjIuMDQ6YXJtNjQgMjAyMjA0MjM='

Robie Basak (racb) wrote:

Hi,

In uvtool this is by design. It has to download the entire image before it can cryptographically verify it, and only then does it start injecting the image into libvirt. It does that using libvirt's socket API. Using that mechanism, I don't think it's possible to do a filesystem-level move to economize on disk usage.
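
To illustrate the "download everything, verify, and only then hand over" ordering described above, here is a generic sketch. It is not uvtool's actual code: the function name `spool_and_verify` is hypothetical, and the use of a SHA-256 checksum as the verification step is an assumption made for the example.

```python
import hashlib
import tempfile


def spool_and_verify(chunks, expected_sha256):
    """Write all chunks to a temporary file, then verify before releasing it.

    Returns the temporary file (rewound to the start) only if the checksum
    matches; otherwise the file is discarded and ValueError is raised.
    """
    digest = hashlib.sha256()
    tmp = tempfile.TemporaryFile()
    for chunk in chunks:
        tmp.write(chunk)       # the full download occupies temp space
        digest.update(chunk)
    if digest.hexdigest() != expected_sha256:
        tmp.close()            # closing a TemporaryFile removes it
        raise ValueError("checksum mismatch; image discarded")
    tmp.seek(0)
    return tmp


if __name__ == "__main__":
    data = [b"fake ", b"image ", b"bytes"]
    expected = hashlib.sha256(b"".join(data)).hexdigest()
    with spool_and_verify(data, expected) as verified:
        # Only at this point would the content be passed on to the consumer.
        print(len(verified.read()), "bytes verified")
```

The point of the sketch is the cost: nothing is passed on until the whole download is on disk, so the temporary copy and the final copy coexist for a while.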

I suppose it might be possible to do one of a couple of things:

1) Have some kind of "--insecure" flag and code that opportunistically skips the /tmp step in this case, streaming the download directly into libvirt. However, this is more code to maintain for a very special case, and the error-path cleanup on the libvirt side after a failed download would also have to be checked and handled as needed.

2) Arrange to do a filesystem-level move instead of using libvirt's API. However this would break users who have more complicated setups, so there'd again have to be more code to maintain to special case the default case.

Since this behaviour is by design in uvtool, it isn't a bug per se, so I'll mark it accordingly. If you want to write and merge one of the above options, please could you explain why this is important for uvtool? Requiring twice the image size to be available in disk space seems reasonable to me.

Changed in uvtool (Ubuntu):
status: New → Incomplete
importance: Undecided → Wishlist

Po-Hsu Lin (cypressyew) wrote:

Hi Robie,
thanks for the reply,

It makes sense to require twice the image size because of the verification requirement; however, in this case it looks like it's using more than twice that.

The image is just around 600M, so it should be just 600M in /tmp and 600M more in libvirt after it's been verified. 3.0G free space should be enough, but it's not.

Maybe I am missing some pieces here?
Thanks

Robie Basak (racb) wrote:

Hi Po-Hsu Lin,

Sorry, there's also a decompression step. The relevant code is:

https://git.launchpad.net/uvtool/tree/uvtool/libvirt/__init__.py#n45

It's possible this could be optimized if libvirt's API has had any enhancements since I wrote that code. It looks like I did it this way because I had to tell libvirt the uncompressed size before creating the image in libvirt, and decompressing into a temporary file was the easiest way of determining this.
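
With the decompression step included, the numbers reported earlier in this bug add up. The following back-of-the-envelope sketch assumes that the compressed download, the decompressed temporary file, and the libvirt volume all live on the same filesystem at the same time, and that the finished volume is about as large as the decompressed file. The sizes are the ones quoted in the earlier comments; the function name is hypothetical.

```python
MIB_PER_GIB = 1024


def peak_disk_usage_mib(compressed_mib, uncompressed_mib):
    """Estimate peak usage: download + decompressed temp copy + libvirt volume.

    Assumes all three coexist on one filesystem and that the volume ends up
    roughly the same size as the decompressed temporary file.
    """
    return compressed_mib + 2 * uncompressed_mib


compressed = 575                  # first /tmp file reported in this bug (575M)
uncompressed = 1.4 * MIB_PER_GIB  # second /tmp file reported in this bug (1.4G)
free = 3.0 * MIB_PER_GIB          # free space on / before the sync (3.0G)

needed = peak_disk_usage_mib(compressed, uncompressed)
left_for_volume = free - compressed - uncompressed

print(f"estimated peak usage: {needed:.0f} MiB, free: {free:.0f} MiB")
print(f"space left for the volume: {left_for_volume:.0f} MiB")
```

Under these assumptions the estimated peak exceeds the 3.0G that was free, and the space left for the volume after both /tmp files exist is roughly 1G, which is in line with the import stopping at about 1.1G.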

Po-Hsu Lin (cypressyew) wrote:

OK!
Thanks for the info, we will discuss how to deal with this first (maybe just bump the disk size)
