[Feature request] qemu-img multi-threaded compressed image conversion

Bug #601946 reported by Коренберг Марк
This bug affects 16 people
Affects: QEMU
Status: Expired
Importance: Wishlist
Assigned to: Unassigned
Milestone: -

Bug Description

Feature request:
qemu-img multi-threaded compressed image conversion

Suppose I want to convert a raw image to compressed qcow2. Multi-threaded conversion would be much faster, because the bottleneck is compressing the data.

Jes Sorensen (jes-sorensen) wrote :

Hi,

Compression is not the only bottleneck here: with modern CPUs, disk speed is also a limiting factor, and compression is often stream-based. For now there isn't enough data to show that this qualifies as a bug/RFE.

If you decide to try and implement it, and provide data showing that this is actually a win, please reopen this.

Regards,
Jes

Коренберг Марк (socketpair) wrote :

1. During the benchmark I used iotop and plain top. qemu-img was using all of my CPU (3.07 GHz) while disk throughput stayed low.
2. Writes to disk on ext4 are cached very heavily, so writing in 4 streams is not a problem.
3. For example, 7z gets a huge speed increase when compressing in multiple threads.
4. Yes, I understand that compression is stream-based. So we can split the input stream into chunks and compress each chunk individually.

You can run "time qemu-img convert ...." and look at the user/system/real timings. In my case, user time is nearly equal to real time, so CPU work is the bottleneck.
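
The chunk-splitting idea from point 4 can be sketched roughly as follows. This is an illustration only, not QEMU code: the names CLUSTER_SIZE, split_chunks and compress_parallel are hypothetical, and the 64 KiB chunk size is an assumption chosen to match the default qcow2 cluster size. It relies on CPython's zlib releasing the GIL while compressing, so worker threads can run on several cores.

```python
import zlib
from concurrent.futures import ThreadPoolExecutor

# Assumed chunk size (hypothetical constant); 64 KiB mirrors the default qcow2 cluster size.
CLUSTER_SIZE = 64 * 1024


def split_chunks(data: bytes, size: int = CLUSTER_SIZE) -> list:
    """Cut the input into fixed-size chunks; the last one may be shorter."""
    return [data[i:i + size] for i in range(0, len(data), size)]


def compress_parallel(data: bytes, workers: int = 4) -> list:
    """Compress every chunk independently on a thread pool."""
    chunks = split_chunks(data)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() yields results in input order, so chunk i stays at index i.
        return list(pool.map(zlib.compress, chunks))


def decompress_all(blobs: list) -> bytes:
    """Reassemble the original data from independently compressed chunks."""
    return b"".join(zlib.decompress(b) for b in blobs)
```

Because every chunk is a complete zlib stream of its own, no compressor state is shared between threads, at the cost of a somewhat worse ratio than one long stream.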

Jan-Simon Möller (dl9pf) wrote :

There are also projects like http://compression.ca/pbzip2/ . We'll be seeing more and more cores per CPU, so we should use these techniques.

Mike Ashton (akfypznt-rjp2-nw2wga2s) wrote :

The compression in this case is certainly chunked already, otherwise you couldn't implement a pseudo block device without reading the entire stream just to read the last block! As the data in the new disk is necessarily chunk-compressed, parallelisation is perfectly feasible. It's just a question of the algorithm you use to arbitrate the work between the threads, which may need some thought, as you'd likely be navigating a tree structure.

There's no question that Jes' suggestion would create a 12x speed-up for me, and there's pretty standard off-the-shelf server hardware with 48 cores. As Jan-Simon Möller points out, being single-threaded and single-process isn't much of an option any more. If one is trying to compress, say, a 4TB virtual disk image, then using a little over 2% of the available CPU time, and having to wait a week as a result, is going to be... frustrating :)
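
The random-access argument above can be illustrated with a small sketch: when each chunk is compressed on its own, reading a byte range only requires decompressing the chunks that overlap it. The names CHUNK, compress_chunks and read_at are hypothetical, the 4096-byte chunk size is an arbitrary assumption, and a flat list stands in for whatever lookup structure the real image format uses.

```python
import zlib

# Assumed chunk size for the illustration (hypothetical constant).
CHUNK = 4096


def compress_chunks(data: bytes, chunk: int = CHUNK) -> list:
    """Compress each fixed-size chunk as an independent zlib stream."""
    return [zlib.compress(data[i:i + chunk]) for i in range(0, len(data), chunk)]


def read_at(blobs: list, offset: int, length: int, chunk: int = CHUNK) -> bytes:
    """Return `length` bytes starting at `offset`, touching only the needed chunks."""
    if length <= 0:
        return b""
    first = offset // chunk
    last = (offset + length - 1) // chunk
    out = bytearray()
    for idx in range(first, last + 1):
        out += zlib.decompress(blobs[idx])
    start = offset - first * chunk  # position of `offset` inside the first chunk read
    return bytes(out[start:start + length])
```

The same independence that makes this lookup possible is what lets chunks be compressed concurrently.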

oernii (oernii) wrote :

I'd like to note that I use qemu-img to back up snapshots of images. This works fine, but it is very slow. Of my 24 cores, only 1 is used to compress the image.

It could be so much faster.

Bernhard M. Wiedemann (ubuntubmw) wrote :

qcow2_write_compressed in block/qcow2.c would need to be changed.
It seems to need bigger changes, as it currently always does compress+write for one block at a time.
I'm not sure how well it would handle multiple writes in parallel, so the safest approach would be to avoid that and wait for the previous writer to finish before starting the next write.
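
The "compress in parallel, write one at a time" approach described above might look roughly like this. It is a sketch under stated assumptions, not a description of qcow2_write_compressed: the function name convert and the (offset, length) index are hypothetical, and any seekable binary file object stands in for the image file.

```python
import io
import zlib
from concurrent.futures import ThreadPoolExecutor


def convert(chunks, out, workers: int = 4):
    """Compress chunks concurrently, but let a single writer append them in order.

    Returns a list of (offset, length) pairs locating each compressed chunk in `out`.
    """
    index = []
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Compression jobs run in parallel on the pool.
        futures = [pool.submit(zlib.compress, c) for c in chunks]
        # Only this thread writes, so each write finishes before the next starts.
        for fut in futures:
            blob = fut.result()
            index.append((out.tell(), len(blob)))
            out.write(blob)
    return index
```

For example, with out = io.BytesIO(), each chunk can later be recovered by slicing out.getvalue() with its index entry and calling zlib.decompress. Submitting every chunk up front keeps all compressed results in memory until written; a real implementation would presumably bound the number of in-flight chunks.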

Thomas Huth (th-huth)
Changed in qemu:
importance: Undecided → Wishlist
Quentin Casasnovas (quentin.casasnovas) wrote :

It looks like qcow2_write_compressed() has been removed and turned into a QEMU coroutine in QEMU 2.8.0 (released in December 2016) to support live compressed backups. Any pointers on where to start working on this? We have servers with 128 CPUs and it's very sad to see them compress on a single CPU and take tens of minutes instead of a few seconds. :)

Paolo Bonzini (bonzini) wrote :

The fact that it's now a coroutine_fn doesn't change much; if anything, it makes it simpler to handle multiple writes in parallel.

Quentin Casasnovas (quentin.casasnovas) wrote :

That was also my feeling, so nice to get a confirmation!

Another related thing would be to allow qemu-nbd to write compressed blocks to its backing image. Today, if you use a qcow2 with compression, any block which is written to ends up uncompressed in the resulting image, and you need to recompress the image offline with qemu-img.

Would you have any pointers/documentation on how best to implement this so that both qemu-img and qemu-nbd can use multithreaded compressed writes? I'm totally new to the QEMU block subsystem.

Коренберг Марк (socketpair) wrote :

@~quentin.casasnovas please report this as a new feature request instead of adding a comment to this one.

Jinank Jain (jinankjain) wrote :

@~quentin.casasnovas Are you still working on this? If not, I would like to give this a shot.

Quentin Casasnovas (quentin.casasnovas) wrote :

@Jinank I have not started working on this at all, so please go ahead! Let me know if I can help with testing or anything else; we make quite extensive use of nbd and qcow2 images internally.

Thomas Huth (th-huth) wrote :

The QEMU project is currently considering moving its bug tracking to
another system. For this we need to know which bugs are still valid
and which could be closed already. Thus we are setting older bugs to
"Incomplete" now.

If you still think this bug report here is valid, then please switch
the state back to "New" within the next 60 days, otherwise this report
will be marked as "Expired". Or please mark it as "Fix Released" if
the problem has been solved with a newer version of QEMU already.

Thank you and sorry for the inconvenience.

Changed in qemu:
status: New → Incomplete
Changed in qemu:
status: Incomplete → New
Thomas Huth (th-huth) wrote : Moved bug report

This is an automated cleanup. This bug report has been moved to QEMU's
new bug tracker on gitlab.com and thus gets marked as 'expired' now.
Please continue with the discussion here:

 https://gitlab.com/qemu-project/qemu/-/issues/80

Changed in qemu:
status: New → Expired