virtiofs guest connection refused after upgrade qemu-system-x86:amd64 (1:6.2+dfsg-2ubuntu6.12, 1:6.2+dfsg-2ubuntu6.13)

Bug #2033957 reported by Steffen McPrivacy
This bug affects 10 people
Affects          Status         Importance   Assigned to             Milestone
qemu (Ubuntu)    Invalid        Undecided    Unassigned
Jammy            Fix Released   High         Sergio Durigan Junior

Bug Description

[ Impact ]

QEMU users who rely on virtiofs for mounting external paths will face a "connection refused" error while trying to access such mountpoints. There is no workaround for this problem other than using a previous version of QEMU.

[ Test Plan ]

We'll need to create a VM on libvirt and edit its XML domain definition in order to make it use virtiofsd to access a path on the host.

Inside a Jammy system:

$ sudo apt install -y libvirt-daemon-system uvtool-libvirt
$ uvt-simplestreams-libvirt sync release=jammy arch=amd64
$ uvt-kvm create j release=jammy --memory 1024
$ virsh destroy j
$ virsh edit j

Inside the editor, add the following snippets:

<domain type='kvm'>
  ...
  <memoryBacking>
    <source type='memfd'/>
    <access mode='shared'/>
  </memoryBacking>
  ...
  <devices>
    ...
    <filesystem type='mount' accessmode='passthrough'>
      <driver type='virtiofs' queue='1024'/>
      <binary path='/usr/lib/qemu/virtiofsd' xattr='on'>
        <cache mode='always'/>
        <lock posix='on' flock='on'/>
      </binary>
      <source dir='/tmp/test'/>
      <target dir='mytag'/>
    </filesystem>
    ...
  </devices>
</domain>

Save and exit.

$ mkdir -p /tmp/test
$ touch /tmp/test/a
$ virsh start j
$ uvt-kvm wait j
$ uvt-kvm ssh j

Now, while inside the VM (as the ubuntu user):

$ sudo mount -t virtiofs mytag /mnt
$ ls -la /mnt

You should see the file 'a'.
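The pass/fail criterion of the test plan can also be scripted. The following is a minimal sketch, not part of the original SRU text; the helper name check_share is hypothetical, and it only assumes that the share is already mounted inside the guest.

```shell
#!/bin/sh
# check_share DIR FILE: print PASS if DIR can be listed and FILE is visible
# in it, otherwise print FAIL and return non-zero.
# (Hypothetical helper, written for illustration only.)
check_share() {
    dir=$1
    file=$2
    if ls -la "$dir" >/dev/null 2>&1 && [ -e "$dir/$file" ]; then
        echo PASS
    else
        echo FAIL
        return 1
    fi
}

# Inside the guest, after `sudo mount -t virtiofs mytag /mnt`:
#   check_share /mnt a
```

If the share is not usable, either the listing or the file check fails and the helper reports FAIL.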

[ Where problems could occur ]

This bug is a regression caused by one of the patches backported to address bug #1853307. The problem happens because the Linux kernel headers have been updated, which changed the size of "struct fuse_init_in" in a way that wasn't accounted for. Upstream's initial fix was to limit the function that reads this struct so that only the first 16 bytes are considered. This fix, albeit correct in theory, wasn't part of any release, because https://gitlab.com/qemu-project/qemu/-/commit/242f2cae782d433d69d195e14564b6437ec9f7e6 was merged right after it; that merge implemented new features for virtiofsd, including support for the extra bytes coming from "struct fuse_init_in". I chose not to backport any of the commits that are part of that merge precisely because they fall under the category of "new features", which is not acceptable for our SRUs.

A regression, if it were to occur, would likely manifest in the form of problems with the code responsible for parsing the first 16 bytes of the struct. I'm not entirely sure whether the struct can be affected by compiler optimizations which add padding to better align the struct fields; it doesn't seem that it is.
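The idea of "only look at the first 16 bytes" can be illustrated outside of virtiofsd. The real fix lives in virtiofsd's C code; the shell sketch below is only an analogy. It assumes, for illustration, that the message starts with four little-endian 32-bit fields (major, minor, max_readahead, flags) and that newer headers append 48 further bytes; the field values and the helper names are made up. With four naturally aligned 32-bit fields, as assumed here, a compiler would have no reason to insert padding within those 16 bytes.

```shell
#!/bin/sh
# Emit a 64-byte stand-in for the enlarged init message.
# ASSUMPTION: four little-endian u32 fields followed by 48 newer bytes;
# all values are invented for this example.
make_init_msg() {
    printf '\007\000\000\000'   # major = 7
    printf '\044\000\000\000'   # minor = 36 (octal 044)
    printf '\000\000\002\000'   # max_readahead = 131072
    printf '\000\000\000\000'   # flags = 0
    head -c 48 /dev/zero        # extra bytes an old parser must ignore
}

# Parse only the first 16 bytes, whatever the total message size is.
# -N16 stops od after 16 bytes; -tu4 prints unsigned 32-bit words in host
# byte order (little-endian on x86, matching the data above).
parse_init_compat() {
    set -- $(od -An -tu4 -N16)
    echo "major=$1 minor=$2 max_readahead=$3 flags=$4"
}

make_init_msg | parse_init_compat
```

On a little-endian host this prints major=7 minor=36 max_readahead=131072 flags=0, no matter how many bytes follow the first 16.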

Note that a regression here would only affect virtiofsd users, who are already completely unable to use the feature.

[ Original Description ]

After installing the upgrades
qemu-system-x86:amd64 (1:6.2+dfsg-2ubuntu6.12, 1:6.2+dfsg-2ubuntu6.13),
qemu-user-static:amd64 (1:6.2+dfsg-2ubuntu6.12, 1:6.2+dfsg-2ubuntu6.13),
qemu-utils:amd64 (1:6.2+dfsg-2ubuntu6.12, 1:6.2+dfsg-2ubuntu6.13),
qemu-system-common:amd64 (1:6.2+dfsg-2ubuntu6.12, 1:6.2+dfsg-2ubuntu6.13),
qemu-block-extra:amd64 (1:6.2+dfsg-2ubuntu6.12, 1:6.2+dfsg-2ubuntu6.13),
qemu-system-data:amd64 (1:6.2+dfsg-2ubuntu6.12, 1:6.2+dfsg-2ubuntu6.13),
qemu-system-gui:amd64 (1:6.2+dfsg-2ubuntu6.12, 1:6.2+dfsg-2ubuntu6.13)

and a reboot of the server, all virtual machines are getting a "connection refused" error and cannot access the host folders via virtiofs anymore.

Exact Error message on the client:
cannot access '/var/www-lib': Connection refused

Nothing else in the Logs.
Client OS:
Distributor ID: Ubuntu
Description: Ubuntu 22.04.3 LTS
Release: 22.04
Codename: jammy
Linux 5.15.0-82-generic #91-Ubuntu SMP
FileSystem: EXT4

Server OS:
Distributor ID: Ubuntu
Description: Ubuntu 22.04.3 LTS
Release: 22.04
Codename: jammy
Ubuntu 22.04, 5.15.0-82-generic #91-Ubuntu SMP Mon Aug 14 14:14:14 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux
FileSystem: EXT4

I scanned all logs; in kern.log I found the following message:

virtiofs virtio0: virtio_fs_setup_dax: No cache capability

The XML code from the guest machine looks like this:

...
  <memoryBacking>
    <hugepages>
      <page size='2' unit='M'/>
    </hugepages>
    <access mode='shared'/>
  </memoryBacking>
....
    <filesystem type='mount' accessmode='passthrough'>
      <driver type='virtiofs' queue='1024'/>
      <binary path='/usr/lib/qemu/virtiofsd' xattr='on'>
        <cache mode='always'/>
        <lock posix='on' flock='on'/>
      </binary>
      <source dir='/Da/n/W/w'/>
      <target dir='W'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x08' function='0x0'/>
    </filesystem>
...

I dug a little deeper to find the issue. The change to the file /usr/lib/qemu/virtiofsd in update 1:6.2+dfsg-2ubuntu6.13 is causing the problem.
I upgraded to the new Rust-based virtiofsd and modified my virtual machine to be loaded without flock and posix on.
Voila, the mapping is working.
Then I changed it back to the original version: access denied.
With flock and posix still not configured, changing back to version 1:6.2+dfsg-2ubuntu6.12 also works.

Therefore I assume we have a bug in the new virtiofsd version.

Revision history for this message
Launchpad Janitor (janitor) wrote :

Status changed to 'Confirmed' because the bug affects multiple users.

Changed in qemu (Ubuntu):
status: New → Confirmed
Revision history for this message
JohnJay (johnjay) wrote :

Windows 11 guests fail too. The virtiofssvc service fails to start, giving the error "The parameter is incorrect".

Revision history for this message
Solb (solb2) wrote :

Also experiencing this bug, is it possible to downgrade to fix it?

Revision history for this message
JohnJay (johnjay) wrote :

@oursin Yes. I found that 1:6.2+dfsg-2ubuntu6.12 wasn't in the repos I was using, but 1:6.2+dfsg-2ubuntu6.11 was. I downgraded all of the listed packages, but I don't know if that was necessary or not.

Revision history for this message
Steffen McPrivacy (steffenmp) wrote :

It is not necessary to downgrade all packages.
What I did in the end was to replace the virtiofsd binary manually.
You can do this in two ways. The first, and generally better, option is to use an old version from the Ubuntu packages.
The second is to download the new Rust-based version from GitHub. This option requires updating your guest configurations, as the "flock" and "posix" parameters are no longer required or supported.

Revision history for this message
isparnid (isparnid) wrote :

Hello,
I downgraded to version 1:6.2+dfsg-2ubuntu6.11 and blocked updates to get back to normal operation. Thanks a lot.

Revision history for this message
Trebacz (david-trebacz) wrote :

I tried to downgrade using "sudo apt install qemu=1:6.2+dfsg-2ubuntu6.11". It succeeded, but the error persists even after a restart:

Get:1 http://us.archive.ubuntu.com/ubuntu jammy-security/universe amd64 qemu amd64 1:6.2+dfsg-2ubuntu6.11 [14.3 kB]
Fetched 14.3 kB in 2s (9,107 B/s)
dpkg: warning: downgrading qemu from 1:6.2+dfsg-2ubuntu6.13 to 1:6.2+dfsg-2ubuntu6.11
(Reading database ... 151954 files and directories currently installed.)
Preparing to unpack .../qemu_1%3a6.2+dfsg-2ubuntu6.11_amd64.deb ...
Unpacking qemu (1:6.2+dfsg-2ubuntu6.11) over (1:6.2+dfsg-2ubuntu6.13) ...
Setting up qemu (1:6.2+dfsg-2ubuntu6.11) ...
Scanning processes...
Scanning candidates...
Scanning processor microcode...
Scanning linux images...
Running kernel seems to be up-to-date.
The processor microcode seems to be up-to-date.

Restarting services...
 systemctl restart wsdd.service

No containers need to be restarted.
No user sessions are running outdated binaries.
No VM guests are running outdated hypervisor (qemu) binaries on this host.

Even after a restart of the host I still get the error:
sudo ls -hal /mnt/
ls: cannot access '/mnt/SSD': Connection refused
total 16K
drwxr-xr-x 5 root root 4.0K Sep 3 16:47 .
drwxr-xr-x 19 root root 4.0K Sep 4 11:00 ..
d????????? ? ? ? ? ? SSD

Revision history for this message
JohnJay (johnjay) wrote :

@david-trebacz I downgraded the individual packages that had been upgraded rather than the metapackage

Revision history for this message
Solb (solb2) wrote :

I'm a bit confused about which package to uninstall. I tried "sudo apt install qemu=1:6.2+dfsg-2ubuntu6.11" with a restart first, then downgrading from the list in the post, but I wasn't sure which packages to downgrade, and the dependencies made this confusing. I ended up upgrading them all to get the VMs started again.

Revision history for this message
Steffen McPrivacy (steffenmp) wrote :

@Trebacz, after the downgrade, please ensure that the host was restarted or (the quite dirty way) kill all virtiofsd instances and reboot the VMs. To me it sounds like virtiofsd was not restarted or not downgraded correctly.

Revision history for this message
Steffen McPrivacy (steffenmp) wrote :

@Solb, you need to downgrade the whole list of packages:
qemu-system-x86:amd64
qemu-user-static:amd64
qemu-utils:amd64
qemu-system-common:amd64
qemu-block-extra:amd64
qemu-system-data:amd64
qemu-system-gui:amd64

so that the package dependencies stay consistent.
Then reboot the host and everything should be fine for now.

Revision history for this message
Christian Ehrhardt  (paelzer) wrote :

This is very interesting and thank you all for your reports and updates.
I wonder, as https://launchpad.net/ubuntu/+source/qemu/1:6.2+dfsg-2ubuntu6.13 is only about s390x changes and should (tm) have no change whatsoever with regard to x86 virtiofsd.

Assigning Sergio who was driving the former upload to have a look himself.

Changed in qemu (Ubuntu):
assignee: nobody → Sergio Durigan Junior (sergiodj)
Revision history for this message
Solb (solb2) wrote :

Thanks @steffenmp. I had to do a restart after the qemu-system-common downgrade and also downgrade the dependency qemu-system-x86-xen. Working fine now after another restart.

Revision history for this message
Steffen McPrivacy (steffenmp) wrote :

@Christian, thank you for assigning Sergio. I also read the change notes in Launchpad but, as you can see, the effect is reproducible with the new version. I am happy to support testing with an updated version.

@Solb, it was a pleasure

Revision history for this message
Sergio Durigan Junior (sergiodj) wrote :

Thanks for the bug report and initial investigation.

I could reproduce the problem here. Let me see if I can find what's wrong.

Changed in qemu (Ubuntu):
importance: Undecided → High
status: Confirmed → Triaged
Changed in qemu (Ubuntu Jammy):
status: New → Triaged
importance: Undecided → High
assignee: nobody → Sergio Durigan Junior (sergiodj)
Changed in qemu (Ubuntu):
importance: High → Undecided
assignee: Sergio Durigan Junior (sergiodj) → nobody
status: Triaged → Invalid
tags: added: server-todo
Revision history for this message
Trebacz (david-trebacz) wrote :

I can confirm that after the successful downgrade (1:6.2+dfsg-2ubuntu6.13 -> 1:6.2+dfsg-2ubuntu6.11) and a restart of the host, the virtiofsd filesystems appear in the VMs as expected.

apt install qemu-system-x86:amd64=1:6.2+dfsg-2ubuntu6.11
apt install qemu-user-static:amd64=1:6.2+dfsg-2ubuntu6.11
apt install qemu-utils:amd64=1:6.2+dfsg-2ubuntu6.11
apt install qemu-system-common:amd64=1:6.2+dfsg-2ubuntu6.11
apt install qemu-block-extra:amd64=1:6.2+dfsg-2ubuntu6.11
apt install qemu-system-data:amd64=1:6.2+dfsg-2ubuntu6.11
apt install qemu-system-gui:amd64=1:6.2+dfsg-2ubuntu6.11

Probably a more efficient way to downgrade, but it seems to have "worked".
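One possibly more efficient way is to pin all packages in a single apt invocation, so that the dependency resolver sees them together. The following is a minimal sketch, not a tested procedure: the helper name downgrade_cmd is hypothetical, the package list is the one quoted in this report, and the helper only prints the command so it can be reviewed before it is run.

```shell
#!/bin/sh
# Packages named in this report; adjust to what is installed on your host
# (qemu-system-x86-xen was also needed in one case mentioned in this thread).
PKGS="qemu-system-x86 qemu-user-static qemu-utils qemu-system-common qemu-block-extra qemu-system-data qemu-system-gui"

# downgrade_cmd VERSION: print one apt-get invocation that pins every
# package in $PKGS to VERSION. Printing instead of running lets you review
# the command first. (Hypothetical helper, for illustration only.)
downgrade_cmd() {
    version=$1
    args=""
    for pkg in $PKGS; do
        args="$args $pkg=$version"
    done
    echo "sudo apt-get install --allow-downgrades$args"
}

downgrade_cmd "1:6.2+dfsg-2ubuntu6.11"
```

After running the printed command, the host (or at least the virtiofsd processes and the VMs) still needs a restart, as described in the comments above.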

Revision history for this message
Sergio Durigan Junior (sergiodj) wrote :

I think I found the problem.

I'm building qemu in a PPA right now. Could you please verify that this build fixes the issue for you?

https://launchpad.net/~sergiodj/+archive/ubuntu/qemu-bug2033957

Thanks.

description: updated
Revision history for this message
Dirk Wilmer (bluescr) wrote :

I tried Sergio's solution on Win10 and Win11 guests. Working for me.

Revision history for this message
Steffen McPrivacy (steffenmp) wrote :

I can confirm the solution is working with Ubuntu guests as well

Revision history for this message
Sergio Durigan Junior (sergiodj) wrote :

Thank you for confirming that the patch solves the issue.

I'll proceed with the SRU, then.

Revision history for this message
Trebacz (david-trebacz) wrote :
Revision history for this message
Sergio Durigan Junior (sergiodj) wrote :

Thanks. I didn't have the time to work on this bug today, but I'll finish the SRU text and upload the package tomorrow.

description: updated
Changed in qemu (Ubuntu Jammy):
status: Triaged → In Progress
tags: added: regression-update
Revision history for this message
Timo Aaltonen (tjaalton) wrote : Please test proposed package

Hello Steffen, or anyone else affected,

Accepted qemu into jammy-proposed. The package will build now and be available at https://launchpad.net/ubuntu/+source/qemu/1:6.2+dfsg-2ubuntu6.14 in a few hours, and then in the -proposed repository.

Please help us by testing this new package. See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Your feedback will aid us getting this update out to other Ubuntu users.

If this package fixes the bug for you, please add a comment to this bug, mentioning the version of the package you tested, what testing has been performed on the package and change the tag from verification-needed-jammy to verification-done-jammy. If it does not fix the bug for you, please add a comment stating that, and change the tag to verification-failed-jammy. In either case, without details of your testing we will not be able to proceed.

Further information regarding the verification process can be found at https://wiki.ubuntu.com/QATeam/PerformingSRUVerification . Thank you in advance for helping!

N.B. The updated package will be released to -updates after the bug(s) fixed by this package have been verified and the package has been in -proposed for a minimum of 7 days.

Changed in qemu (Ubuntu Jammy):
status: In Progress → Fix Committed
tags: added: verification-needed verification-needed-jammy
Revision history for this message
Mirko Meschini (mirkomeschini) wrote :

Same problem here; that's the reason why I rarely update :) VMs that use virtiofs mounts stopped working: connection refused when I cd into the mounted fs.

I manually installed 1:6.2+dfsg-2ubuntu6.11 version of:

qemu-system-x86
qemu-system-common
qemu-system-misc
qemu-system-gui
qemu-system-data
qemu-block-extra
qemu-utils

Now everything is working again, and I have marked all these packages as held.
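Holding packages can be done with apt-mark. This is a small sketch with a hypothetical helper name (hold_cmds); it only prints the commands, including the matching unhold to run once a fixed version is available.

```shell
#!/bin/sh
# hold_cmds PKG...: print the command that freezes the given packages at
# their installed version, and the one that releases them again.
# (Hypothetical helper, for illustration only.)
hold_cmds() {
    echo "sudo apt-mark hold $*"
    echo "sudo apt-mark unhold $*"
}

hold_cmds qemu-system-x86 qemu-system-common qemu-system-misc \
          qemu-system-gui qemu-system-data qemu-block-extra qemu-utils
```

Held packages are skipped by regular upgrades, so remember to unhold them later or the fixed build will not be installed.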

Revision history for this message
Steffen McPrivacy (steffenmp) wrote :

Hello Timo,

thank you for the update.
I will test the package when it is available.

Best regards,

Steffen

Revision history for this message
Steffen McPrivacy (steffenmp) wrote :

Hi Timo,

verification-done-jammy

Verification Steps:
- Installation of packages:
  qemu-system-x86:amd64 (1:6.2+dfsg-2ubuntu6.14)
  qemu-user-static:amd64 (1:6.2+dfsg-2ubuntu6.14)
  qemu-utils:amd64 (1:6.2+dfsg-2ubuntu6.14)
  qemu-system-common:amd64 (1:6.2+dfsg-2ubuntu6.14)
  qemu-block-extra:amd64 (1:6.2+dfsg-2ubuntu6.14)
  qemu-system-data:amd64 (1:6.2+dfsg-2ubuntu6.14)
  qemu-system-gui:amd64 (1:6.2+dfsg-2ubuntu6.14)

- Reboot host and guest

- Testing folder access from 4 guests to the host directories mounted via virtiofsd - SUCCESSFUL
- Reading files from 4 guests with an overall size of 50 GByte from the virtiofsd mounted directories - SUCCESSFUL
- Creating test folders from each of the 4 guests in the virtiofsd mounted directories - SUCCESSFUL
- Writing random data to 10 files for each of the 4 guests with an overall size of 50 GByte to the virtiofsd mounted directories - SUCCESSFUL

For me, all guests are working fine with the patch.

Best regards,

Steffen

tags: added: verification-done-jammy
tags: removed: verification-done-jammy verification-needed-jammy
tags: added: verification-done-jammy
Revision history for this message
Ubuntu SRU Bot (ubuntu-sru-bot) wrote : Autopkgtest regression report (qemu/1:6.2+dfsg-2ubuntu6.14)

All autopkgtests for the newly accepted qemu (1:6.2+dfsg-2ubuntu6.14) for jammy have finished running.
The following regressions have been reported in tests triggered by the package:

ganeti/3.0.2-1ubuntu1 (armhf)
livecd-rootfs/2.765.24 (s390x)
systemd/249.11-0ubuntu3.10 (amd64)

Please visit the excuses page listed below and investigate the failures, proceeding afterwards as per the StableReleaseUpdates policy regarding autopkgtest regressions [1].

https://people.canonical.com/~ubuntu-archive/proposed-migration/jammy/update_excuses.html#qemu

[1] https://wiki.ubuntu.com/StableReleaseUpdates#Autopkgtest_Regressions

Thank you!

Revision history for this message
Steffen McPrivacy (steffenmp) wrote :

Hi Timo,

I had to build another guest; verification-done-jammy is still working for me.

Verification Steps:
- Installation of packages (done 2023-09-15):
  qemu-system-x86:amd64 (1:6.2+dfsg-2ubuntu6.14)
  qemu-user-static:amd64 (1:6.2+dfsg-2ubuntu6.14)
  qemu-utils:amd64 (1:6.2+dfsg-2ubuntu6.14)
  qemu-system-common:amd64 (1:6.2+dfsg-2ubuntu6.14)
  qemu-block-extra:amd64 (1:6.2+dfsg-2ubuntu6.14)
  qemu-system-data:amd64 (1:6.2+dfsg-2ubuntu6.14)
  qemu-system-gui:amd64 (1:6.2+dfsg-2ubuntu6.14)

- Reboot host and guest (done 2023-09-15)

- Creating new guest S-Bac-09 and including the following XML-Code:
...
  <memoryBacking>
    <hugepages>
      <page size='2' unit='M'/>
    </hugepages>
    <access mode='shared'/>
  </memoryBacking>
....
    <filesystem type='mount' accessmode='passthrough'>
      <driver type='virtiofs' queue='1024'/>
      <binary path='/usr/lib/qemu/virtiofsd' xattr='on'>
        <cache mode='always'/>
        <lock posix='on' flock='on'/>
      </binary>
      <source dir='/Da/n/W/w'/>
      <target dir='W'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x08' function='0x0'/>
    </filesystem>
...

- Starting guest
- include mount parameter in fstab
- mount /a
- Access test to virtfolder - SUCCESSFUL
- guest reboot
- Access test to virtfolder - SUCCESSFUL

For me, the patch is working fine.

Best regards,

Steffen

Revision history for this message
Christian Ehrhardt (paelzer) wrote :

FYI: The autopkgtest regressions reported in comment #28 were flaky tests, all of which are resolved by now.

tags: added: verification-done
removed: verification-needed
Revision history for this message
tHe-BiNk (thebinkonline) wrote :

Noob here. I have 2 VMs that have the "refused connection" issue.

How do I install these updated packages, which are suggested, before they go "live?" Thanks in advance.

Revision history for this message
JohnJay (johnjay) wrote :

@thebinkonline

Easiest is to add this PPA and then apt upgrade: https://launchpad.net/~sergiodj/+archive/ubuntu/qemu-bug2033957
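For anyone unsure how to add a PPA from its web page, the sketch below converts the Launchpad URL into the ppa:owner/name form that add-apt-repository accepts and prints the commands to run. The helper name ppa_spec is hypothetical and the URL pattern is assumed to follow the usual Launchpad PPA layout.

```shell
#!/bin/sh
# ppa_spec URL: turn a Launchpad PPA page URL into the ppa:owner/name form.
# Prints nothing if the URL does not match the expected layout.
# (Hypothetical helper, for illustration only.)
ppa_spec() {
    echo "$1" | sed -n 's|^https://launchpad\.net/~\([^/]*\)/+archive/ubuntu/\([^/]*\)/*$|ppa:\1/\2|p'
}

spec=$(ppa_spec "https://launchpad.net/~sergiodj/+archive/ubuntu/qemu-bug2033957")

# Print the commands instead of running them, so they can be reviewed first.
echo "sudo add-apt-repository $spec"
echo "sudo apt upgrade"
```

Once the fixed package reaches the regular -updates pocket, the PPA is no longer needed and can be removed again.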

Revision history for this message
tHe-BiNk (thebinkonline) wrote :

@JohnJay Thank you. Worked like a charm!

Revision history for this message
Brian Turek (brian-turek) wrote :

Just to add a +1 that the packages in jammy-proposed fixed the issue for me. I installed:

qemu-block-extra 1:6.2+dfsg-2ubuntu6.14
qemu-system-common 1:6.2+dfsg-2ubuntu6.14
qemu-system-data 1:6.2+dfsg-2ubuntu6.14
qemu-system-gui 1:6.2+dfsg-2ubuntu6.14
qemu-system-x86 1:6.2+dfsg-2ubuntu6.14
qemu-utils 1:6.2+dfsg-2ubuntu6.14

After a reboot, my VMs came up with my virtiofs mounts successfully mounted.

As a side note, the updated/fixed qemu-system-data didn't install automatically when selecting qemu-system-x86 whereas all of the other dependencies did. Looking at the dependencies of qemu-system-x86, most of them are pinned to a matching version but qemu-system-data is a ">>".

Revision history for this message
David Hedlund (g-public) wrote :

A simple way to solve this is to temporarily upgrade to Linux 6.2:
* Ubuntu 22.04 Virtual Machine: sudo apt install linux-generic-hwe-22.04
* Trisquel 11 Virtual Machine: sudo apt install linux-generic-hwe-11.0

Don't forget to reboot the Virtual Machine.

Revision history for this message
Steffen McPrivacy (steffenmp) wrote :

@David, the virtiofsd mounting issue is independent of the Linux 6.2 kernel on my host and guest machines. It requires the fix from Sergio.

Revision history for this message
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package qemu - 1:6.2+dfsg-2ubuntu6.14

---------------
qemu (1:6.2+dfsg-2ubuntu6.14) jammy; urgency=medium

  * d/u/lp-2033957-virtiofsd-Fix-breakage-due-to-fuse_init_in.patch:
    Fix virtiofsd breakage due to fuse_init_in size change, which
    happened because of the Linux kernel 5.17 headers that were
    imported in a previous patch. (LP: #2033957)

 -- Sergio Durigan Junior <email address hidden> Tue, 05 Sep 2023 22:58:36 -0400

Changed in qemu (Ubuntu Jammy):
status: Fix Committed → Fix Released
Revision history for this message
Brian Murray (brian-murray) wrote : Update Released

The verification of the Stable Release Update for qemu has completed successfully and the package is now being released to -updates. Subsequently, the Ubuntu Stable Release Updates Team is being unsubscribed and will not receive messages about this bug report. In the event that you encounter a regression using the package from -updates please report a new bug using ubuntu-bug and tag the bug report regression-update so we can easily find any regressions.

Revision history for this message
Steffen McPrivacy (steffenmp) wrote :

A huge thank you for solving this issue in a really passionate and effective manner!

Revision history for this message
David Hedlund (g-public) wrote :

Works for me now. Thanks a bunch!
