AMD impish Libvirt7.6 nested amd-v, rbd storage broken

Bug #1943729 reported by sascha arthur
This bug affects 1 person
Affects            Status    Importance    Assigned to    Milestone
ceph (Ubuntu)      Expired   Undecided     Unassigned
libvirt (Ubuntu)   Expired   Undecided     Unassigned
linux (Ubuntu)     Expired   Undecided     Unassigned
qemu (Ubuntu)      Expired   Undecided     Unassigned

Bug Description

Hello,

I'm having issues with the RBD driver on AMD. I'm running the latest impish, using libvirt7.6ubuntu1.

I have the following setup (not sure if it matters):

Dedicated-Host (AMD) -> Level1-KVM (cpu-pass-through, rbd) -> "Customer-VM"=CVM

ubuntu20.04 -> impish -> custom
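
Since the setup relies on nested AMD-V, one precondition worth ruling out on the dedicated host is whether nesting is switched on in the kvm_amd module. A minimal sketch of that check, assuming the usual sysfs location for the module parameter; the function name `nested_enabled` is made up for illustration, and depending on kernel version the file reports `1` or `Y`:

```python
from pathlib import Path

# Assumed location of the kvm_amd "nested" module parameter on the host.
NESTED_PARAM = Path("/sys/module/kvm_amd/parameters/nested")


def nested_enabled(param_path: Path = NESTED_PARAM) -> bool:
    """Return True if the parameter file exists and reports nesting as on."""
    try:
        value = param_path.read_text().strip()
    except OSError:
        # Module not loaded (or not an AMD host): treat as not enabled.
        return False
    return value in ("1", "Y", "y")


if __name__ == "__main__":
    print("nested AMD-V enabled:", nested_enabled())
```

This only confirms the host side; it says nothing about how the Level1-KVM guest then handles RBD I/O.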

What happens?

It seems Level1-KVM is struggling with RBD storage access, preventing the CVM's operating system from fully booting. About half of the boot process goes through, but later, at a random point in time (while still booting the kernel), it seems to deadlock, leaving the CVM's CPU spinning endlessly.

I verified the following test setups to isolate the issue:

Dedicated-Host (Intel) -> Level1-KVM (impish=7.6,cpu-pass-through, rbd=virtio) -> CVM = works

Dedicated-Host (AMD) -> Level1-KVM (impish=7.6,cpu-pass-through, direct storage=virtio) -> CVM = works

Dedicated-Host (Intel) -> Level1-KVM (focal=6.0,cpu-pass-through, rbd=virtio) -> CVM = works

Dedicated-Host (AMD) -> Level1-KVM (focal=6.0,cpu-pass-through, rbd=virtio) -> CVM = works

Dedicated-Host (AMD) -> Level1-KVM (focal=6.0,cpu-pass-through, direct storage=virtio) -> CVM = works

-----------------------------------------------

Dedicated-Host (AMD) -> Level1-KVM (impish=7.6,cpu-pass-through, rbd=virtio) -> CVM = stuck

Based on those tests, the issue seems IMO to be located in libvirt 7.6. Sadly, I'm not able to check it against 7.4 (which was briefly available for impish, but has already been removed from the repo mirror).

Any ideas what could make AMD passthrough break RBD access, leaving the CVM stuck accessing storage?
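
The six test setups above can be written down as data to see which single change turns the failing run into a working one. This is only a sketch of the matrix as reported; the names `Run`, `RUNS` and `single_factor_fixes` are made up for illustration, and "direct" stands for the "direct storage" runs.

```python
from typing import NamedTuple


class Run(NamedTuple):
    host_cpu: str      # CPU vendor of the dedicated host
    l1_release: str    # Ubuntu release of the Level1-KVM (impish=7.6, focal=6.0)
    storage: str       # "rbd" or "direct", both attached via virtio
    cvm_boots: bool    # True = "works", False = "stuck"


# The test matrix exactly as listed in the report.
RUNS = [
    Run("Intel", "impish", "rbd", True),
    Run("AMD", "impish", "direct", True),
    Run("Intel", "focal", "rbd", True),
    Run("AMD", "focal", "rbd", True),
    Run("AMD", "focal", "direct", True),
    Run("AMD", "impish", "rbd", False),
]

FACTORS = ("host_cpu", "l1_release", "storage")


def single_factor_fixes(runs):
    """For each failing run, list (factor, value) changes that alone make it work."""
    fixes = {}
    for bad in (r for r in runs if not r.cvm_boots):
        found = []
        for good in (r for r in runs if r.cvm_boots):
            diff = [f for f in FACTORS if getattr(bad, f) != getattr(good, f)]
            if len(diff) == 1:
                found.append((diff[0], getattr(good, diff[0])))
        fixes[bad] = found
    return fixes


if __name__ == "__main__":
    for bad, found in single_factor_fixes(RUNS).items():
        print(bad, "->", found)
```

Every one of the three factors has a working neighbour, so the hang needs the combination of AMD, impish and RBD. Note that the release factor covers every package that differs between focal and impish, not just libvirt.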

sascha arthur (sarthur)
description: updated
Christian Ehrhardt  (paelzer) wrote :

Hi Sascha,
it does not ring a bell for me :-/
Thanks for all the cross checks already.

While I don't yet see why you expect this to be libvirt, I can help you gain access to other components to check. With the little that we know so far, it could just as well be qemu (for the guest setup), the kernel (for the KVM bits there), or ceph (for RBD handling).
I'm going to add bug tasks for those until we know where we have to look.

For any Ubuntu package you can always check the publishing history which gives you access to all former builds. They are not in the apt repository (you only find the latest build there), but they are still available.

So look at
https://launchpad.net/ubuntu/+source/libvirt/+publishinghistory
https://launchpad.net/ubuntu/+source/qemu/+publishinghistory
https://launchpad.net/ubuntu/+source/ceph/+publishinghistory
https://launchpad.net/ubuntu/+source/linux/+publishinghistory

And try to use former builds; some of them might force you to also pull in other dependencies (like an older glibc, which unfortunately might change .. a lot).

If that does not work, or even better if we have a clearer reason why a particular component might have issues, we can also rebuild these versions in today's Impish in a PPA. That would eliminate e.g. the glibc pain that I'm afraid will hit you when just using the old builds.
If you have a clear "could I have this" request, I'll try to build it for you, but first I'd ask you to give these publishing history builds a chance.
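
The four publishing-history links above share one URL pattern, so they can be generated per source package when walking through the candidates. A small sketch based only on the URLs listed in this comment; the helper name `publishing_history_url` is made up for illustration:

```python
# URL pattern taken from the four links listed above.
BASE = "https://launchpad.net/ubuntu/+source/{package}/+publishinghistory"

# Source packages that currently have a bug task on this report.
PACKAGES = ("libvirt", "qemu", "ceph", "linux")


def publishing_history_url(package: str) -> str:
    """Return the Launchpad publishing-history page for an Ubuntu source package."""
    return BASE.format(package=package)


if __name__ == "__main__":
    for name in PACKAGES:
        print(publishing_history_url(name))
```

The pages themselves still have to be browsed by hand to pick a former build; the sketch does not download anything.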

Changed in qemu (Ubuntu):
status: New → Incomplete
Changed in linux (Ubuntu):
status: New → Incomplete
Changed in libvirt (Ubuntu):
status: New → Incomplete
Changed in ceph (Ubuntu):
status: New → Incomplete
Launchpad Janitor (janitor) wrote :

[Expired for ceph (Ubuntu) because there has been no activity for 60 days.]

Changed in ceph (Ubuntu):
status: Incomplete → Expired
Launchpad Janitor (janitor) wrote :

[Expired for libvirt (Ubuntu) because there has been no activity for 60 days.]

Changed in libvirt (Ubuntu):
status: Incomplete → Expired
Launchpad Janitor (janitor) wrote :

[Expired for qemu (Ubuntu) because there has been no activity for 60 days.]

Changed in qemu (Ubuntu):
status: Incomplete → Expired
Launchpad Janitor (janitor) wrote :

[Expired for linux (Ubuntu) because there has been no activity for 60 days.]

Changed in linux (Ubuntu):
status: Incomplete → Expired