As I mentioned in the last discussion about this, the one thing that could be done is to make this single-threaded memory initialization a multi-threaded action (in the kernel). I doubt that we can make it skip the initialization entirely. Even the 1G huge page setup, although it is faster, could be more efficient.
To evaluate that, we need to know what exactly it does.
The last calls from userspace before this long single-thread-busy period starts are:
openat(AT_FDCWD, "/dev/vfio/vfio", O_RDWR|O_CLOEXEC) = 57
ioctl(57, VFIO_GET_API_VERSION, 0x80002) = 0
ioctl(57, VFIO_CHECK_EXTENSION, 0x1) = 1
ioctl(57, VFIO_CHECK_EXTENSION, 0x3) = 1
ioctl(56, VFIO_GROUP_SET_CONTAINER, 0x7fff4499dc24) = 0
ioctl(57, VFIO_SET_IOMMU, 0x3) = 0
ioctl(57, VFIO_DEVICE_GET_PCI_HOT_RESET_INFO or VFIO_IOMMU_GET_INFO or VFIO_IOMMU_SPAPR_TCE_GET_INFO, 0x7fff4499dc30) = 0
ioctl(12, KVM_CREATE_DEVICE, 0x7fff4499dbd4) = 0
ioctl(58, KVM_SET_DEVICE_ATTR, 0x7fff4499dbe0) = 0
ioctl(57, VFIO_DEVICE_PCI_HOT_RESET or VFIO_IOMMU_MAP_DMA, 0x7fff4499da90) = 0
ioctl(57, VFIO_DEVICE_PCI_HOT_RESET or VFIO_IOMMU_MAP_DMA, 0x7fff4499da90) = 0
ioctl(57, VFIO_DEVICE_PCI_HOT_RESET or VFIO_IOMMU_MAP_DMA, 0x7fff4499da90) = 0
ioctl(57, VFIO_DEVICE_PCI_HOT_RESET or VFIO_IOMMU_MAP_DMA, 0x7fff4499da90) = 0
ioctl(57, VFIO_DEVICE_PCI_HOT_RESET or VFIO_IOMMU_MAP_DMA, 0x7fff4499da90) = 0
Note: from our last thoughts on this, it is quite possible that the granularity at which this could be split depends on the locks being used, e.g. per node, as arbitrary chunks might just contend on locks and ping-pong them between CPUs.
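To make the "per node" idea a bit more concrete, here is a rough sketch of how a region could be cut along NUMA node boundaries so that each worker only touches memory (and, hopefully, locks) belonging to one node. This is purely illustrative: `split_by_node` and the node layout are hypothetical names and numbers, and the real kernel code works on pages and zones rather than on byte ranges like this.

```python
from typing import List, Tuple

Range = Tuple[int, int]  # half-open byte range [start, end)

GIB = 1 << 30


def split_by_node(region: Range, node_ranges: List[Range]) -> List[Tuple[int, Range]]:
    """Cut `region` into one chunk per NUMA node it overlaps.

    `node_ranges[i]` is the (hypothetical) physical range owned by node i.
    Returns a list of (node_id, sub_range) in node order.
    """
    start, end = region
    chunks = []
    for node_id, (node_start, node_end) in enumerate(node_ranges):
        lo = max(start, node_start)
        hi = min(end, node_end)
        if lo < hi:  # the region really overlaps this node
            chunks.append((node_id, (lo, hi)))
    return chunks


if __name__ == "__main__":
    # Assumed layout: two nodes with 4 GiB each.
    nodes = [(0, 4 * GIB), (4 * GIB, 8 * GIB)]
    for node_id, (lo, hi) in split_by_node((1 * GIB, 6 * GIB), nodes):
        print(f"node {node_id}: {lo // GIB}G..{hi // GIB}G")
```

Each resulting chunk could then be handed to one worker thread. Whether that actually avoids the lock ping-pong depends on which locks the kernel path takes, which is exactly the open question.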
It might be similar to [1], which ended up as [2] after all. I am not sure yet whether the kernel does a reset or a map here.
[1]: https://www.spinics.net/lists/kvm/msg177206.html
[2]: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=e309df5b0c9e67cc929eedd3e32f4907fa49543e