[18.04 FEAT]: qemu: Enable RDMA support in qemu

Bug #1692476 reported by bugproxy on 2017-05-22
Affects                          | Importance | Assigned to
The Ubuntu-power-systems project | Medium     | Canonical Server Team
qemu (Ubuntu)                    | Medium     | Christian Ehrhardt

Bug Description

== Comment: #0 - SRIKANTH B. AITHAL <email address hidden> - 2016-09-22 06:10:33 ==
---Problem Description---
qemu does not support RDMA

Contact Information = <email address hidden>

---uname output---
Linux ltc-hab1 4.8.0-11-generic #12-Ubuntu SMP Sat Sep 17 19:58:16 UTC 2016 ppc64le ppc64le ppc64le GNU/Linux

Machine Type = 8348-21C

---Debugger---
A debugger is not configured

---Steps to Reproduce---
Attempting an RDMA migration throws an error saying it is an unknown migration protocol:

root@ltc-hab1:/var/lib/libvirt/images/sharing# qemu-system-ppc64 \
> --nographic \
> --enable-kvm \
> -vga none \
> -machine pseries \
> -name migrate_qemu \
> -boot strict=on \
> -monitor telnet:127.0.0.1:1234,server,nowait \
> -device nec-usb-xhci,id=usb,bus=pci.0,addr=0xf \
> -device spapr-vscsi,id=scsi0,reg=0x2000 \
> -drive file=/var/lib/libvirt/images/sharing/ubuntu-16.10-ppc64le.qcow2,format=qcow2,if=none,id=drive-virtio-disk0,cache=none \
> -device virtio-blk-pci,scsi=off,bus=pci.0,addr=0x2,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=1 \
> -m 2048 \
> -smp 8,cores=1,threads=4,maxcpus=12 \
> -incoming rdma:0:43000
qemu-system-ppc64: -incoming rdma:0:43000: unknown migration protocol: rdma:0:43000

root@ltc-hab1:/var/lib/libvirt/images/sharing# dpkg -l | grep -i qemu
ii ipxe-qemu 1.0.0+git-20150424.a25a16d-1ubuntu2 all PXE boot firmware - ROM images for qemu
ii qemu-block-extra:ppc64el 1:2.6.1+dfsg-0ubuntu3 ppc64el extra block backend modules for qemu-system and qemu-utils
ii qemu-kvm 1:2.6.1+dfsg-0ubuntu3 ppc64el QEMU Full virtualization
ii qemu-slof 20160223+dfsg-1 all Slimline Open Firmware -- QEMU PowerPC version
ii qemu-system-common 1:2.6.1+dfsg-0ubuntu3 ppc64el QEMU full system emulation binaries (common files)
ii qemu-system-ppc 1:2.6.1+dfsg-0ubuntu3 ppc64el QEMU full system emulation binaries (ppc)
ii qemu-utils 1:2.6.1+dfsg-0ubuntu3 ppc64el QEMU utilities

I see there is no support built for RDMA from the QEMU build log for 16.10 @ https://launchpadlibrarian.net/285349205/buildlog_ubuntu-yakkety-ppc64el.qemu_1%3A2.6.1+dfsg-0ubuntu3_BUILDING.txt.gz

RDMA support no

We need QEMU to be built with RDMA support.
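For reference, a minimal sketch of how RDMA support gets enabled in a QEMU build, assuming the libibverbs/librdmacm dev packages are available at build time (this illustrates upstream QEMU's configure mechanism, not the exact Ubuntu packaging recipe):

```shell
# Illustrative local build, not the Ubuntu package build.
# With the RDMA dev packages installed, QEMU's configure detects them:
sudo apt-get install -y libibverbs-dev librdmacm-dev

./configure --target-list=ppc64-softmmu --enable-rdma
make -j"$(nproc)"

# The configure summary should then report:
#   RDMA support      yes
```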

== Comment: #5 - Leonardo Augusto Guimaraes Garcia <email address hidden> - 2016-10-05 15:31:16 ==
Including Brian King here as I am not sure it will be possible to enable this feature in Ubuntu 17.04.

In order to build QEMU with RDMA support, we would need to have RDMA devel packages installed on the build machine. My understanding is that, for Ubuntu in general, IBM is going with MOFED drivers. However, in the environment Canonical is building QEMU, there is no MOFED, as MOFED is not a package supported by Canonical. If they build QEMU with RDMA support in their environment, it will probably not work with MOFED RDMA support. Could you please advise, Brian?

== Comment: #9 - James E. Sponaugle <email address hidden> - 2017-05-19 14:29:17 ==
Per discussion with Brian, moving this to 18.04.

bugproxy (bugproxy) on 2017-05-22
tags: added: architecture-ppc64le bugnameltc-146635 severity-medium targetmilestone-inin1804
Changed in ubuntu:
assignee: nobody → Ubuntu on IBM Power Systems Bug Triage (ubuntu-power-triage)
affects: ubuntu → qemu (Ubuntu)

Hi,
thank you for the info - from what I read I can only agree with:

"In order to build QEMU with RDMA support, we would need to have RDMA devel packages installed on the build machine. My understanding is that, for Ubuntu in general, IBM is going with MOFED drivers. However, in the environment Canonical is building QEMU, there is no MOFED, as MOFED is not a package supported by Canonical. If they build QEMU with RDMA support in their environment, it will probably not work with MOFED RDMA support."

I haven't checked which RDMA dev libraries you'd need in particular, but in general the assumption is correct: we can and will only build against those packaged in the archive.
And that usually means what is published via http://www.openfabrics.org/.

So what we can provide at the moment is libibverbs-dev at version 1.2.1-2ubuntu1 and librdmacm-dev at 1.1.0-2.

Both are in main, so no MIR will be needed to enable RDMA.
Yet if you consider that useless because you want to drive it via MOFED, there is no big reason to do so.

It is an ongoing issue of OpenFabrics vs. manufacturer libs, but if you think that towards 18.04 you'll be good enabling and building this against the open libs (at that time), I still see no big reason why this should be Ubuntu-only.
Especially since you target 18.04 (and not 17.10), the timing should be OK. So I'd kindly ask you to open a bug with Debian to enable it and report the bug number here.

Changed in qemu (Ubuntu):
status: New → Incomplete
Frank Heimes (fheimes) on 2017-05-22
Changed in ubuntu-power-systems:
assignee: nobody → Canonical Server Team (canonical-server)
Manoj Iyer (manjo) on 2017-06-01
tags: added: ubuntu-18.04

------- Comment From <email address hidden> 2017-06-08 13:17 EDT-------
We discussed this with Mellanox. They indicated that since they provide verbs compatibility between upstream and MOFED, we should be able to build qemu with inbox rdma libs and, even if we load MOFED on top of this, we should not have issues.

tags: added: p9-virt-stack

Ping - since 2.10 will likely be too late to make 17.10, polling for related commit IDs for P9 support - also, this is maybe ~18.04 material.

bugproxy (bugproxy) on 2017-06-14
tags: removed: bugnameltc-146635 p9-virt-stack severity-medium ubuntu-18.04
Frank Heimes (fheimes) on 2017-06-14
Changed in ubuntu-power-systems:
status: New → Incomplete
bugproxy (bugproxy) wrote :

------- Comment From <email address hidden> 2017-06-14 10:32 EDT-------
(In reply to comment #15)
> Ping - Since 2.10 likely will be too late to make 17.10 polling for related
> commit ID's for p9 support - also this is maybe ~18.04 material.

This is Ubuntu 18.04 only right now.

tags: added: bugnameltc-146635 severity-medium
tags: added: p9-virt-stack-18.04
tags: added: ubuntu-18.04
removed: p9-virt-stack-18.04
Manoj Iyer (manjo) on 2017-07-19
Changed in qemu (Ubuntu):
importance: Undecided → Medium
Changed in ubuntu-power-systems:
importance: Undecided → Medium
Manoj Iyer (manjo) on 2017-07-24
tags: added: triage-g
Manoj Iyer (manjo) on 2017-09-25
Changed in qemu (Ubuntu):
assignee: Ubuntu on IBM Power Systems Bug Triage (ubuntu-power-triage) → David Britton (davidpbritton)
bugproxy (bugproxy) wrote :

------- Comment From <email address hidden> 2018-01-23 02:23 EDT-------
Listed below are the configuration steps and the test scenarios for validating this feature:

Configuration
================

Physical connections:

RoCE: ensure both ports of each RoCE card are plugged into the RoCE switch.

If there are no free ports available on the switch, you can make the connections directly, i.e. back-to-back.

The packages below need to be installed for the travis3EN/travisEN RoCE tests.

# yum install libibverbs-utils.ppc64
# yum install librdmacm-utils.ppc64
# yum install libmlx4
# yum install librdmacm
# yum install libibverbs
# yum install infiniband-diags
# yum install opensm

Ensure the below drivers are loaded prior to tests.

# modprobe ib_uverbs
# modprobe ib_ucm
# modprobe ib_cm
# modprobe ib_mad
# modprobe ib_sa
# modprobe ib_umad
# modprobe ib_addr
# modprobe rdma_cm
# modprobe rdma_ucm
# modprobe ib_core
If using travis IB, the additional driver below needs to be loaded:
# modprobe ib_ipoib

Start the services:
#service opensm start
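Before running the scenarios, a few sanity checks can confirm the stack is up (illustrative; the output depends on the adapter actually installed):

```shell
# Confirm the core RDMA modules are loaded
lsmod | grep -E 'rdma_cm|ib_uverbs'

# List the verbs devices visible to userspace (from libibverbs-utils)
ibv_devices

# Show port state and GUIDs (from infiniband-diags)
ibstat
```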

Test Scenarios:
===================

1. rping :
This covers RDMA_CM RC connections, but only userspace
(It establishes a set of reliable RDMA connections between two nodes using the librdmacm, optionally transfers data between the nodes, then disconnects).

2. udaddy:
This covers RDMA_CM UD connections.
(It establishes a set of unreliable RDMA datagram communication paths between two nodes using the librdmacm,
optionally transfers datagrams between the nodes, then tears down the communication.)

3. rdma_server, rdma_client commands:
These commands form a simple RDMA CM connection and ping-pong test.
(They use synchronous librdmacm calls to establish an RDMA connection between two nodes.)

4. ib_send_bw:
Send bandwidth performance test.

5. ucmatose:
This covers RDMA_CM RC connections, but only userspace (same as rping)
(It establishes a set of reliable RDMA connections between two nodes using the librdmacm, optionally transfers data between the nodes, then disconnects).

6. krping :
The krping module is a kernel loadable module that utilizes the Open Fabrics verbs to implement a client/server ping/pong program.

7. ibv_uc_pingpong:
This command runs a simple ping-pong test over the unreliable connected (UC) transport; the -g option (GID index) needs to be used.
The command is client-server: a remote node is configured as the server, while a local node acts as the client.

8. ibv_rc_pingpong:
Runs a simple ping-pong test via the reliable connected (RC) transport; the -g option (GID index) needs to be used.
The command is client-server: a remote node is configured as the server, while a local node acts as the client.

9. ibv_srq_pingpong:
This command runs a simple ping-pong test through the reliable connected (RC) transport, using multiple queue pairs and a single shared receive queue (SRQ).
The command is client-server: a remote node is configured as the server, while a local node acts as the client.

10. mckey:
RDMA multicast test using the librdmacm; the multicast ping should complete successfully between client and server.

11. ibv_ud_pingpong:
Conducts a simple ping-pong test over the unreliable datagram (UD) transport.

12. ucmatose:
Establishes a set of reliable RDMA connections between two nodes using the librdma...
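As an illustration of scenario 1, a typical rping run looks like the following; 192.0.2.10 is a placeholder for the server's RoCE/IB address, not a value from this bug:

```shell
# On the server node: listen for RDMA CM connections (verbose output)
rping -s -a 192.0.2.10 -v

# On the client node: connect and run 10 ping-pong iterations
rping -c -a 192.0.2.10 -v -C 10
```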


We have just worked on pulling the new and combined rdma packages into main, see bug 1732892.
I will try to build a version of qemu with that support enabled, but I'd need IBM to evaluate whether it works for you, as I lack the required HW setup to e.g. do an RDMA migration.

I'll ping here once an experimental build is available (or if there are any show stoppers).
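For context, once RDMA support is compiled in, the migration would be driven roughly as follows (a sketch; "dest-host" and port 4444 are placeholders, not values from this bug):

```shell
# Build the migration URI used by both sides (placeholder host/port)
DEST=dest-host
PORT=4444
URI="rdma:${DEST}:${PORT}"

# Destination host: start the target QEMU listening for the RDMA stream, e.g.
#   qemu-system-ppc64 ... -incoming "rdma:0.0.0.0:${PORT}"
#
# Source host: on the QEMU monitor (telnet:127.0.0.1:1234 in the reproducer
# above), start a detached migration:
#   migrate -d "rdma:${DEST}:${PORT}"
echo "$URI"
```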

tags: added: qemu-18.04
Changed in qemu (Ubuntu):
status: Incomplete → In Progress
assignee: David Britton (davidpbritton) → Christian Ehrhardt (paelzer)
Manoj Iyer (manjo) on 2018-02-05
Changed in ubuntu-power-systems:
status: Incomplete → In Progress

qemu 2.11 is in proposed

Changed in qemu (Ubuntu):
status: In Progress → Fix Committed
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package qemu - 1:2.11+dfsg-1ubuntu1

---------------
qemu (1:2.11+dfsg-1ubuntu1) bionic; urgency=medium

  * Merge with Debian testing, among other fixes this includes
    - fix fatal error on negative maxcpus (LP: #1722495)
    - fix segfault on dump-guest-memory on guests without memory (LP: #1723381)
    - linux user threading issues (LP: #1350435)
    - TOD-Clock Epoch Extension Support on s390x (LP: #1732691)
    Remaining changes:
    - qemu-kvm to systemd unit
      - d/qemu-kvm-init: script for QEMU KVM preparation modules, ksm,
        hugepages and architecture specifics
      - d/qemu-kvm.service: systemd unit to call qemu-kvm-init
      - d/qemu-system-common.install: install systemd unit and helper script
      - d/qemu-system-common.maintscript: clean old sysv and upstart scripts
      - d/qemu-system-common.qemu-kvm.default: defaults for
        /etc/default/qemu-kvm
      - d/rules: install /etc/default/qemu-kvm
    - Enable nesting by default
      - set nested=1 module option on intel. (is default on amd)
      - re-load kvm_intel.ko if it was loaded without nested=1
      - d/p/ubuntu/expose-vmx_qemu64cpu.patch: expose nested kvm by default
        in qemu64 cpu type.
      - d/p/ubuntu/enable-svm-by-default.patch: Enable nested svm by default
        in qemu64 on amd
    - libvirt/qemu user/group support
      - qemu-system-common.postinst: remove acl placed by udev, and add udevadm
        trigger.
      - qemu-system-common.preinst: add kvm group if needed
    - Distribution specific machine type
      - d/p/ubuntu/define-ubuntu-machine-types.patch: define distro machine
        types to ease future live vm migration.
      - d/qemu-system-x86.NEWS Info on fixed machine type definitions
    - improved dependencies
      - Make qemu-system-common depend on qemu-block-extra
      - Make qemu-utils depend on qemu-block-extra
      - let qemu-utils recommend sharutils
    - s390x support
      - Create qemu-system-s390x package
      - Include s390-ccw.img firmware
      - Enable numa support for s390x
    - ppc64[le] support
      - d/qemu-system-ppc.links provide usr/bin/qemu-system-ppc64le symlink
    - arch aware kvm wrappers
  * Added Changes
    - update VCS-git to match the bionic branch
    - sdl2 is yet too unstable for the LTS Ubuntu release given the reports
      we still see upstream and in Debian - furthermore sdl2 isn't in main yet,
      so we revert related changes to stick with the proven for now:
      - 0fd25810 - do not build-depend on libx11-dev (libsdl2-dev already
                   depends on it)
      - 9594f820 - switch from sdl1.2 to sdl2 (#870025)
    - d/qemu-system-x86.README.Debian: document intention of nested being
      default is comfort, not full support
    - update Ubuntu machine types for qemu 2.11
    - qemu-guest-agent: freeze-hook fixes (LP: #1484990)
      - d/p/guest-agent-freeze-hook-skip-dpkg-artifacts.patch
      - d/qemu-guest-agent.install: provide /etc/qemu/fsfreeze-hook
      - d/qemu-guest-agent.dirs: provide /etc/qemu/fsfreeze-hook.d
    - Create and install pxe netboot images for KVM s390x (LP: #1732094)
      - d/rules enable install s390x-netbo...


Changed in qemu (Ubuntu):
status: Fix Committed → Fix Released
Manoj Iyer (manjo) on 2018-02-12
Changed in ubuntu-power-systems:
status: In Progress → Fix Released
bugproxy (bugproxy) on 2018-04-19
tags: added: targetmilestone-inin1810
removed: targetmilestone-inin1804
bugproxy (bugproxy) on 2018-08-17
tags: added: targetmilestone-inin1804
removed: targetmilestone-inin1810

I beg your pardon, it was released in 18.04 - why change the milestone to 18.10?
