mlnx infiniband mechanism error on neutron

Bug #2019999 reported by Federico Pinca
This bug affects 1 person
Affects  Status   Importance  Assigned to  Milestone
kolla    Expired  Undecided   Unassigned

Bug Description

Hi,

I think some packages are missing from the Ubuntu OpenStack Zed containers, which prevents the mlnx_infiniband mechanism from loading and starting correctly.

I have not fully investigated, but a first test suggests that the apt package libzmq5 is not present in the following containers:
neutron_eswitchd
neutron_sriov_agent
neutron_server

and the pip package networking-mlnx is missing from this container:
neutron_server
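
A minimal sketch of how the check described above could be repeated, assuming the docker CLI is available on the host and that `dpkg` and `pip` resolve on the PATH inside each container (neither assumption comes from the report). The helper names `build_checks` and `find_missing` are hypothetical.

```python
import subprocess

# Containers and packages named in the report. The commands used to check
# them are an assumption about how one might verify, not part of the report.
APT_CONTAINERS = ["neutron_eswitchd", "neutron_sriov_agent", "neutron_server"]
PIP_CONTAINERS = ["neutron_server"]


def build_checks():
    """Return (container, package, argv) tuples, one per check."""
    checks = []
    for name in APT_CONTAINERS:
        # `dpkg -s` exits non-zero when the package is not installed.
        argv = ["docker", "exec", name, "dpkg", "-s", "libzmq5"]
        checks.append((name, "libzmq5", argv))
    for name in PIP_CONTAINERS:
        # `pip show` exits non-zero when the distribution is not installed.
        argv = ["docker", "exec", name, "pip", "show", "networking-mlnx"]
        checks.append((name, "networking-mlnx", argv))
    return checks


def find_missing(run=subprocess.run):
    """Run every check and return the (container, package) pairs that failed."""
    missing = []
    for container, package, argv in build_checks():
        result = run(argv, capture_output=True, text=True)
        if result.returncode != 0:
            missing.append((container, package))
    return missing
```

Calling `find_missing()` on the host returns the pairs whose check failed. A non-zero exit can also mean the container is not running at that moment (for example, while it is restarting), so a failed check is a hint rather than proof that the package is absent.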

Ubuntu 22.04.2 LTS
5.15.0-72-generic #79-Ubuntu SMP x86_64
Docker version 23.0.6, build ef23cbc

images from:
quay.io/openstack.kolla/neutron-server:zed-ubuntu-jammy

Michal Nasiadka (mnasiadka) wrote :

Since we have no option to test it out upstream - would you be willing to provide a patch?

Changed in kolla:
status: New → Incomplete
Federico Pinca (clash00) wrote :

I'll try. I'll set up a kolla dev environment to build the containers. Meanwhile, I have patched the containers manually in my deployment, and I'll test whether SR-IOV + InfiniBand works correctly in neutron with the added packages.

If that's OK, I'll work on the patch. I think we need to add the apt and pip packages to the Dockerfile templates, correct?

The errors in the containers were:
neutron_server: "mlnx_infiniband mechanism not found", and the container kept restarting. After installing the pip package, the container seems fine.
neutron_eswitchd: "libmzq.so.5 not found", and the container kept crashing and restarting. After installing the apt package, the container seems fine.
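
The two failures above can be probed from inside a container with a short script. This is a sketch under assumptions: to the best of my knowledge Neutron ML2 discovers mechanism drivers through Python entry points in the `neutron.ml2.mechanism_drivers` group, and the script must be run with the same Python interpreter that neutron uses in the container. The function names are hypothetical.

```python
import ctypes
from importlib.metadata import entry_points

# Assumed entry-point group used by Neutron ML2 to discover mechanism drivers.
ML2_DRIVER_GROUP = "neutron.ml2.mechanism_drivers"


def registered_drivers(group=ML2_DRIVER_GROUP):
    """Return the sorted names of all entry points registered in `group`."""
    eps = entry_points()
    if hasattr(eps, "select"):
        # Python 3.10 and later: selectable entry points.
        selected = eps.select(group=group)
    else:
        # Python 3.8 and 3.9: a plain dict keyed by group name.
        selected = eps.get(group, [])
    return sorted({ep.name for ep in selected})


def can_load(soname):
    """Return True if the dynamic loader can open the shared library."""
    try:
        ctypes.CDLL(soname)
    except OSError:
        return False
    return True
```

If the assumption about the group holds, `"mlnx_infiniband" in registered_drivers()` should be false before networking-mlnx is installed and true afterwards, and `can_load("libzmq.so.5")` should report whether the library that libzmq5 provides is visible to the loader.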

bye

Federico Pinca (clash00) wrote :

Sorry, that should be libzmq.so.5, not libmzq.

Federico Pinca (clash00) wrote :

Hi,
to reproduce the bug, deploy OpenStack with kolla-ansible.

main settings to reproduce:
openstack_tag: zed-ubuntu-jammy
kolla_source_version: stable/zed
kolla_base_distro: ubuntu
enable_openstack_core: 'yes'
enable_neutron_sriov: "yes"
enable_neutron_mlnx: "yes"

Even if you don't have a Mellanox InfiniBand/Ethernet interface with SR-IOV enabled, you can see the containers restarting with errors that are caused by the missing packages and libraries rather than by the devices.
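
One way to observe the restart loop described above is to read each container's restart counter and its most recent output. The sketch below assumes the docker CLI on the host; `restart_count` and `last_log_lines` are hypothetical helper names, and kolla services may write most of their logs to files rather than to the container's output, so the `docker logs` part may show only start-up messages.

```python
import subprocess


def restart_count(container, run=subprocess.run):
    """Return how many times Docker has restarted the container."""
    argv = ["docker", "inspect", "--format", "{{.RestartCount}}", container]
    result = run(argv, capture_output=True, text=True, check=True)
    return int(result.stdout.strip())


def last_log_lines(container, lines=20, run=subprocess.run):
    """Return the last `lines` lines of the container's output."""
    argv = ["docker", "logs", "--tail", str(lines), container]
    result = run(argv, capture_output=True, text=True)
    # The container's stdout and stderr come back as separate streams.
    return (result.stdout + result.stderr).splitlines()
```

A counter that keeps growing for neutron_server or neutron_eswitchd, together with the "not found" messages quoted earlier in this report, would match the behaviour described here.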

Launchpad Janitor (janitor) wrote :

[Expired for kolla because there has been no activity for 60 days.]

Changed in kolla:
status: Incomplete → Expired