[NFV] Deployment with enabled SR-IOV fails: virtual functions aren't configured by netconfig

Bug #1556854 reported by Artem Panchenko on 2016-03-14
This bug affects 2 people
Affects              Importance   Assigned to
Fuel for OpenStack   High         Alexander Adamov
Mitaka               High         Alexander Adamov

Bug Description

Fuel version info (9.0 liberty): http://paste.openstack.org/show/490348/

Deployment with enabled SR-IOV on compute NIC fails:

2016-03-14 08:52:40 DEBUG [60935] Node 7(sriov_iommu_check) status: error
2016-03-14 08:52:40 DEBUG [60935] Node 7 has failed to deploy. There is no more retries for puppet run.
2016-03-14 08:52:40 DEBUG [60935] {"nodes"=>[{"status"=>"error", "error_type"=>"deploy", "uid"=>"7", "role"=>"sriov_iommu_check"}]}
2016-03-14 08:52:40 DEBUG [60935] Task time summary: sriov_iommu_check with status failed on node 7 took 00:00:08

root@node-7:~# ruby /etc/puppet/modules/osnailyfacter/modular/netconfig/sriov_iommu_check.rb
ERROR: Was not able to check SR-IOV and IOMMU for eno2 interface

Steps to reproduce:

1. Create environment (Neutron VLAN)
2. Add 1 controller and 1 compute with SR-IOV compatible NICs
3. Enable SR-IOV for some NICs on compute, set sriov_numvfs to 1
4. Deploy changes

Expected result: env is deployed and passes health checks

Actual result: deployment fails on compute (task sriov_iommu_check)

The SR-IOV checker returns a non-zero exit code because VFs aren't configured for the target NICs:

http://paste.openstack.org/show/490350/
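
To see whether VFs are configured on a NIC, the kernel's `sriov_numvfs` and `sriov_totalvfs` sysfs attributes can be read directly. The helper below is only a minimal sketch and is not the logic of sriov_iommu_check.rb; the function name `vf_state` and the `SYSFS_NET` override are hypothetical and exist only for illustration and testing.

```sh
#!/bin/sh
# Hypothetical helper, NOT the logic of sriov_iommu_check.rb.
# SYSFS_NET may be overridden for testing; it defaults to the real sysfs path.
SYSFS_NET="${SYSFS_NET:-/sys/class/net}"

# vf_state IFACE
# Prints "IFACE: <numvfs>/<totalvfs> VFs enabled".
# Returns 0 if at least one VF is enabled, 1 if none are enabled,
# and 2 if the SR-IOV sysfs attributes cannot be read.
vf_state() {
    dev="$SYSFS_NET/$1/device"
    if [ ! -r "$dev/sriov_numvfs" ] || [ ! -r "$dev/sriov_totalvfs" ]; then
        echo "$1: SR-IOV sysfs attributes not found" >&2
        return 2
    fi
    numvfs=$(cat "$dev/sriov_numvfs")
    totalvfs=$(cat "$dev/sriov_totalvfs")
    echo "$1: $numvfs/$totalvfs VFs enabled"
    [ "$numvfs" -gt 0 ]
}
```

On the failing compute node this would be called as `vf_state eno2`; a return code of 1 means the interface supports SR-IOV but no VFs have been created.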

Aleksandr Didenko (adidenko) wrote :

It looks like a hardware-related issue: the system does not allow VFs to be enabled:

echo 1 > /sys/class/net/eno2/device/sriov_numvfs
-bash: echo: write error: Cannot allocate memory

[ 745.038609] igb 0000:07:00.1: SR-IOV: bus number out of range

After adding the "pci=assign-busses" kernel option, it started to work and VFs could be enabled. But the sriov-iommu check still fails because of this:

http://paste.openstack.org/show/490354/
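
Before retrying the VF setup, it can help to confirm that the running kernel was actually booted with the option. The following is a minimal sketch that looks for an exact word in `/proc/cmdline`; the function name `has_kernel_opt` and its optional file argument are hypothetical and are there only so the check can be tried against a sample file.

```sh
#!/bin/sh
# has_kernel_opt OPTION [CMDLINE_FILE]
# Returns 0 if OPTION appears as a whole word on the kernel command line.
# CMDLINE_FILE is a hypothetical override for testing; default is /proc/cmdline.
has_kernel_opt() {
    # Split the command line into one word per line, then match the option exactly.
    tr -s ' \t' '\n\n' < "${2:-/proc/cmdline}" | grep -Fxq -- "$1"
}
```

For example, `has_kernel_opt pci=assign-busses && echo present` prints `present` only when the option is on the running kernel's command line.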

Aleksandr Didenko (adidenko) wrote :

HW info

Mainboard
  Manufacturer: Supermicro
  Product Name: X10SRW-F
CPU
  Intel(R) Xeon(R) CPU E5-2620 v3 @ 2.40GHz
Problem NIC
  Intel Corporation I350 Gigabit Network Connection (lspci info http://paste.openstack.org/show/490358/ )

Changed in fuel:
assignee: Fuel Library Team (fuel-library) → Aleksandr Didenko (adidenko)
Vladimir Eremin (yottatsa) wrote :

This is a known bug with the Linux kernel and some hardware platforms. sriov_iommu_check was introduced to check for this case proactively. The case is described in https://bugzilla.redhat.com/show_bug.cgi?id=1113399 and there is an opinion that the patch should not be used without caution.

The bug should be marked as invalid, and the case should be covered in Known Issues with an appropriate solution.

tags: added: area-docs
Changed in fuel:
status: New → Invalid
Aleksandr Didenko (adidenko) wrote :

Forwarded to docs team to add this feature limitation to release notes for 9.0

tags: added: release-notes
tags: removed: team-network
Changed in fuel:
assignee: Aleksandr Didenko (adidenko) → Fuel Documentation Team (fuel-docs)
status: Invalid → Confirmed
Vladimir Eremin (yottatsa) wrote :

A differential Google search for
* SR-IOV "ARIHierarchy-"
* SR-IOV "ARIHierarchy+"
shows that the ARI Capable Hierarchy bit in the IOV capabilities should be set[1]: `lspci -vvv | grep 'ARIHierarchy+'`

Now I'm investigating it[2].

[1]: http://kavi.pcisig.com/developers/main/training_materials/get_document?doc_id=d9c905f5861c44c497a94f327539eb63ac49b191
[2]: https://blogs.technet.microsoft.com/jhoward/2012/03/13/everything-you-wanted-to-know-about-sr-iov-in-hyper-v-part-2/
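
The grep above only shows the matching lines, not which device they belong to. A minimal sketch of a per-device report follows, assuming `lspci -vvv` prints each device header at the start of a line (beginning with the slot address) and the flag as `ARIHierarchy+` or `ARIHierarchy-` on an indented line below it. The function name `ari_report` is hypothetical.

```sh
#!/bin/sh
# ari_report: reads `lspci -vvv` output on stdin and prints one line per
# ARIHierarchy flag found, in the form "<slot> ARIHierarchy+" or
# "<slot> ARIHierarchy-".
ari_report() {
    awk '
        # A device header starts at column 0 with the slot address.
        /^[0-9a-fA-F]/ { slot = $1 }
        # Remember which device each ARIHierarchy flag belongs to.
        match($0, /ARIHierarchy[+-]/) {
            print slot, substr($0, RSTART, RLENGTH)
        }
    '
}
```

It would typically be used as `lspci -vvv | ari_report` (run as root so that the extended capabilities are readable), which makes it easy to spot ports that report `ARIHierarchy-`.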

Vladimir Eremin (yottatsa) wrote :

The following check should be implemented in the agent: https://etherpad.openstack.org/p/sriov-check

tags: added: area-library team-network
tags: removed: area-library
Changed in fuel:
assignee: Fuel Documentation Team (fuel-docs) → Alexander Adamov (aadamov)

Reviewed: https://review.openstack.org/304612
Committed: https://git.openstack.org/cgit/openstack/fuel-docs/commit/?id=e9d1be7df3a46bfa5d40229dcdf73c64864f1be6
Submitter: Jenkins
Branch: master

commit e9d1be7df3a46bfa5d40229dcdf73c64864f1be6
Author: Evgeny Konstantinov <email address hidden>
Date: Tue Apr 12 16:15:08 2016 +0300

    Add Fuel Mitaka known issues to relnotes
    Related-Bug: #1439776
    Related-Bug: #1450100
    Related-Bug: #1460169
    Related-Bug: #1490597
    Related-Bug: #1526544
    Related-Bug: #1556854
    Related-Bug: #1446704

    Change-Id: I3df16c163d82af7d0db8a64643b915909cabd8f1

Changed in fuel:
milestone: 9.0 → 10.0

Fix proposed to branch: master
Change author: Alexander Adamov <email address hidden>
Review: https://review.fuel-infra.org/21128

Reviewed: https://review.fuel-infra.org/21128
Submitter: Olga Gusarenko <email address hidden>
Branch: master

Commit: d5199565c1a069c0d586e78b0e7d75a05c83fc31
Author: Alexander Adamov <email address hidden>
Date: Tue May 24 10:56:25 2016

[RN MOS9.0] Fuel known and resolved issues

Adds Fuel known and resolved issues to the
MOS9.0 Release Notes:

- #1564536 The service EC2 (nova_ec2) does not exist
           in the catalog list
- #1545988 Fix idempotency of compute tasks
- #1547003 Cluster capacity calculation fix
- #1556854 [NFV] Deployment with enabled SR-IOV fails

Change-Id: I9e683cf18797a19d4d67043d7652f94a39960e42
Closes-Bug: #1564536
Closes-Bug: #1545988
Closes-Bug: #1547003
Closes-Bug: #1556854

Changed in fuel:
status: Confirmed → Fix Committed
tags: added: release-notes-done
removed: release-notes
Maksym Strukov (unbelll) on 2016-06-21
Changed in fuel:
status: Fix Committed → Fix Released
ivano (l-ivan) on 2016-08-03
tags: added: customer-found