Deployment fails when SR-IOV is configured on only some of the nodes

Bug #1561018 reported by Mikhail Chernik
This bug affects 1 person
Affects: Fuel for OpenStack
Status: Fix Released
Importance: Critical
Assigned to: Sergey Kolekonov
Milestone: (none shown)

Bug Description

Currently, if the SR-IOV feature is turned on for at least one node, the supported_pci_vendor_devs list is populated in quantum_settings of astute.yaml. This triggers SR-IOV configuration on all nodes, including those without SR-IOV enabled. As a result, deployment fails with the message "Deployment has failed. Critical nodes failed: Node[1]. Stopping the deployment process!"

Environment: ISO 97, hardware lab

Steps to reproduce:
* Create new cluster
* Add at least 1 controller and 2 computes to the cluster
* Enable SR-IOV on one compute
* Run deployment

Expected results:
* Cluster is deployed; SR-IOV is enabled on one compute and disabled on the other

Actual result:
* Deployment failed with the following error in the Puppet log:

2016-03-23 13:42:53 ERR /usr/lib/ruby/vendor_ruby/puppet/parser/functions.rb:164:in `block (2 levels) in newfunction'
2016-03-23 13:42:53 ERR /etc/puppet/modules/osnailyfacter/lib/puppet/parser/functions/nic_whitelist_to_mappings.rb:13:in `block in <top (required)>'
2016-03-23 13:42:53 ERR undefined method `map' for "":String at /etc/puppet/modules/osnailyfacter/modular/openstack-network/agents/sriov.pp:15 on node node-2.domain.tld
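
For illustration only: the last log line is the error Ruby raises when `map` is called on an empty String instead of an Array. The sketch below reproduces that failure mode with a hypothetical helper and shows a type guard that avoids it. The helper names and the `physical_network` / `devname` keys are assumptions made for this example; this is not the actual source of nic_whitelist_to_mappings.rb.

```ruby
# Hypothetical stand-in for a helper that turns a NIC whitelist into
# "physnet:device" mappings. Names and hash keys are illustrative assumptions.
def whitelist_to_mappings_unguarded(whitelist)
  # Fails with NoMethodError when whitelist is "" (a String has no #map).
  whitelist.map { |entry| "#{entry['physical_network']}:#{entry['devname']}" }
end

# Same helper with a guard: anything that is not an Array yields no mappings.
def whitelist_to_mappings_guarded(whitelist)
  return [] unless whitelist.is_a?(Array)
  whitelist.map { |entry| "#{entry['physical_network']}:#{entry['devname']}" }
end

# Returns the error message produced by the unguarded helper for "".
def error_for_empty_string
  whitelist_to_mappings_unguarded('')
  nil
rescue NoMethodError => e
  e.message
end
```

On a node without SR-IOV, the guarded variant returns an empty list instead of aborting the catalog run.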

Dmitry Klenov (dklenov)
tags: added: area-library
Changed in fuel:
importance: Undecided → High
status: New → Confirmed
Mikhail Chernik (mchernik) wrote :

The environment has been passed to a developer for investigation.

OpenStack Infra (hudson-openstack) wrote : Fix proposed to fuel-library (master)

Fix proposed to branch: master
Review: https://review.openstack.org/296541

Changed in fuel:
status: Confirmed → In Progress
Changed in fuel:
assignee: Sergey Kolekonov (skolekonov) → Vladimir Eremin (yottatsa)
Atsuko Ito (yottatsa)
Changed in fuel:
assignee: Vladimir Eremin (yottatsa) → Sergey Kolekonov (skolekonov)
Changed in fuel:
assignee: Sergey Kolekonov (skolekonov) → Vladimir Eremin (yottatsa)
Atsuko Ito (yottatsa)
Changed in fuel:
assignee: Vladimir Eremin (yottatsa) → Sergey Kolekonov (skolekonov)
Dmitry Klenov (dklenov) wrote :

SR-IOV breaks the whole cluster. Raising to Critical.

Changed in fuel:
importance: High → Critical
OpenStack Infra (hudson-openstack) wrote : Fix merged to fuel-library (master)

Reviewed: https://review.openstack.org/296541
Committed: https://git.openstack.org/cgit/openstack/fuel-library/commit/?id=ed36c837e042d715827c607c831528caeef2c97b
Submitter: Jenkins
Branch: master

commit ed36c837e042d715827c607c831528caeef2c97b
Author: Sergey Kolekonov <email address hidden>
Date: Wed Mar 23 18:25:35 2016 +0300

    Fix the way SRIOV is checked on compute nodes

    supported_pci_vendor_devs can't be used as SRIOV indicator as it's a cluster
    wide variable and it's possible that some compute nodes uses SRIOV and others
    don't. nic_whitelist_to_mappings function is a better option as it relies on
    a network scheme of a current node

    Change-Id: Idd4f9b5e1bf142713b4d849e4406778ad411b3ac
    Closes-bug: #1561018
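
A minimal sketch of the distinction the commit message draws, assuming simplified data shapes: a cluster-wide setting becomes non-empty on every node as soon as one node enables SR-IOV, while a per-node check depends only on what was derived from that node's own network scheme. The method names and input structures here are hypothetical and are not taken from fuel-library.

```ruby
# Cluster-wide indicator: true on every node once any node enables SR-IOV,
# because the same settings hash is delivered to all nodes.
def sriov_by_cluster_setting?(cluster_settings)
  !Array(cluster_settings['supported_pci_vendor_devs']).empty?
end

# Per-node indicator: true only when this node's own mappings are a
# non-empty Array (an empty String or empty Array means "no SR-IOV here").
def sriov_by_node_mappings?(node_mappings)
  node_mappings.is_a?(Array) && !node_mappings.empty?
end

# Example inputs (illustrative values only).
cluster_settings = { 'supported_pci_vendor_devs' => ['8086:10ed'] }
sriov_node_mappings = ['physnet2:eth1']
plain_node_mappings = ''

puts sriov_by_cluster_setting?(cluster_settings)   # same answer on every node
puts sriov_by_node_mappings?(sriov_node_mappings)  # node with SR-IOV
puts sriov_by_node_mappings?(plain_node_mappings)  # node without SR-IOV
```

With the per-node check, a compute node without SR-IOV skips the SR-IOV configuration even though the cluster-wide list is populated.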

Changed in fuel:
status: In Progress → Fix Committed
Kristina Berezovskaia (kkuznetsova) wrote :

Verified on ISO #155 (04.04.16) during the first acceptance testing.

I created a cluster and added 1 controller, 1 compute node with SR-IOV, and 2 compute nodes without SR-IOV. Deployment finished successfully.

Changed in fuel:
status: Fix Committed → Fix Released