DPDK: After rebooting a compute node, DPDK VMs go into ERROR state

Bug #1550122 reported by Vinod Nair
This bug affects 1 person
Affects: Juniper Openstack (status tracked in Trunk)
  R3.0:  Status New, Importance Medium, Assigned to Michal Dubiel
  Trunk: Status New, Importance Medium, Assigned to Michal Dubiel

Bug Description

After rebooting a DPDK compute node, the VMs on that node go into ERROR state; they should instead be in SHUTOFF state.
If a VM is in ERROR state, the admin first has to reset its state to active, shut the VM down, and then start it.

If the VM is in SHUTOFF state, the admin just needs to start it.
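
For reference, the recovery steps described above correspond to the standard nova CLI commands below; <vm-id> stands in for the affected instance ID (for example f40b8ff2-0a53-491a-a33c-77a8642e53bf from the listing that follows):

# VM stuck in ERROR: reset the state to active, stop it, then start it
nova reset-state --active <vm-id>
nova stop <vm-id>
nova start <vm-id>

# VM already in SHUTOFF (the expected state): a single start is enough
nova start <vm-id>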

root@cs-scale-1:/store# nova list --all-tenants
+--------------------------------------+-------------+----------------------------------+---------+------------+-------------+----------------------+
| ID | Name | Tenant ID | Status | Task State | Power State | Networks |
+--------------------------------------+-------------+----------------------------------+---------+------------+-------------+----------------------+
| f40b8ff2-0a53-491a-a33c-77a8642e53bf | DPDK-N1-VM1 | 1524c529fe7241a0ac4764c82739e509 | ERROR | - | Shutdown | CS1-NET1=141.1.0.4 |
| fb424405-dcdd-4f71-80fa-83ee2f5b76e2 | DPDK-N2-VM1 | 1524c529fe7241a0ac4764c82739e509 | ERROR | - | Shutdown | CS1-NET2=141.2.0.3 |
| 251a4fae-81e1-4931-a951-bd14e5fe8993 | SN-VM1 | 1524c529fe7241a0ac4764c82739e509 | ERROR | - | Shutdown | CS1-NET1=141.1.0.3 |
| ec81eaa1-d13e-4526-8db7-889609f00d6b | SR-N1-VM1 | 1524c529fe7241a0ac4764c82739e509 | SHUTOFF | - | Shutdown | SRIOV-NEt1=200.4.1.3 |
| b4eba1c7-19e7-453d-8dc1-fac369666b76 | SR-N1-VM2 | 1524c529fe7241a0ac4764c82739e509 | SHUTOFF | - | Shutdown | SRIOV-NEt1=200.4.1.4 |
+--------------------------------------+-------------+----------------------------------+---------+------------+-------------+----------------------+

Build 2717 Kilo

Tags: dpdk
Revision history for this message
Michal Dubiel (md-sh) wrote :

I cannot reproduce this; the VMs always change their status to SHUTOFF after the node restart. Could you provide the testbed.py that was used for this? Perhaps it happens only in some particular configuration.

Revision history for this message
Vinod Nair (vinodnair) wrote :

It happens all the time. I am using Kilo builds; the latest 3.0.2722 Kilo build also has the same issue.

tags: added: releasenote
Revision history for this message
Vinod Nair (vinodnair) wrote :

testbed.py
=============
from fabric.api import env

#Management ip addresses of hosts in the cluster
host1 = 'root@10.87.140.215'
host2 = 'root@10.87.129.228'
host3 = 'root@10.87.129.234'

#External routers if any
ext_routers = [('h1droid', '8.0.0.254')]

#Autonomous system number
router_asn = 64512

#Host from which the fab commands are triggered to install and provision
host_build = 'root@10.87.140.215'

#Role definition of the hosts.
env.roledefs = {
    'all': [host1, host2, host3],
    'cfgm': [host1],
    'openstack': [host1],
    'control': [host1],
    'compute': [host2,host3],
    'collector': [host1],
    'webui': [host1],
    'database': [host1],
    'build': [host_build],
    'storage-master': [host1],
    'storage-compute': [host2]
}

env.hostnames = {
    'all': ['cmbu-gravity-06', 'cmbu-starwars-01', 'cmbu-starwars-02']
}

#Openstack admin password
env.openstack_admin_password = 'n1keenA'

env.password = 'n1keenA'
#Passwords of each host
env.passwords = {
    host1: 'n1keenA',
    host2: 'n1keenA',
    host3: 'n1keenA',
    host_build: 'n1keenA',
}

#For reimage purpose
env.ostypes = {
    host1: 'ubuntu',
    host2: 'ubuntu',
    host3: 'ubuntu',
}

#==============================
#OPTIONAL BONDING CONFIGURATION
#==============================
#Interface Bonding
bond= {
    host1 : { 'name': 'bond0', 'member': ['p2p1','p2p2'], 'mode': '802.3ad', 'xmit_hash_policy': 'layer3+4' },
    host2 : { 'name': 'bond0', 'member': ['p3p1','p3p2'], 'mode': '802.3ad', 'xmit_hash_policy': 'layer3+4' },
    host3 : { 'name': 'bond0', 'member': ['p3p1','p3p2'], 'mode': 'balance-xor', 'xmit_hash_policy': 'layer3+4' },

}

#OPTIONAL SEPARATION OF MANAGEMENT AND CONTROL + DATA
#====================================================
#Control Interface
control_data = {
    host1 : { 'ip': '8.0.0.1/24', 'gw' : '8.0.0.254', 'device':'bond0', },
    host2 : { 'ip': '8.0.0.2/24', 'gw' : '8.0.0.254', 'device':'bond0', },
    host3 : { 'ip': '8.0.0.3/24', 'gw' : '8.0.0.254', 'device':'bond0', },

}

env.dpdk = {
    host2: { 'huge_pages' : '75', 'coremask' : '0-7'},
    host3: { 'huge_pages' : '75', 'coremask' : '0xff00'},
}

#To disable installing contrail interface rename package
env.interface_rename = False

#To use existing service_token
#service_token = 'your_token'

#Specify keystone IP
#keystone_ip = '1.1.1.1'

#Specify Keystone admin user if not same as admin
#keystone_admin_user = 'nonadmin'

#Specify Keystone admin password if not same as env.openstack_admin_password
#keystone_admin_password = 'contrail123'

#Specify Region Name
#region_name = 'RegionName'

#To enable multi-tenancy feature
#multi_tenancy = True

#To enable haproxy feature
#haproxy = True
#LBAAS
env.enable_lbaas=True

# CEILOMETER
enable_ceilometer = True

#To enable parallel execution of tasks on multiple nodes
do_parallel = True

# To configure the encapsulation priority. Default: MPLSoGRE
#env.encap_priority = "'MPLSoUDP','MPLSoGRE','VXLAN'"

no longer affects: juniperopenstack
information type: Proprietary → Private
information type: Private → Proprietary
information type: Proprietary → Public
tags: removed: releasenote