n-cpu.service consuming 100% of CPU indeterminately

Bug #1800204 reported by Wallace Cardoso
This bug affects 1 person
Affects: OpenStack Compute (nova)
Status: Won't Fix
Importance: Undecided
Assigned to: Unassigned

Bug Description

Description
==============
I used fault injection to assess the robustness of nova-conductor. By injecting a specific sequence of faults, I observed a failure that threatens the robustness of the system: faults applied at the nova-conductor interface prevent nova-compute from provisioning new instances.

Steps to reproduce
=====================
I reproduced this bug in 10 out of 10 attempts. I used devstack/queens.

The workload consists of the following steps:
1) Create a VM with the flavor 64MB RAM, 1 VCPU, 0 DISK and the reference image (for instance, 'cirros.0.3.4'); all other settings can be the defaults of the admin account;
2) Rebuild with an alternative image (for instance, 'cirros 0.4.0');
3) Rebuild with the reference image again;
4) Shelve the instance;
5) Delete the instance.
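The workload steps above can be sketched as a list of `openstack` CLI invocations. This is only an illustration, not the script used for the report: the function name, the flavor and server names, and the use of `--wait` are assumptions, and a devstack setup with more than one network may also need a `--network` argument on `server create`.

```python
import subprocess

def workload_commands(server, flavor, ref_image, alt_image):
    """Build the CLI calls for the five workload steps (names are hypothetical)."""
    return [
        # Flavor from step 1: 64MB RAM, 1 VCPU, 0 DISK.
        ["openstack", "flavor", "create", "--ram", "64", "--vcpus", "1",
         "--disk", "0", flavor],
        # 1) create the VM from the reference image
        ["openstack", "server", "create", "--flavor", flavor,
         "--image", ref_image, "--wait", server],
        # 2) rebuild with the alternative image
        ["openstack", "server", "rebuild", "--image", alt_image, "--wait", server],
        # 3) rebuild with the reference image again
        ["openstack", "server", "rebuild", "--image", ref_image, "--wait", server],
        # 4) shelve the instance
        ["openstack", "server", "shelve", server],
        # 5) delete the instance
        ["openstack", "server", "delete", server],
    ]

def execute_workload(commands, runner=subprocess.run):
    """Run each command in order; `runner` can be swapped out for a dry run."""
    for argv in commands:
        runner(argv, check=True)
```

Passing `runner=lambda argv, check: print(" ".join(argv))` prints the commands instead of executing them, which is a quick way to review the sequence before running it against a real deployment.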

Below, I describe the faultload. Each time a fault is injected, the workload is executed from the beginning. The steps are:
1) Intercept the first RPC message (i.e. AMQP) that calls 'schedule_and_build_instances';
2) Inject the fault into the field 'schedule_and_build_instances.args.build_requests->'nova_object.data'.instance.'nova_object.data'.flavor.'nova_object.data'.vcpus', i.e. overwrite the flavor's vcpus value with the fault.
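As an illustration of step 2, the sketch below overwrites the vcpus field in an already-decoded RPC payload. It assumes the payload is a plain dict with `method` and `args` keys and that `build_requests` is a list of serialized objects; the report does not show the message layout or the interception mechanism, so `inject_fault` and the sample payload are hypothetical.

```python
import copy

# Key path from the report, relative to one element of args.build_requests.
VCPUS_PATH = ("nova_object.data", "instance", "nova_object.data",
              "flavor", "nova_object.data", "vcpus")

def inject_fault(message, fault):
    """Return a copy of the payload with flavor.vcpus replaced by `fault`."""
    if message.get("method") != "schedule_and_build_instances":
        return message  # not the call targeted by the faultload
    mutated = copy.deepcopy(message)
    for build_request in mutated["args"]["build_requests"]:
        node = build_request
        for key in VCPUS_PATH[:-1]:
            node = node[key]          # walk down to the flavor's data dict
        node[VCPUS_PATH[-1]] = fault  # overwrite vcpus
    return mutated

# Hypothetical, heavily trimmed payload used only to exercise the function.
sample = {
    "method": "schedule_and_build_instances",
    "args": {"build_requests": [
        {"nova_object.data": {"instance": {"nova_object.data": {
            "flavor": {"nova_object.data": {"vcpus": 1}}}}}},
    ]},
}
faulty = inject_fault(sample, "10000000000000000000000")
```

The original payload is left untouched, so the same intercepted message can be reused for each fault value.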

The pseudo-algorithm:
1. execute the workload (without faults)
2. for each fault in ['2', '-10000000000000000000001', '10000000000000000000000']
2.1. execute the workload in parallel with faultload(fault)
3. observe the CPU activity of the devstack n-cpu.service process
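The pseudo-algorithm could be driven by something like the following sketch. The three callables are placeholders for whatever runs the workload, arms the fault injector, and reads the CPU usage; none of them come from the report, and running the injector in a thread is just one way to get the "in parallel" behaviour.

```python
import threading

FAULTS = ["2", "-10000000000000000000001", "10000000000000000000000"]

def run_campaign(execute_workload, faultload, observe_cpu, faults=FAULTS):
    """Run the fault-free workload, then one faulty run per fault value."""
    execute_workload()  # step 1: fault-free run
    for fault in faults:  # step 2
        # step 2.1: the injector runs alongside the workload
        injector = threading.Thread(target=faultload, args=(fault,))
        injector.start()
        execute_workload()
        injector.join()
    return observe_cpu()  # step 3: check n-cpu.service afterwards
```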

Expected result
==================
nova-compute handles the faults without affecting future requests.

Actual result
================
nova-compute consumes 100% of CPU and new instances are set to the 'error' state without any clue about the cause, so it is not possible to create new instances without restarting n-cpu.service.
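One way to confirm the CPU symptom on Linux is to sample the process's accumulated CPU time from `/proc/<pid>/stat`. This is a minimal sketch, not the measurement method used for the report; the helper names are made up, and the PID of the nova-compute process has to be looked up separately.

```python
import os
import time

def cpu_ticks(stat_text):
    """Return utime + stime (in clock ticks) from the text of /proc/<pid>/stat."""
    # The command name (field 2) may contain spaces, so split after the last ')'.
    fields = stat_text.rsplit(")", 1)[1].split()
    # fields[0] is field 3 (state); utime and stime are fields 14 and 15.
    return int(fields[11]) + int(fields[12])

def cpu_percent(pid, interval=1.0):
    """Sample one process's CPU usage over `interval` seconds (Linux only)."""
    def read():
        with open(f"/proc/{pid}/stat") as f:
            return cpu_ticks(f.read())
    ticks_per_second = os.sysconf("SC_CLK_TCK")
    before = read()
    time.sleep(interval)
    after = read()
    # 100.0 corresponds to one fully busy core.
    return 100.0 * (after - before) / (ticks_per_second * interval)
```

A value that stays close to 100 across repeated samples while no requests are in flight would match the behaviour described above.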

Environment
==============
Devstack/Queens on a single machine with default settings.

Logs & Configs
=================
Logs attached.

description: updated
tags: added: compute
Matt Riedemann (mriedem)
tags: added: fault-injection
Artom Lifshitz (notartom) wrote :

I'm going to say the same thing as bug 1801733 - this is super nifty and interesting, but realistically is not a concern and will most likely never get addressed.

Changed in nova:
status: New → Won't Fix