n-cpu under load does not update its status

Bug #1284708 reported by Attila Fazekas
This bug affects 1 person
Affects: OpenStack Compute (nova)
Status: Opinion
Importance: Wishlist
Assigned to: Unassigned
Milestone: none listed

Bug Description

screen-n-sch.txt:
2014-02-24 22:43:41.502 WARNING nova.scheduler.filters.compute_filter [req-ff34935c-c472-47df-ac4a-1286a7944b17 demo demo] ram:6577 disk:75776 io_ops:9 instances:14 has not been heard from in a while
2014-02-24 22:43:41.502 INFO nova.filters [req-ff34935c-c472-47df-ac4a-1286a7944b17 demo demo] Filter ComputeFilter returned 0 hosts
2014-02-24 22:43:41.503 WARNING nova.scheduler.driver [req-ff34935c-c472-47df-ac4a-1286a7944b17 demo demo] [instance: b5a607f0-5280-4033-ba8f-087884d41d28] Setting instance to ERROR state.
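
The "has not been heard from in a while" warning suggests that the scheduler treats a compute service as down when its last status report is too old. A minimal sketch of that kind of check follows. It is not nova's actual code: the function names, the 60-second threshold, the host name and the 95-second delay are all assumptions made for illustration.

```python
from datetime import datetime, timedelta

# Hypothetical threshold, not read from any nova configuration.
SERVICE_DOWN_TIME = timedelta(seconds=60)


def is_service_up(last_heartbeat, now, down_time=SERVICE_DOWN_TIME):
    """Return True if the last status report is recent enough."""
    return abs(now - last_heartbeat) <= down_time


def filter_hosts(hosts, now):
    """Keep only hosts whose compute service reported recently.

    `hosts` maps a host name to the time of its last status report.
    """
    return [name for name, seen in hosts.items() if is_service_up(seen, now)]


now = datetime(2014, 2, 24, 22, 43, 41)
hosts = {
    # Hypothetical host whose status report was delayed by load.
    "busy-host": now - timedelta(seconds=95),
}
print(filter_hosts(hosts, now))  # the only host is filtered out: []
```

With a check of this shape, a compute service that is alive but slow to report is indistinguishable from a dead one, which would explain a filter returning 0 hosts on a single-node setup.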

The tempest stress runner can cause this kind of load when run with the following example configuration: https://github.com/openstack/tempest/blob/master/tempest/stress/etc/server-create-destroy-test.json

./tempest/stress/run_stress.py -t tempest/stress/etc/server-create-destroy-test.json -n 1024 -S

The example config uses only 8 threads. If you would like to increase the number of threads, you may need to increase the demo user's quota
or enable use_tenant_isolation.

tempest.log:
INFO: Statistics (per process):
INFO: Process 0 (ServerCreateDestroyTest): Run 103 actions (0 failed)
INFO: Process 1 (ServerCreateDestroyTest): Run 101 actions (0 failed)
INFO: Process 2 (ServerCreateDestroyTest): Run 101 actions (0 failed)
INFO: Process 3 (ServerCreateDestroyTest): Run 100 actions (0 failed)
INFO: Process 4 (ServerCreateDestroyTest): Run 102 actions (2 failed)
INFO: Process 5 (ServerCreateDestroyTest): Run 102 actions (1 failed)
INFO: Process 6 (ServerCreateDestroyTest): Run 101 actions (0 failed)
INFO: Process 7 (ServerCreateDestroyTest): Run 101 actions (0 failed)
INFO: Summary:
INFO - 2014-02-24 22:44:22,713.713 INFO: Run 811 actions (3 failed)
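
The per-process lines above can be cross-checked against the summary line. The following is a small sketch that sums them; the regular expression and the helper name `summarize` are illustrative and are not part of tempest.

```python
import re

LOG = """\
INFO: Process 0 (ServerCreateDestroyTest): Run 103 actions (0 failed)
INFO: Process 1 (ServerCreateDestroyTest): Run 101 actions (0 failed)
INFO: Process 2 (ServerCreateDestroyTest): Run 101 actions (0 failed)
INFO: Process 3 (ServerCreateDestroyTest): Run 100 actions (0 failed)
INFO: Process 4 (ServerCreateDestroyTest): Run 102 actions (2 failed)
INFO: Process 5 (ServerCreateDestroyTest): Run 102 actions (1 failed)
INFO: Process 6 (ServerCreateDestroyTest): Run 101 actions (0 failed)
INFO: Process 7 (ServerCreateDestroyTest): Run 101 actions (0 failed)
"""

# Matches only the per-process lines, not the final summary line.
PATTERN = re.compile(r"Process \d+ \(\w+\): Run (\d+) actions \((\d+) failed\)")


def summarize(log_text):
    """Return (total actions, failed actions) over all per-process lines."""
    total = failed = 0
    for match in PATTERN.finditer(log_text):
        total += int(match.group(1))
        failed += int(match.group(2))
    return total, failed


total, failed = summarize(LOG)
print(total, failed)            # 811 3
print(f"{failed / total:.2%}")  # 0.37%
```

The totals agree with the summary line: 811 actions with 3 failures, a failure rate of roughly 0.37%.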

Tags: compute quotas
Tracy Jones (tjones-i)
tags: added: compute
Joe Gordon (jogo) wrote :

What kind of load is n-cpu under? Do you have any stats? Is this in the gate? It's reasonable to expect that if a system is very overloaded, things won't work.

Changed in nova:
status: New → Incomplete
Joe Gordon (jogo) wrote :

Marked as incomplete because it is unclear how loaded the system is. If the load is 'reasonable' then this could be a valid bug; otherwise I think this is just expected behavior.

Attila Fazekas (afazekas) wrote :

A single 4-core VM with 8 GiB of memory was used to run the create, wait-for-active, destroy server cycle with 8 users (worker threads).

One top output captured while the tests were running:
http://www.fpaste.org/81175/57776013/
The load is under 10.

Usually the number of qemu processes that actually exist is below 2, in some cases 0, so the VMs do not really consume resources.
Much of the CPU time is spent in short-lived commands and in the nova services.

With just 8 users (workers) this kind of issue should not happen. The system frequently survives the load, but not always.
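
To put the load figure in context, a load average can be divided by the number of cores. A minimal sketch using the numbers reported above (the helper name is hypothetical):

```python
def load_per_core(load_average, cores):
    """Normalize a load average by the number of CPU cores."""
    return load_average / cores


# Upper bound from the report: load under 10 on a 4-core VM.
print(load_per_core(10, 4))  # 2.5
```

So "under 10" on this system means fewer than 2.5 tasks competing for each core on average.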

Changed in nova:
status: Incomplete → New
melanie witt (melwitt) wrote :

Based on reporter input.

Changed in nova:
importance: Undecided → Medium
status: New → Confirmed
Joe Gordon (jogo) wrote :

With a load of 10 this type of thing is expected. That being said, if we can fix this it would be great, so moving to wishlist.

Changed in nova:
importance: Medium → Wishlist
Joe Gordon (jogo)
tags: added: quotas
Markus Zoeller (markus_z) (mzoeller) wrote :

This wishlist bug has been open a year without any activity. I'm going to move it to "Opinion / Wishlist", which is an easily-obtainable queue of older requests that have come in.

Changed in nova:
status: Confirmed → Opinion