Reduce overhead for redundant PartitionState events
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
nova-powervm | Fix Released | Undecided | Unassigned |
Bug Description
Changes were made to pypowervm in https:/
One particular example of this is the handling of PartitionState events in nova_powervm/
The PowerVMLifecycl
I would recommend the following changes:
1) Eliminate the temporary instance caching in nova_powervm/
2) Retrieve the instance from nova only when we are about to issue a notification. For NVRAM events, this would be just before calling the nvram_mgr. For PartitionState events, the instance shouldn't be retrieved until just before we call _driver.emit_event from the PowerVMLifecycl
3) Either treat all PartitionState events as "delayed" events (allowing several seconds for new events to replace old ones before actually emitting an event), or maintain a cache of the last observed PartitionState for each LPAR and only retrieve the instance and emit the event if the new state is different. Since we will typically see a sequence of PartitionState changes for only a small percentage of LPARs, this cache could be fixed-size or use a time-based eviction policy, which limits memory consumption and means we don't have to monitor deletes.
I'm sure there are other implementation alternatives. The key is to reduce the number of calls to vm.get_instance so that the controller doesn't get bogged down with large numbers of events.
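As a rough illustration of the second option in 3), here is a minimal sketch of a last-observed-state cache with time-based eviction. It is not the nova-powervm code: LastStateCache, handle_partition_state, the ttl_seconds default, and the get_instance/emit_event callables are hypothetical stand-ins for vm.get_instance and _driver.emit_event, and their real signatures may differ.

```python
import time
from collections import OrderedDict


class LastStateCache:
    """Remember the last PartitionState seen per LPAR, with time-based eviction.

    Hypothetical helper for illustration; not part of nova-powervm.
    """

    def __init__(self, ttl_seconds=300.0, clock=time.monotonic):
        self._ttl = ttl_seconds  # arbitrary default, an assumption
        self._clock = clock
        # Maps LPAR UUID -> (state, timestamp), oldest entry first.
        self._entries = OrderedDict()

    def _evict_expired(self, now):
        # Entries are ordered by timestamp, so stop at the first fresh one.
        while self._entries:
            uuid, (_state, stamp) = next(iter(self._entries.items()))
            if now - stamp < self._ttl:
                break
            del self._entries[uuid]

    def changed(self, lpar_uuid, state):
        """Record the state; return True if it differs from the cached one."""
        now = self._clock()
        self._evict_expired(now)
        previous = self._entries.pop(lpar_uuid, None)
        # Re-inserting moves the entry to the end with a fresh timestamp.
        self._entries[lpar_uuid] = (state, now)
        return previous is None or previous[0] != state


def handle_partition_state(cache, lpar_uuid, state, get_instance, emit_event):
    """Look up the instance and emit an event only when the state changed."""
    if not cache.changed(lpar_uuid, state):
        return False  # redundant event: no expensive instance lookup
    instance = get_instance(lpar_uuid)
    if instance is not None:
        emit_event(instance, state)
    return True
```

Because expired entries are dropped lazily on the next event, no separate cleanup thread or delete monitoring is needed; the cost is that a repeated state arriving after the TTL is treated as new.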
Reviewed: https://review.openstack.org/469982
Committed: https://git.openstack.org/cgit/openstack/nova-powervm/commit/?id=db759ce5158446a918e76c049d0efc753e0bbc72
Submitter: Jenkins
Branch: master
commit db759ce5158446a918e76c049d0efc753e0bbc72
Author: Eric Fried <email address hidden>
Date: Thu Jun 1 14:10:48 2017 -0500
Performance improvements for Lifecycle events
Implement various performance improvements in the event handler.
- Since get_instance is expensive, delay it as long as possible (see #2
  in the bug report). Only retrieve the instance right before we're
  going to use it.
- Delay all PartitionState events (see #3 in the bug report).
- Skip PartitionState-driven events entirely if nova is in the middle of
  an operation, since nova is already aware of the appropriate state
  changes.
- Only retrieve the admin context once, and cache it.
We keep the instance cache (see #1 in the bug report) since scale
testing showed it was indeed being used a nontrivial amount of the time.
Change-Id: I1f1634215b4c269842584c59f2c14c119c282b7e
Closes-Bug: #1694784
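The approach described in the commit message can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not the committed code: DelayedEventQueue, its 5-second default delay, the dict-shaped instance, and the get_instance/emit_event callables are hypothetical, and a non-None task_state is assumed to mean that nova is in the middle of an operation.

```python
import time


class DelayedEventQueue:
    """Coalesce PartitionState events per LPAR and defer the instance lookup.

    Hypothetical sketch; names and the delay value are assumptions.
    """

    def __init__(self, get_instance, emit_event, delay_seconds=5.0,
                 clock=time.monotonic):
        self._get_instance = get_instance
        self._emit_event = emit_event
        self._delay = delay_seconds
        self._clock = clock
        self._pending = {}  # LPAR UUID -> (state, due_time)

    def submit(self, lpar_uuid, state):
        # A newer event replaces any pending one and restarts the delay.
        self._pending[lpar_uuid] = (state, self._clock() + self._delay)

    def process_due(self):
        """Emit events whose delay has elapsed; return how many were emitted."""
        now = self._clock()
        due = [u for u, (_s, when) in self._pending.items() if when <= now]
        emitted = 0
        for uuid in due:
            state, _when = self._pending.pop(uuid)
            # The expensive lookup happens only here, right before use.
            instance = self._get_instance(uuid)
            if instance is None:
                continue
            if instance.get("task_state") is not None:
                continue  # nova is mid-operation and already knows the state
            self._emit_event(instance, state)
            emitted += 1
        return emitted
```

In this sketch a caller would invoke process_due periodically (for example from a timer); a burst of events for one LPAR then costs at most one instance lookup.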