performance problems starting up nova process due to regex code

Bug #1790195 reported by Chris Friesen
Affects: OpenStack Compute (nova)
Status: Fix Released
Importance: Undecided
Assigned to: Alex Xu
Milestone: (none listed)

Bug Description

We noticed that nova process startup seems to take a long time. It looks like one major culprit is the regex code at https://github.com/openstack/nova/blob/master/nova/api/validation/parameter_types.py

Sean K Mooney highlighted one possible culprit:

<sean-k-mooney> i dont really like this https://github.com/openstack/nova/blob/master/nova/api/validation/parameter_types.py#L128-L142
<sean-k-mooney> def _get_all_chars():
<sean-k-mooney>     for i in range(0xFFFF):
<sean-k-mooney>         yield six.unichr(i)
<sean-k-mooney> so that is going to loop 65535 times
<sean-k-mooney> and we call the function 17 times
<sean-k-mooney> so that's 1.1 million calls to re.escape every time we load that module
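
The estimate in the log can be checked with a short sketch. It assumes Python 3, so `chr` stands in for `six.unichr`, and it takes the figure of 17 calls from the log rather than from the nova source.

```python
def _get_all_chars():
    # Mirrors the generator quoted in the log; chr replaces six.unichr
    # because this sketch assumes Python 3.
    for i in range(0xFFFF):
        yield chr(i)


# Characters produced by one full pass over the generator.
chars_per_pass = sum(1 for _ in _get_all_chars())

# 17 is the number of calls reported in the log (not verified here).
calls_on_import = chars_per_pass * 17

print(chars_per_pass)
print(calls_on_import)
```

Since `range(0xFFFF)` stops one short of 0xFFFF, each pass visits 65535 code points, and 17 passes come to roughly 1.1 million per-character calls.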

Changed in nova:
assignee: nobody → sean mooney (sean-k-mooney)
OpenStack Infra (hudson-openstack) wrote : Fix proposed to nova (master)

Fix proposed to branch: master
Review: https://review.openstack.org/599071

Changed in nova:
status: New → In Progress
Changed in nova:
assignee: sean mooney (sean-k-mooney) → Alex Xu (xuhj)
OpenStack Infra (hudson-openstack) wrote : Fix merged to nova (master)

Reviewed: https://review.openstack.org/599071
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=90b206894a2c442a5c475a3d90fff8f89a9b3ce0
Submitter: Zuul
Branch: master

commit 90b206894a2c442a5c475a3d90fff8f89a9b3ce0
Author: Sean Mooney <email address hidden>
Date: Fri Aug 31 21:35:14 2018 +0100

    add caching to _build_regex_range

    - _build_regex_range is called 17 times on
      import of nova.api.validation.parameter_types.
      _build_regex_range internally calls re.escape
      and valid_char on every char returned
      from _get_all_chars.
      _get_all_chars yields all chars up to 0xffff.
      As a result re.escape and valid_char are called
      1.1 million times when
      nova.api.validation.parameter_types is imported.

    - This change adds a memorize decorator and uses
      it to cache _build_regex_range

    - This change does not cache valid_char,
      _is_printable or re.escape as hashing and
      caching them for each invocation would
      be far more costly both in time and memory
      than computing the result.

    Change-Id: Ic1f2c560a6da815b26fdf770450bbe439d18d4f9
    Closes-Bug: #1790195

Changed in nova:
status: In Progress → Fix Released
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/nova 19.0.0.0rc1

This issue was fixed in the openstack/nova 19.0.0.0rc1 release candidate.
