Missing nova context during spawn

Bug #1404268 reported by Christine Wang
8
This bug affects 1 person
Affects Status Importance Assigned to Milestone
OpenStack Compute (nova)
Fix Released
High
Andrew Laski
Kilo
Fix Released
Undecided
Unassigned

Bug Description

The nova request context tracks a security context and other request information, including a request id that is added to log entries associated with this request. The request context is passed around explicitly in many chunks of OpenStack code. But nova/context.py also stores the RequestContext in the thread's local store (when the RequestContext is created, or when it is explicitly stored through a call to update_store). The nova logger will use an explicitly passed context, or look for it in the local.store.

A recent change in community openstack code has resulted in the context not being set for many nova log messages during spawn:

https://bugs.launchpad.net/neutron/+bug/1372049

This change spawns a new thread in nova/compute/manager.py build_and_run_instance, and the spawn runs in that new thread. When the original RPC thread created the nova RequestContext, the context was set in the thread's local store. But the context does not get set in the newly-spawned thread.

Example of log messages with missing req id during spawn:

014-12-13 22:20:30.987 18219 DEBUG nova.openstack.common.lockutils [-] Acquired semaphore "87c7fc32-042e-40b7-af46-44bff50fa1b4" lock /usr/lib/python2.6/site-packages/nova/openstack/common/lockutils.py:229
2014-12-13 22:20:30.987 18219 DEBUG nova.openstack.common.lockutils [-] Got semaphore / lock "_locked_do_build_and_run_instance" inner /usr/lib/python2.6/site-packages/nova/openstack/common/lockutils.py:271
2014-12-13 22:20:31.012 18219 AUDIT nova.compute.manager [req-bd959d69-86de-4eea-ae1d-a066843ca317 None] [instance: 87c7fc32-042e-40b7-af46-44bff50fa1b4] Starting instance...
...
2014-12-13 22:20:31.280 18219 DEBUG nova.openstack.common.lockutils [-] Created new semaphore "compute_resources" internal_lock /usr/lib/python2.6/site-packages/nova/openstack/common/lockutils.py:206
2014-12-13 22:20:31.281 18219 DEBUG nova.openstack.common.lockutils [-] Acquired semaphore "compute_resources" lock /usr/lib/python2.6/site-packages/nova/openstack/common/lockutils.py:229
2014-12-13 22:20:31.282 18219 DEBUG nova.openstack.common.lockutils [-] Got semaphore / lock "instance_claim" inner /usr/lib/python2.6/site-packages/nova/openstack/common/lockutils.py:271
2014-12-13 22:20:31.284 18219 DEBUG nova.compute.resource_tracker [-] Memory overhead for 512 MB instance; 0 MB instance_claim /usr/lib/python2.6/site-packages/nova/compute/resource_tracker.py:1272014-12-13 22:20:31.290 18219 AUDIT nova.compute.claims [-] [instance: 87c7fc32-042e-40b7-af46-44bff50fa1b4] Attempting claim: memory 512 MB, disk 10 GB2014-12-13 22:20:31.292 18219 AUDIT nova.compute.claims [-] [instance: 87c7fc32-042e-40b7-af46-44bff50fa1b4] Total memory: 131072 MB, used: 12288.00 MB2014-12-13 22:20:31.296 18219 AUDIT nova.compute.claims [-] [instance: 87c7fc32-042e-40b7-af46-44bff50fa1b4] memory limit not specified, defaulting to unlimited2014-12-13 22:20:31.300 18219 AUDIT nova.compute.claims [-] [instance: 87c7fc32-042e-40b7-af46-44bff50fa1b4] Total disk: 2097152 GB, used: 60.00 GB2014-12-13 22:20:31.304 18219 AUDIT nova.compute.claims [-] [instance: 87c7fc32-042e-40b7-af46-44bff50fa1b4] disk limit not specified, defaulting to unlimited
...

2014-12-13 22:20:32.850 18219 DEBUG nova.network.neutronv2.api [-] [instance: 87c7fc32-042e-40b7-af46-44bff50fa1b4] get_instance_nw_info() _get_instance_nw_info /usr/lib/python2.6/site-packages/nova/network/neutronv2/api.py:611

Proposed patch:

one new line of code at the beginning of nova/compute/manager.py _do_build_and_run_instance:

    context.update_store()

Changed in nova:
assignee: nobody → Numan Siddique (numansiddique)
Joe Gordon (jogo)
Changed in nova:
status: New → Confirmed
status: Confirmed → Triaged
importance: Undecided → High
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to nova (master)

Fix proposed to branch: master
Review: https://review.openstack.org/143647

Changed in nova:
status: Triaged → In Progress
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Change abandoned on nova (master)

Change abandoned by jichenjc (<email address hidden>) on branch: master
Review: https://review.openstack.org/144531
Reason: https://review.openstack.org/#/c/160348/9/nova/context.py

removed that

Changed in nova:
assignee: Numan Siddique (numansiddique) → Qin Zhao (zhaoqin)
Revision history for this message
OpenStack Infra (hudson-openstack) wrote :

Change abandoned by Numan Siddique (<email address hidden>) on branch: master
Review: https://review.openstack.org/143647
Reason: This fix is no longer required because of patch [1]

[1] https://review.openstack.org/#/c/160348/9

Changed in nova:
status: In Progress → Fix Committed
Thierry Carrez (ttx)
Changed in nova:
milestone: none → kilo-3
status: Fix Committed → Fix Released
Revision history for this message
Andrew Laski (alaski) wrote :

The message on the abandoned patch is incorrect. That patch attempted to fix two issues and one of them was handled by https://review.openstack.org/#/c/160348/9. But the issue reported here with context being lost during build_and_run_instance is still valid.

Changed in nova:
assignee: Qin Zhao (zhaoqin) → Andrew Laski (alaski)
status: Fix Released → In Progress
milestone: kilo-3 → none
Andrew Laski (alaski)
tags: added: kilo-rc-potential
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to nova (master)

Fix proposed to branch: master
Review: https://review.openstack.org/171506

Changed in nova:
assignee: Andrew Laski (alaski) → Qin Zhao (zhaoqin)
Changed in nova:
assignee: Qin Zhao (zhaoqin) → Andrew Laski (alaski)
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Change abandoned on nova (master)

Change abandoned by Qin Zhao (<email address hidden>) on branch: master
Review: https://review.openstack.org/171506

Andrew Laski (alaski)
tags: added: kilo-backport-potential
Revision history for this message
Joe Gordon (jogo) wrote :

I agree this is a big pain, but this isn't in progress and I don't think its worth holding up an RC for.

Revision history for this message
Andrew Laski (alaski) wrote :

This is in progress with https://review.openstack.org/#/c/171299/. Not sure why that didn't get picked up and added here.

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to nova (master)

Reviewed: https://review.openstack.org/171299
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=90f114969715f35e8c2fdba49096324b8bc9b7e0
Submitter: Jenkins
Branch: master

commit 90f114969715f35e8c2fdba49096324b8bc9b7e0
Author: Andrew Laski <email address hidden>
Date: Tue Apr 7 12:59:41 2015 -0400

    Store context in local store after spawn_n

    After a utils.spawn_n call we lose the context for logging because it's
    not in the local thread store. This was causing log lines to miss
    context. Additionally a call to context.get_admin_context() would add
    that context to the local store making it appear that the context had
    changed during a request.

    This wraps methods being called with spawn_n so that they add the
    context to the local store immediately.

    Closes-Bug: 1404268

    Change-Id: Ia5175c9729069df3d779237ba6039cf5bc920018

Changed in nova:
status: In Progress → Fix Committed
Thierry Carrez (ttx)
tags: removed: kilo-rc-potential
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Related fix proposed to nova (master)

Related fix proposed to branch: master
Review: https://review.openstack.org/183144

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Related fix merged to nova (master)

Reviewed: https://review.openstack.org/183144
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=7f293ec49ce87ec022798003470f5903dae603fa
Submitter: Jenkins
Branch: master

commit 7f293ec49ce87ec022798003470f5903dae603fa
Author: Qin Zhao <email address hidden>
Date: Fri May 15 01:54:21 2015 +0800

    Ensure to store context in thread local after spawn/spawn_n

    https://review.openstack.org/#/c/171299/ introduces a wrapper for utils.spawn_n
    to update context in thread local store. However, there are other routines in
    driver code which calls greenthread.spawn or greenthread.spawn_n, so that they
    will not update context in thread local store. The commit adds utils.spawn as a
    new wrapper, and also make those codes call spawn/spawn_n of utils, in order to
    ensure the context of logging is correct.

    Change-Id: I3623e60c49e442e2431cf017540422aa59bc285a
    Related-Bug: 1404268

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to nova (stable/kilo)

Fix proposed to branch: stable/kilo
Review: https://review.openstack.org/193649

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Related fix proposed to nova (stable/kilo)

Related fix proposed to branch: stable/kilo
Review: https://review.openstack.org/193653

Thierry Carrez (ttx)
Changed in nova:
milestone: none → liberty-1
status: Fix Committed → Fix Released
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to nova (stable/kilo)

Reviewed: https://review.openstack.org/193649
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=fc7f1ab6140093d91e3b6d91fe48f3e8bf6df219
Submitter: Jenkins
Branch: stable/kilo

commit fc7f1ab6140093d91e3b6d91fe48f3e8bf6df219
Author: Andrew Laski <email address hidden>
Date: Tue Apr 7 12:59:41 2015 -0400

    Store context in local store after spawn_n

    After a utils.spawn_n call we lose the context for logging because it's
    not in the local thread store. This was causing log lines to miss
    context. Additionally a call to context.get_admin_context() would add
    that context to the local store making it appear that the context had
    changed during a request.

    This wraps methods being called with spawn_n so that they add the
    context to the local store immediately.

    Closes-Bug: 1404268

    Change-Id: Ia5175c9729069df3d779237ba6039cf5bc920018
    (cherry picked from commit 90f114969715f35e8c2fdba49096324b8bc9b7e0)

tags: added: in-stable-kilo
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Related fix merged to nova (stable/kilo)

Reviewed: https://review.openstack.org/193653
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=48a62170065386cea6004c090fb5273ed9ec8d2b
Submitter: Jenkins
Branch: stable/kilo

commit 48a62170065386cea6004c090fb5273ed9ec8d2b
Author: Qin Zhao <email address hidden>
Date: Fri May 15 01:54:21 2015 +0800

    Ensure to store context in thread local after spawn/spawn_n

    https://review.openstack.org/#/c/171299/ introduces a wrapper for utils.spawn_n
    to update context in thread local store. However, there are other routines in
    driver code which calls greenthread.spawn or greenthread.spawn_n, so that they
    will not update context in thread local store. The commit adds utils.spawn as a
    new wrapper, and also make those codes call spawn/spawn_n of utils, in order to
    ensure the context of logging is correct.

    Change-Id: I3623e60c49e442e2431cf017540422aa59bc285a
    Related-Bug: 1404268
    (cherry picked from commit 7f293ec49ce87ec022798003470f5903dae603fa)

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Related fix proposed to nova (master)

Related fix proposed to branch: master
Review: https://review.openstack.org/211643

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Related fix merged to nova (master)

Reviewed: https://review.openstack.org/195713
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=97821a9c7e1aff420ef2de319e158630ce811551
Submitter: Jenkins
Branch: master

commit 97821a9c7e1aff420ef2de319e158630ce811551
Author: Ryan Rossiter <email address hidden>
Date: Tue Aug 11 15:10:48 2015 +0000

    Add hacking check for greenthread.spawn()

    Because greenthread.spawn() and spawn_n() are missing a nova context
    (see I3623e60c49e442e2431cf017540422aa59bc285a and
    Ia5175c9729069df3d779237ba6039cf5bc920018), nova.utils.spawn() and
    spawn_n() should be used when spawning threads. This change adds a
    hacking check to assert this is being done during pep8.

    Change-Id: Id52c30bb5ded2184d772e6026b0f04f9a0efeb55
    Related-Bug: #1404268
    Closes-Bug: #1468513

Revision history for this message
OpenStack Infra (hudson-openstack) wrote :

Reviewed: https://review.openstack.org/211643
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=6179854ae77960d4ae883349626069023b407cb0
Submitter: Jenkins
Branch: master

commit 6179854ae77960d4ae883349626069023b407cb0
Author: Ryan Rossiter <email address hidden>
Date: Thu Jun 25 14:41:49 2015 +0000

    Add hacking check for eventlet.spawn()

    Change Id52c30bb5ded2184d772e6026b0f04f9a0efeb55 added a hacking check
    for greenthread.spawn(). Since eventlet.spawn() calls
    greenthread.spawn() under the covers, it should also be checked. Because
    there are still occurrences of eventlet.spawn(), these were also cleaned
    up in order to pass the added hacking check.

    Co-Authored-By: Qin Zhao <email address hidden>

    Change-Id: Ia125b4ad5e84bf48091af5a7a483b89629f0ec31
    Related-Bug: #1404268
    Closes-Bug: #1468513

Thierry Carrez (ttx)
Changed in nova:
milestone: liberty-1 → 12.0.0
To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.