The unshelve notification functional sample test fails intermittently

Bug #1835070 reported by Balazs Gibizer
This bug affects 1 person
Affects                   Status         Importance  Assigned to     Milestone
OpenStack Compute (nova)  Fix Released   Low         Balazs Gibizer
Pike                      Fix Released   Low         Balazs Gibizer
Queens                    Fix Committed  Low         Balazs Gibizer
Rocky                     Fix Committed  Low         Balazs Gibizer
Stein                     Fix Committed  Low         Balazs Gibizer

Bug Description

The notification sample test for unshelve waits for the instance to reach the ACTIVE state and then asserts that the unshelve.end notification is emitted properly. However, instance.vm_state is set to ACTIVE earlier[1] than the unshelve.end notification is emitted[2]. This can cause two different test case failures.

1) _test_unshelve_server() fails because no unshelve.end notification is received.

2) _test_shelve_and_shelve_offload_server() also has an unshelve action at the end, and that test step also waits only for the ACTIVE state. So the unshelve.end notification from the end of _test_shelve_and_shelve_offload_server() can bleed into the _test_unshelve_server() step, causing it to receive one notification more than expected.

[1] https://github.com/openstack/nova/blob/5c6c1f8fce7cd976dedc0a1ad28836ed87af2780/nova/compute/manager.py#L5322-L5326
[2] https://github.com/openstack/nova/blob/5c6c1f8fce7cd976dedc0a1ad28836ed87af2780/nova/compute/manager.py#L5329
[3] https://github.com/openstack/nova/blob/5c6c1f8fce7cd976dedc0a1ad28836ed87af2780/nova/tests/functional/notification_sample_tests/test_instance.py#L836
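
The ordering problem described above can be reproduced in isolation. The following is a minimal sketch, not nova code: `FakeCompute`, `wait_for_state`, `racy_check` and the `emit_gate` event are hypothetical names introduced only for illustration, and the gate exists purely to force the unlucky ordering on every run instead of intermittently.

```python
import threading
import time


class FakeCompute:
    """Hypothetical stand-in for the unshelve path; not nova code."""

    def __init__(self):
        self.vm_state = "SHELVED_OFFLOADED"
        self.notifications = []
        # Artificial gate so the unlucky ordering happens on every run.
        self.emit_gate = threading.Event()

    def unshelve(self):
        # The state becomes ACTIVE first ...
        self.vm_state = "ACTIVE"
        # ... and the end notification is emitted only afterwards.
        self.emit_gate.wait()
        self.notifications.append("instance.unshelve.end")


def wait_for_state(compute, state, timeout=5.0):
    """Poll until the fake instance reaches the given state."""
    deadline = time.monotonic() + timeout
    while compute.vm_state != state:
        if time.monotonic() >= deadline:
            raise AssertionError("timed out waiting for %s" % state)
        time.sleep(0.01)


def racy_check():
    """Return what a test sees right after ACTIVE, and what arrives later."""
    compute = FakeCompute()
    worker = threading.Thread(target=compute.unshelve)
    worker.start()
    wait_for_state(compute, "ACTIVE")
    seen_at_assert_time = list(compute.notifications)
    compute.emit_gate.set()  # let the late notification through
    worker.join()
    return seen_at_assert_time, list(compute.notifications)
```

In this sketch the snapshot taken right after ACTIVE is empty, which corresponds to failure 1). The notification that arrives afterwards is the one that could be counted by the following test step, which corresponds to failure 2).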

Tags: testing
Revision history for this message
Balazs Gibizer (balazs-gibizer) wrote :

I got reports that both cases happen in a downstream env, and I can reproduce them locally. However, I did not find occurrences of this fault in logstash.

Changed in nova:
assignee: nobody → Balazs Gibizer (balazs-gibizer)
status: New → In Progress
Matt Riedemann (mriedem)
tags: added: testing
Changed in nova:
importance: Undecided → Low
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to nova (master)

Reviewed: https://review.opendev.org/668675
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=40f1e7c4c22dc6023614b34f28bb7fc416b668a8
Submitter: Zuul
Branch: master

commit 40f1e7c4c22dc6023614b34f28bb7fc416b668a8
Author: Balazs Gibizer <email address hidden>
Date: Tue Jul 2 14:56:40 2019 +0200

    Stabilize unshelve notification sample tests

    The notification sample test for unshelve waits for the instance to
    reach the ACTIVE state and then asserts that the unshelve.end
    notification is emitted properly. However, instance.vm_state is set to
    ACTIVE earlier[1] than the unshelve.end notification is emitted[2].
    This can cause two different test case failures.

    1) _test_unshelve_server() fails because no unshelve.end notification
    is received.

    2) _test_shelve_and_shelve_offload_server() also has an unshelve action
    at the end, and that test step also waits only for the ACTIVE state.
    So the unshelve.end notification from the end of
    _test_shelve_and_shelve_offload_server() can bleed into the
    _test_unshelve_server() step, causing it to receive one notification
    more than expected.

    So this patch adds an extra
    self._wait_for_notification('instance.unshelve.end') call to each test
    step to prevent the instability.

    [1] https://github.com/openstack/nova/blob/5c6c1f8f/nova/compute/manager.py#L5322-L5326
    [2] https://github.com/openstack/nova/blob/5c6c1f8f/nova/compute/manager.py#L5329
    [3] https://github.com/openstack/nova/blob/5c6c1f8f/nova/tests/functional/notification_sample_tests/test_instance.py#L836

    Closes-Bug: #1835070

    Change-Id: Ie217523a8969326b27930d7f74e50e9b352ab7a1

Changed in nova:
status: In Progress → Fix Released
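
The merged fix relies on waiting for the notification itself rather than for the instance state. A minimal sketch of such a wait is shown below. It is an assumption-laden illustration, not nova's actual `_wait_for_notification` helper: the function signature, the plain list of dicts used as the notification store, and the `demo` function are all hypothetical.

```python
import threading
import time


def wait_for_notification(notifications, event_type, timeout=5.0, interval=0.01):
    """Block until a notification of event_type is recorded, or fail."""
    deadline = time.monotonic() + timeout
    while True:
        matches = [n for n in notifications if n["event_type"] == event_type]
        if matches:
            return matches
        if time.monotonic() >= deadline:
            raise AssertionError(
                "no %s notification within %.2fs" % (event_type, timeout))
        time.sleep(interval)


def demo():
    """Emit the notification late and show that the wait still catches it."""
    recorded = []
    late = threading.Timer(
        0.05, recorded.append, args=[{"event_type": "instance.unshelve.end"}])
    late.start()
    matches = wait_for_notification(recorded, "instance.unshelve.end")
    late.join()
    return matches
```

Because the wait returns only once the event has been recorded, a later assertion on the notification list no longer depends on how quickly the notification follows the state change.
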
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to nova (stable/stein)

Fix proposed to branch: stable/stein
Review: https://review.opendev.org/668806

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to nova (stable/stein)

Reviewed: https://review.opendev.org/668806
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=3bc6ff029ff24083844db363010988d8d08cff00
Submitter: Zuul
Branch: stable/stein

commit 3bc6ff029ff24083844db363010988d8d08cff00
Author: Balazs Gibizer <email address hidden>
Date: Tue Jul 2 14:56:40 2019 +0200

    Stabilize unshelve notification sample tests

    Closes-Bug: #1835070

    Change-Id: Ie217523a8969326b27930d7f74e50e9b352ab7a1
    (cherry picked from commit 40f1e7c4c22dc6023614b34f28bb7fc416b668a8)

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to nova (stable/rocky)

Fix proposed to branch: stable/rocky
Review: https://review.opendev.org/669118

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to nova (stable/rocky)

Reviewed: https://review.opendev.org/669118
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=a85ce04fa39e60e672e4fa2d7912f6880079c6ef
Submitter: Zuul
Branch: stable/rocky

commit a85ce04fa39e60e672e4fa2d7912f6880079c6ef
Author: Balazs Gibizer <email address hidden>
Date: Tue Jul 2 14:56:40 2019 +0200

    Stabilize unshelve notification sample tests

    Conflicts:
          nova/tests/functional/notification_sample_tests/test_instance.py
    Conflicts due to:
    * I019e88fabd1d386c0d6395a7b1969315873485fd

    Closes-Bug: #1835070

    Change-Id: Ie217523a8969326b27930d7f74e50e9b352ab7a1
    (cherry picked from commit 40f1e7c4c22dc6023614b34f28bb7fc416b668a8)
    (cherry picked from commit 3bc6ff029ff24083844db363010988d8d08cff00)

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to nova (stable/queens)

Fix proposed to branch: stable/queens
Review: https://review.opendev.org/674636

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to nova (stable/queens)

Reviewed: https://review.opendev.org/674636
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=8afc39a2c748a8fde070493dd5633788f782c69a
Submitter: Zuul
Branch: stable/queens

commit 8afc39a2c748a8fde070493dd5633788f782c69a
Author: Balazs Gibizer <email address hidden>
Date: Tue Jul 2 14:56:40 2019 +0200

    Stabilize unshelve notification sample tests

    Conflicts:
          nova/tests/functional/notification_sample_tests/test_instance.py
    Conflicts due to:
    * I1a0afa0e8740c229db77c18b932e316196880de5

    Closes-Bug: #1835070

    Change-Id: Ie217523a8969326b27930d7f74e50e9b352ab7a1
    (cherry picked from commit 40f1e7c4c22dc6023614b34f28bb7fc416b668a8)
    (cherry picked from commit 3bc6ff029ff24083844db363010988d8d08cff00)
    (cherry picked from commit a85ce04fa39e60e672e4fa2d7912f6880079c6ef)

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/nova 19.0.2

This issue was fixed in the openstack/nova 19.0.2 release.

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/nova 18.2.2

This issue was fixed in the openstack/nova 18.2.2 release.

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to nova (stable/pike)

Fix proposed to branch: stable/pike
Review: https://review.opendev.org/676677

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/nova 17.0.12

This issue was fixed in the openstack/nova 17.0.12 release.

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/nova 20.0.0.0rc1

This issue was fixed in the openstack/nova 20.0.0.0rc1 release candidate.

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to nova (stable/pike)

Reviewed: https://review.opendev.org/676677
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=d89a0d1f96df1cfe4c9e8025b147508315ca239d
Submitter: Zuul
Branch: stable/pike

commit d89a0d1f96df1cfe4c9e8025b147508315ca239d
Author: Balazs Gibizer <email address hidden>
Date: Tue Jul 2 14:56:40 2019 +0200

    Stabilize unshelve notification sample tests

    Conflicts:
          nova/tests/functional/notification_sample_tests/test_instance.py
    Conflicts due to:
    * Ie7a886f5b389c2f0bb9dc66129e4562cc09ba1b5

    Also, in pike there is no _test_shelve_and_shelve_offload_server step but
    two distinct steps, _test_shelve_server and _test_shelve_offload_server, so
    this patch needed to be extended on pike. In the _test_shelve_server step
    we need to wait for instance.power_on.end instead of instance.unshelve.end,
    as unshelve without offload just powers on the instance.

    Closes-Bug: #1835070

    Change-Id: Ie217523a8969326b27930d7f74e50e9b352ab7a1
    (cherry picked from commit 40f1e7c4c22dc6023614b34f28bb7fc416b668a8)
    (cherry picked from commit 3bc6ff029ff24083844db363010988d8d08cff00)
    (cherry picked from commit a85ce04fa39e60e672e4fa2d7912f6880079c6ef)
    (cherry picked from commit 8afc39a2c748a8fde070493dd5633788f782c69a)
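
The Pike-specific note in the commit message above amounts to choosing a different end notification depending on whether the instance was offloaded before the unshelve. A small sketch of that choice follows. The function name and its boolean parameter are hypothetical and only illustrate the rule stated in the commit message.

```python
def end_notification_for_unshelve(was_offloaded):
    """Pick the event a test step should wait for after unshelving."""
    if was_offloaded:
        # Unshelving an offloaded instance emits the unshelve end event.
        return "instance.unshelve.end"
    # Unshelve without offload just powers the instance on.
    return "instance.power_on.end"
```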

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/nova pike-eol

This issue was fixed in the openstack/nova pike-eol release.
