Bug #1785841 “Event execution failures for back to back leases” : Bugs : Blazar

Pierre Riteau (priteau) on 2018-08-07

Changed in blazar:
assignee:	nobody → Pierre Riteau (priteau)
importance:	Undecided → High

Masahito Muroi (muroi-masahito) wrote on 2018-08-08:

#1

I imagine one of the purposes of the cleaning time blueprint is to resolve this issue. The first use case for the new feature comes from Ironic, but it basically fixes this issue too.

https://blueprints.launchpad.net/blazar/+spec/cleaning-time-allowance

Masahito Muroi (muroi-masahito) wrote on 2018-08-08:

#2

Oops, I mean we've already fixed the bug and don't need to worry about it.

OpenStack Infra (hudson-openstack) wrote on 2018-08-08: Fix proposed to blazar (master)

#3

Fix proposed to branch: master
Review: https://review.openstack.org/589899

Changed in blazar:
status:	New → In Progress

Pierre Riteau (priteau) wrote on 2018-08-08:

#4

Cleaning time would help, but it's not enabled by default. Please check my patch instead: it fixes other issues, such as running before_end_lease after start_lease has completed.

Pierre Riteau (priteau) wrote on 2018-09-10:

#5

I will push an updated patch.

Pierre Riteau (priteau) on 2018-10-15

Changed in blazar:
milestone:	none → stein-1

Pierre Riteau (priteau) on 2018-10-23

Changed in blazar:
milestone:	stein-1 → stein-2

Pierre Riteau (priteau) on 2019-01-10

Changed in blazar:
milestone:	stein-2 → stein-3

Pierre Riteau (priteau) on 2019-04-15

Changed in blazar:
milestone:	stein-3 → train-1

OpenStack Infra (hudson-openstack) wrote on 2022-02-24: Fix merged to blazar (master)

#6

Reviewed: https://review.opendev.org/c/openstack/blazar/+/589899
Committed: https://opendev.org/openstack/blazar/commit/c92edb8a177de51862ad2a4f9cbac2c50d31ef84
Submitter: "Zuul (22348)"
Branch: master

commit c92edb8a177de51862ad2a4f9cbac2c50d31ef84
Author: Pierre Riteau <email address hidden>
Date: Wed Aug 8 12:46:28 2018 +0200

    Prevent conflicting events from running concurrently

    If two leases have compute hosts in common, and the second lease starts
    exactly when the first lease ends, there is the possibility of a race.
    The Blazar manager can first run the start_lease event of the second
    lease. This event would fail since the end_lease event of the first
    lease would still be UNDONE, and the compute hosts in common would still
    be in the aggregate associated with the first lease, instead of being in
    the freepool.

    This patch changes event execution code so that events are executed
    concurrently if possible, with the following constraints:

    - events are executed strictly in order, i.e. events are started only
      after all previous events have completed
    - when events are at the same time, we first execute before_end_lease
      events (unless there is a start_lease at the same time), then
      end_lease events, followed by start_lease events, ensuring the bug
      described above does not happen. Finally, we run any before_end_lease
      which had a corresponding start_lease event at the same time.

    It also has the side effect of providing better stack traces for event
    execution failures, since we call wait() on all GreenThread objects.

    Co-Authored-By: Jason Anderson <email address hidden>
    Change-Id: Ie2339db18e8baee379fbea082f1238ec44fca6b1
    Closes-Bug: #1785841

Changed in blazar:
status:	In Progress → Fix Released
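
The ordering constraints in the commit message above can be illustrated with a short sketch. This is not Blazar's code: the function names `order_events` and `run_batches`, the tuple layout `(time, lease_id, event_type)`, and the use of a standard-library thread pool in place of eventlet green threads are all assumptions made for illustration. It also assumes that a "corresponding" start_lease means one belonging to the same lease.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor
from datetime import datetime


def order_events(events):
    """Group (time, lease_id, event_type) tuples into sequential batches.

    Events inside one batch may run concurrently; batches must run one
    after another. Hypothetical helper, not part of Blazar.
    """
    by_time = defaultdict(list)
    for event in events:
        by_time[event[0]].append(event)

    batches = []
    for when in sorted(by_time):  # strictly in time order
        group = by_time[when]
        # Leases that start at this exact time (assumption: "corresponding"
        # start_lease means same lease id).
        starting = {lease for _, lease, kind in group if kind == "start_lease"}
        phases = [
            # 1. before_end_lease, unless the same lease also starts now
            [e for e in group
             if e[2] == "before_end_lease" and e[1] not in starting],
            # 2. end_lease: hosts go back to the freepool first
            [e for e in group if e[2] == "end_lease"],
            # 3. start_lease: hosts are now free to be claimed
            [e for e in group if e[2] == "start_lease"],
            # 4. before_end_lease of leases that started in phase 3
            [e for e in group
             if e[2] == "before_end_lease" and e[1] in starting],
        ]
        batches.extend(phase for phase in phases if phase)
    return batches


def run_batches(batches, handler):
    """Run each batch concurrently and wait for it before the next one.

    Calling result() on every future re-raises any handler exception,
    loosely mirroring the effect of calling wait() on each GreenThread.
    """
    with ThreadPoolExecutor() as pool:
        for batch in batches:
            futures = [pool.submit(handler, event) for event in batch]
            for future in futures:
                future.result()


# Back-to-back leases: lease-2 starts exactly when lease-1 ends.
boundary = datetime(2018, 8, 7, 12, 0)
example_events = [
    (boundary, "lease-2", "start_lease"),
    (boundary, "lease-1", "end_lease"),
]
example_batches = order_events(example_events)
```

With this input, the end_lease event of lease-1 lands in an earlier batch than the start_lease event of lease-2, regardless of the order in which the events were listed.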

OpenStack Infra (hudson-openstack) wrote on 2022-03-02: Fix proposed to blazar (stable/xena)

#7

Fix proposed to branch: stable/xena
Review: https://review.opendev.org/c/openstack/blazar/+/831509

OpenStack Infra (hudson-openstack) wrote on 2022-03-08: Fix included in openstack/blazar 9.0.0.0rc1

#8

This issue was fixed in the openstack/blazar 9.0.0.0rc1 release candidate.

OpenStack Infra (hudson-openstack) wrote on 2022-05-05: Fix merged to blazar (stable/xena)

#9

Reviewed: https://review.opendev.org/c/openstack/blazar/+/831509
Committed: https://opendev.org/openstack/blazar/commit/c3b851937ebf110184ef46d7bc2ad42e163f92b1
Submitter: "Zuul (22348)"
Branch: stable/xena

commit c3b851937ebf110184ef46d7bc2ad42e163f92b1
Author: Pierre Riteau <email address hidden>
Date: Wed Aug 8 12:46:28 2018 +0200

    Prevent conflicting events from running concurrently

    If two leases have compute hosts in common, and the second lease starts
    exactly when the first lease ends, there is the possibility of a race.
    The Blazar manager can first run the start_lease event of the second
    lease. This event would fail since the end_lease event of the first
    lease would still be UNDONE, and the compute hosts in common would still
    be in the aggregate associated with the first lease, instead of being in
    the freepool.

    This patch changes event execution code so that events are executed
    concurrently if possible, with the following constraints:

    - events are executed strictly in order, i.e. events are started only
      after all previous events have completed
    - when events are at the same time, we first execute before_end_lease
      events (unless there is a start_lease at the same time), then
      end_lease events, followed by start_lease events, ensuring the bug
      described above does not happen. Finally, we run any before_end_lease
      which had a corresponding start_lease event at the same time.

    It also has the side effect of providing better stack traces for event
    execution failures, since we call wait() on all GreenThread objects.

    Co-Authored-By: Jason Anderson <email address hidden>
    Change-Id: Ie2339db18e8baee379fbea082f1238ec44fca6b1
    Closes-Bug: #1785841
    (cherry picked from commit c92edb8a177de51862ad2a4f9cbac2c50d31ef84)

tags:	added: in-stable-xena

OpenStack Infra (hudson-openstack) wrote on 2023-01-18: Fix included in openstack/blazar 8.0.1

#10

This issue was fixed in the openstack/blazar 8.0.1 release.
