SM: main thread could possibly block heartbeat thread

Bug #2025504 reported by Bin Qian
This bug affects 1 person
Affects: StarlingX
Status: Fix Released
Importance: Medium
Assigned to: Bin Qian
Milestone: (none listed)

Bug Description

With the introduction of cluster hbs info, SM sends the alive pulse and queries hbs cluster info over the same socket, from both the main thread and the heartbeat thread. With the current synchronization mechanism, the main thread could block the more time-sensitive heartbeat thread.

Tags: stx.9.0 stx.ha
Changed in starlingx:
status: New → In Progress
Ghada Khalil (gkhalil) wrote :
tags: added: stx.ha
Changed in starlingx:
assignee: nobody → Bin Qian (bqian20)
importance: Undecided → Medium
OpenStack Infra (hudson-openstack) wrote : Change abandoned on ha (master)

Change abandoned by "Eric MacDonald <email address hidden>" on branch: master
Review: https://review.opendev.org/c/starlingx/ha/+/887426
Reason: Another change, which is slightly different, will be posted for review instead.

OpenStack Infra (hudson-openstack) wrote : Fix merged to ha (master)

Reviewed: https://review.opendev.org/c/starlingx/ha/+/887426
Committed: https://opendev.org/starlingx/ha/commit/d91b069daf43161c52120bc146422c2f8704bedb
Submitter: "Zuul (22348)"
Branch: master

commit d91b069daf43161c52120bc146422c2f8704bedb
Author: Bin Qian <email address hidden>
Date: Fri Jun 30 18:42:35 2023 +0000

    Avoid potential blocking of heartbeat thread

    This is to avoid waiting for hbs cluster query for sending SM alive
    pulse. When a hbs cluster query or alive pulse is being sent, do not
    queue the subsequent alive pulse, as current request being sent is good
    enough to update hbs agent.
    Also move the function retrieving sock address to initial from inside
    the query sending procedure. The function getaddrinfo to avoid indirectly
    calling malloc, which invokes malloc_atfork to potentially a blocking call.

    TCs:
       This could improve in extreme situation only, passed regression.

    Closes-bug: 2025504

    Change-Id: I520b42f0330b670e301279c2e42670d40361adc5
    Signed-off-by: Bin Qian <email address hidden>
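
The commit message above describes two ideas: skip an alive pulse when another request is already in flight on the shared socket, and resolve the socket address once at initialization so the send path never calls getaddrinfo. A minimal sketch of that pattern follows. It is not the actual SM code: every name (request_lock, hbs_client_init, send_alive_pulse, send_cluster_query), the payloads, and the use of a pthread mutex with trylock are assumptions made for illustration.

```c
#include <netdb.h>
#include <pthread.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Hypothetical names throughout; this is an illustration, not SM source. */
static pthread_mutex_t request_lock = PTHREAD_MUTEX_INITIALIZER;
static struct sockaddr_storage hbs_addr;   /* resolved once at init */
static socklen_t hbs_addr_len = 0;

/* Resolve the hbs agent address up front, so the time-sensitive send
 * path never calls getaddrinfo (which may allocate memory). */
int hbs_client_init(const char *host, const char *port)
{
    struct addrinfo hints, *res = NULL;

    memset(&hints, 0, sizeof(hints));
    hints.ai_family = AF_UNSPEC;
    hints.ai_socktype = SOCK_DGRAM;
    if (getaddrinfo(host, port, &hints, &res) != 0 || res == NULL)
        return -1;

    memcpy(&hbs_addr, res->ai_addr, res->ai_addrlen);
    hbs_addr_len = res->ai_addrlen;
    freeaddrinfo(res);
    return 0;
}

/* Heartbeat thread: never wait for the lock.
 * Returns 1 if sent, 0 if skipped (a request is already in flight),
 * -1 on send error. */
int send_alive_pulse(int sock)
{
    static const char pulse[] = "alive";   /* placeholder payload */
    ssize_t n;

    if (pthread_mutex_trylock(&request_lock) != 0)
        return 0;   /* the in-flight request already updates the agent */

    n = sendto(sock, pulse, sizeof(pulse), 0,
               (const struct sockaddr *)&hbs_addr, hbs_addr_len);
    pthread_mutex_unlock(&request_lock);
    return n < 0 ? -1 : 1;
}

/* Main thread: the cluster query is less time sensitive, so it may wait.
 * Returns 1 if sent, -1 on send error. */
int send_cluster_query(int sock)
{
    static const char query[] = "query";   /* placeholder payload */
    ssize_t n;

    pthread_mutex_lock(&request_lock);
    n = sendto(sock, query, sizeof(query), 0,
               (const struct sockaddr *)&hbs_addr, hbs_addr_len);
    pthread_mutex_unlock(&request_lock);
    return n < 0 ? -1 : 1;
}
```

In this sketch the heartbeat thread treats a return value of 0 as success: whichever request holds the lock is assumed to be enough to update the hbs agent, so nothing is queued and the thread returns to its timing loop immediately.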

Changed in starlingx:
status: In Progress → Fix Released
Ghada Khalil (gkhalil)
tags: added: stx.9.0