Sysinv-api crashed during 250 subcloud deployment

Bug #1974194 reported by Iago Filipe
This bug affects 1 person
Affects | Status | Importance | Assigned to | Milestone
StarlingX | Fix Released | Low | Iago Filipe |

Bug Description

Brief Description
Many deploy-prep-failed errors were generated during the 250-subcloud deployment test, which eventually took down both sysinv-api and dcmanager-manager.

Severity
Major

Steps to Reproduce
Deploy a batch of 250 virtual subclouds

Expected Behavior
This is an exploratory test to identify the bottlenecks of 250 parallel subcloud deployments (without remote install). I expected some deployment failures, but not a service crash.

Actual Behavior
sysinv-api restarts after approximately 150 parallel route creation requests.
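
A minimal sketch of how such a parallel burst could be driven and tallied outside a full subcloud deployment. This is not the test tooling used for this bug: `fire_in_parallel` and `stub_route_create` are hypothetical names, and the stub stands in for whatever client call actually creates a route through sysinv-api, which is not shown in this report.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def fire_in_parallel(send_request, count, max_workers=None):
    """Call send_request(i) `count` times concurrently and tally outcomes.

    Returns a Counter keyed by "ok" or by the exception class name.
    """
    def attempt(i):
        try:
            send_request(i)
            return "ok"
        except Exception as exc:  # record the failure type, keep going
            return type(exc).__name__

    # One worker per request by default, so all requests are in flight together.
    with ThreadPoolExecutor(max_workers=max_workers or count) as pool:
        return Counter(pool.map(attempt, range(count)))


def stub_route_create(i):
    """Hypothetical placeholder; replace with a real route creation call."""
    return None


if __name__ == "__main__":
    outcome = fire_in_parallel(stub_route_create, 250)
    print(dict(outcome))
```

Replacing the stub with a real request and comparing the "ok" count against the other keys gives a rough picture of the point at which requests start failing.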

Reproducibility
100% reproducible

System Configuration
Distributed Cloud
StarlingX master
BUILD_DATE="2022-04-02 20:07:42 -0400"

Last Pass
N/A

Timestamp/Logs

Alarms

Test Activity
Developer Testing

Workaround

OpenStack Infra (hudson-openstack) wrote : Fix proposed to config (master)

Fix proposed to branch: master
Review: https://review.opendev.org/c/starlingx/config/+/842599

Changed in starlingx:
status: New → In Progress
OpenStack Infra (hudson-openstack) wrote : Fix proposed to stx-puppet (master)

Fix proposed to branch: master
Review: https://review.opendev.org/c/starlingx/stx-puppet/+/842604

Ghada Khalil (gkhalil)
tags: added: stx.7.0 stx.config stx.distcloud
Changed in starlingx:
assignee: nobody → Iago Filipe (ifest1)
importance: Undecided → Medium
Ghada Khalil (gkhalil) wrote :

screening: lowering the priority given this is a scalability issue. This doesn't hold up the stx.7.0 release.

Changed in starlingx:
importance: Medium → Low
tags: removed: stx.7.0
OpenStack Infra (hudson-openstack) wrote : Fix merged to config (master)

Reviewed: https://review.opendev.org/c/starlingx/config/+/842599
Committed: https://opendev.org/starlingx/config/commit/dc7c3c93800a17e4031444b91880f84b0a33fb28
Submitter: "Zuul (22348)"
Branch: master

commit dc7c3c93800a17e4031444b91880f84b0a33fb28
Author: Iago Estrela <email address hidden>
Date: Thu May 19 10:49:20 2022 -0300

    Fix sysinv-api crash with 250 parallel requests

    Sysinv WSGI server is crashing after receiving 250 requests in parallel
    in route creation endpoint due to limited number of threads to handle
    requests.

    Closes-Bug: 1974194

    Test plan:
    PASS: Hit route creation endpoint 250 times in parallel and verify
          WSGI Server didn't restart.

    Signed-off-by: Iago Estrela <email address hidden>
    Change-Id: I89012d7f8c7693cd3dc078d9f67ddffb4308e254

Changed in starlingx:
status: In Progress → Fix Released
Ghada Khalil (gkhalil)
tags: added: stx.7.0
OpenStack Infra (hudson-openstack) wrote : Change abandoned on stx-puppet (master)

Change abandoned by "John Kung <email address hidden>" on branch: master
Review: https://review.opendev.org/c/starlingx/stx-puppet/+/842604
Reason: Abandon review, reason as per prior comment
