nova-scheduler doesn't follow worker-multiplier config in rocky or later

Bug #1889756 reported by Nobuto Murata
This bug affects 2 people
Affects: OpenStack Nova Cloud Controller Charm
Status: Fix Released
Importance: High
Assigned to: Nobuto Murata
Milestone: 20.10

Bug Description

In a deployed cloud, the nova-cloud-controller unit runs far more nova-scheduler processes than necessary (80), while the other services are limited to 20 workers each by worker-multiplier=0.25 on a system with 80 CPU threads.

The schedulers open about 240 MySQL connections from one unit, so 3 HA units hold about 720 connections even when the cloud is idle. This could take the cloud down once it is under load.

$ sudo ss -tp | grep :mysql | grep nova-scheduler -c
241

$ pgrep -af nova | cut -d ' ' -f2- | sort | uniq -c
      1 bash /etc/systemd/system/jujud-unit-hacluster-nova-1-exec-start.sh
      1 bash /etc/systemd/system/jujud-unit-nova-cloud-controller-0-exec-start.sh
     21 /usr/bin/python3 /usr/bin/nova-conductor --config-file=/etc/nova/nova.conf --log-file=/var/log/nova/nova-conductor.log
      1 /usr/bin/python3 /usr/bin/nova-consoleauth --config-file=/etc/nova/nova.conf --log-file=/var/log/nova/nova-consoleauth.log
     81 /usr/bin/python3 /usr/bin/nova-scheduler --config-file=/etc/nova/nova.conf --log-file=/var/log/nova/nova-scheduler.log
      1 /usr/bin/python3 /usr/bin/nova-spicehtml5proxy --config-file=/etc/nova/nova.conf --log-file=/var/log/nova/nova-spiceproxy.log
      1 /var/lib/juju/tools/unit-hacluster-nova-1/jujud unit --data-dir /var/lib/juju --unit-name hacluster-nova/1 --debug
      1 /var/lib/juju/tools/unit-nova-cloud-controller-0/jujud unit --data-dir /var/lib/juju --unit-name nova-cloud-controller/0 --debug
     20 (wsgi:nova-api-os -k start
     20 (wsgi:nova_meta) -k start
     20 (wsgi:nova-placem -k start
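
The connection count above follows from simple multiplication. A minimal sketch of that arithmetic, assuming roughly 3 MySQL connections per nova-scheduler worker (a figure inferred from this report's 240 connections across 80 workers, not taken from Nova documentation); the function name is hypothetical:

```python
def scheduler_mysql_connections(workers, conns_per_worker=3, units=1):
    """Estimate MySQL connections held by nova-scheduler workers.

    conns_per_worker=3 is an assumption derived from the numbers in this
    bug report (about 240 connections / 80 workers), not a documented value.
    """
    return workers * conns_per_worker * units


# Figures from the report: 80 workers per unit, 3 HA units.
per_unit = scheduler_mysql_connections(80)
across_ha = scheduler_mysql_connections(80, units=3)
print(per_unit, across_ha)
```

Under the same assumption, the 20 workers that worker-multiplier=0.25 gives the other services would mean 60 connections per unit and 180 across 3 units.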

Nobuto Murata (nobuto) wrote :

I think the equivalent config is:

[etc/nova/nova.conf.sample]
====

[scheduler]

...

#
# Number of workers for the nova-scheduler service. The default will be the
# number of CPUs available if using the "filter_scheduler" scheduler driver,
# otherwise the default will be 1.
# (integer value)
# Minimum value: 0
#workers = <None>

====
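
The default described in that sample text can be sketched as follows. This only illustrates the documented behavior and is not Nova's actual code; the function and parameter names are hypothetical:

```python
import os


def effective_scheduler_workers(configured=None, driver="filter_scheduler",
                                cpu_count=None):
    """Return the worker count implied by the [scheduler]/workers help text.

    configured: value of workers in nova.conf, or None when unset.
    driver:     scheduler driver name.
    cpu_count:  override for the number of CPUs (defaults to this host's).
    """
    if configured is not None:
        # An explicit setting wins over the default.
        return configured
    if cpu_count is None:
        cpu_count = os.cpu_count() or 1
    # Unset: one worker per CPU for filter_scheduler, otherwise a single worker.
    return cpu_count if driver == "filter_scheduler" else 1


# On the 80-thread host from this report, leaving workers unset gives 80.
print(effective_scheduler_workers(cpu_count=80))
```

This is why an unset workers option produces one scheduler worker per CPU thread, regardless of the charm's worker-multiplier.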

description: updated
Nobuto Murata (nobuto) wrote :

Hmm, the behavior may be different across OpenStack releases. I'm on cloud:bionic-stein.

Nobuto Murata (nobuto) wrote :

This is the related commit:
https://opendev.org/openstack/nova/commit/09898781656c987afe7019aaa63a68eda142f72e

$ git branch -r --contains 09898781656c987afe7019aaa63a68eda142f72e
  origin/HEAD -> origin/master
  origin/master
  origin/stable/rocky
  origin/stable/stein
  origin/stable/train
  origin/stable/ussuri

Nobuto Murata (nobuto)
Changed in charm-nova-cloud-controller:
assignee: nobody → Nobuto Murata (nobuto)
summary: - nova-scheduler doesn't follow worker-multiplier config
+ nova-scheduler doesn't follow worker-multiplier config in rocky or later
OpenStack Infra (hudson-openstack) wrote : Fix proposed to charm-nova-cloud-controller (master)

Fix proposed to branch: master
Review: https://review.opendev.org/744214

Changed in charm-nova-cloud-controller:
status: New → In Progress
Nobuto Murata (nobuto) wrote :

Subscribing ~field-high.

We've found that Stein-based deployments tend to hit MySQL 'Too many connections' errors more often than Queens-based ones. Because of this issue, nova-scheduler processes consumed 720 connections (3 HA units on 40-core / 80-thread servers) out of 2000, which makes nova-scheduler the biggest consumer of available MySQL connections.

OpenStack Infra (hudson-openstack) wrote : Fix proposed to charm-nova-cloud-controller (stable/20.08)

Fix proposed to branch: stable/20.08
Review: https://review.opendev.org/747359

OpenStack Infra (hudson-openstack) wrote : Fix merged to charm-nova-cloud-controller (master)

Reviewed: https://review.opendev.org/744214
Committed: https://git.openstack.org/cgit/openstack/charm-nova-cloud-controller/commit/?id=dde75693c71fd23ae85b8d3f1ae2dfa685df3170
Submitter: Zuul
Branch: master

commit dde75693c71fd23ae85b8d3f1ae2dfa685df3170
Author: Nobuto Murata <email address hidden>
Date: Sat Aug 1 01:11:57 2020 +0900

    Set up nova-scheduler processes based on worker-multiplier

    Upstream Nova introduced multiple scheduler support in Rocky. Apply the
    number of scheduler workers based on worker-multiplier so users can
    control resource consumption instead of having the same number of
    workers with the available CPU threads.

    Change-Id: Ia6f14a98ce3e5649f290561f59d691ded3d19177
    Closes-Bug: #1889756
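
A rough sketch of the idea in this commit, not the charm's actual implementation: derive a worker count from the CPU thread count and worker-multiplier, then write it to the [scheduler] section of nova.conf. The helper names and the floor of 1 worker are assumptions for illustration:

```python
def workers_from_multiplier(cpu_threads, multiplier):
    """Scale the CPU thread count by the multiplier, never going below 1.

    The floor of 1 is an assumption so that a small multiplier still
    leaves one worker running.
    """
    return max(1, int(cpu_threads * multiplier))


def render_scheduler_section(workers):
    """Render a hypothetical nova.conf fragment with an explicit worker count."""
    return "[scheduler]\nworkers = {}\n".format(workers)


# 80 CPU threads with worker-multiplier=0.25, as in this report.
workers = workers_from_multiplier(80, 0.25)
print(render_scheduler_section(workers))
```

With an explicit workers value in place, nova-scheduler no longer falls back to one worker per CPU thread.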

Changed in charm-nova-cloud-controller:
status: In Progress → Fix Committed
Liam Young (gnuoy)
Changed in charm-nova-cloud-controller:
importance: Undecided → High
Changed in charm-nova-cloud-controller:
milestone: none → 20.10
Changed in charm-nova-cloud-controller:
status: Fix Committed → Fix Released