apache settings can yield frequent short outages

Bug #1918211 reported by Paul Collins on 2021-03-09
This bug affects 1 person
Affects: Ubuntu Repository Cache Charm
Importance: High
Assigned to: Haw Loeung

Bug Description

While investigating a brief self-resolved alert for one of our cloud mirror regions, I zeroed in on this charm's Apache settings as one possible source of problems.

All of the units in the region are logging frequent scoreboard errors: "AH00288: scoreboard is full, not at MaxRequestWorkers".

Apache is configured with 1 child process which is recycled every 10000 connections:

ubuntu@ip-10-6-64-180:~$ cat /etc/apache2/conf-available/000mpm-worker.conf
# [...]
    StartServers 1
    MinSpareThreads 1280
    MaxSpareThreads 2560
    ThreadLimit 2560
    ThreadsPerChild 2560
    ServerLimit 1
    MaxRequestWorkers 2560
    MaxConnectionsPerChild 10000
ubuntu@ip-10-6-64-180:~$ _

Generally you want MaxRequestWorkers == ServerLimit * ThreadsPerChild, which is the case here.
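
As a quick sanity check of that relationship, here is a minimal Python sketch that parses the directive values from the listing above and tests the equality. The helper names (parse_directives, capacity_is_consistent) are made up for illustration and are not part of the charm.

```python
# Directive values copied from the 000mpm-worker.conf listing above.
CONF = """\
StartServers 1
MinSpareThreads 1280
MaxSpareThreads 2560
ThreadLimit 2560
ThreadsPerChild 2560
ServerLimit 1
MaxRequestWorkers 2560
MaxConnectionsPerChild 10000
"""


def parse_directives(text):
    """Return {directive: int} for simple 'Name <integer>' lines."""
    directives = {}
    for line in text.splitlines():
        parts = line.split()
        # Skip blanks, comments, and anything that is not 'Name <integer>'.
        if len(parts) != 2 or parts[0].startswith("#") or not parts[1].isdigit():
            continue
        directives[parts[0]] = int(parts[1])
    return directives


def capacity_is_consistent(d):
    """True when MaxRequestWorkers == ServerLimit * ThreadsPerChild."""
    return d["MaxRequestWorkers"] == d["ServerLimit"] * d["ThreadsPerChild"]


mpm = parse_directives(CONF)
print(capacity_is_consistent(mpm))  # True for the values above
print(mpm["ServerLimit"])           # 1: a single child holds every worker thread
```

So the sizing itself is consistent; the concern is that all 2560 threads live in one process.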

However, if traffic is distributed reasonably evenly across the backends, they could all reach their 10,000-connection limit at around the same time and all try to recycle their child at once. In that case haproxy would see every backend as down, which would trigger these alerts and leave the service briefly unavailable.

For the environment I was looking at, the 10,000 connection limit per child yields the equivalent of a graceful restart every 7-8 minutes.
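
To put that interval in terms of load, a rough back-of-the-envelope calculation in Python follows. It assumes connections arrive at a steady rate, which is a simplification; only the 10,000 limit and the 7-8 minute interval come from the observations above.

```python
MAX_CONNECTIONS_PER_CHILD = 10_000  # from 000mpm-worker.conf


def connections_per_second(limit, minutes_between_recycles):
    """Average connection rate implied by hitting `limit` every N minutes."""
    return limit / (minutes_between_recycles * 60)


def recycles_per_day(minutes_between_recycles):
    """How many times per day the child is recycled at that interval."""
    return 24 * 60 / minutes_between_recycles


for minutes in (7, 8):
    rate = connections_per_second(MAX_CONNECTIONS_PER_CHILD, minutes)
    print(f"{minutes} min: {rate:.1f} conn/s, "
          f"{recycles_per_day(minutes):.0f} recycles/day")
# 7 min: 23.8 conn/s, 206 recycles/day
# 8 min: 20.8 conn/s, 180 recycles/day
```

That is on the order of 20-24 connections per second per unit, and roughly 180-200 recycles of the sole child per unit per day.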

And since there is only one child, it has to wait for pending requests to finish and logging to complete before it can be replaced, and in the meantime its empty scoreboard slots cannot be used for fresh requests. So we can easily end up with the main apache process seeing a full scoreboard without having reached MaxRequestWorkers.

The sole child will also cause a similar problem at logrotate time, since that happens at the same time on all units (in the whole world, even!).

I think at least the following needs to be done:

 - review whether MaxConnectionsPerChild is needed at all
 - configure Apache to have more than one child process
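
To illustrate the second point, here is a small Python sketch that splits the same 2560 workers across several children and shows how much capacity is tied up while one child drains. The child counts are arbitrary examples, not tested recommendations and not necessarily what the eventual fix uses; split_layout and share_draining are hypothetical helpers. It also assumes only one child is recycling at a time, which may not hold if all children start together and receive even traffic.

```python
def split_layout(max_request_workers, children):
    """Spread max_request_workers evenly over `children` worker processes."""
    if children < 1 or max_request_workers % children != 0:
        raise ValueError("children must evenly divide max_request_workers")
    threads = max_request_workers // children
    return {
        "ServerLimit": children,
        "ThreadLimit": threads,
        "ThreadsPerChild": threads,
        "MaxRequestWorkers": max_request_workers,
    }


def share_draining(children):
    """Fraction of worker threads held by a single child that is draining."""
    return 1 / children


for children in (1, 4, 8):
    layout = split_layout(2560, children)
    print(children, layout["ThreadsPerChild"], f"{share_draining(children):.1%}")
# 1 2560 100.0%
# 4 640 25.0%
# 8 320 12.5%
```

With a single child, every recycle or logrotate takes the whole unit's capacity with it; with several children, the remaining ones can keep accepting connections while one finishes its pending requests.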

Paul Collins (pjdc) wrote:

This may be related to LP:1917317.

Haw Loeung (hloeung) on 2021-03-09
Changed in ubuntu-repository-cache:
status: New → In Progress
importance: Undecided → High
assignee: nobody → Haw Loeung (hloeung)
Haw Loeung (hloeung) on 2021-03-12
Changed in ubuntu-repository-cache:
status: In Progress → Fix Committed
Haw Loeung (hloeung) on 2021-03-12
Changed in ubuntu-repository-cache:
status: Fix Committed → Fix Released