Comment 1 for bug 1361782

Ben Shum (bshum) wrote :

Funnily enough, this happened to us too, about two weeks ago on our production systems (master as of roughly the 2.6.1 era). Same symptoms: a library ended up making the same search page request 1000+ times, which ate up all the Apache workers on our bricks. After that, all the other libraries were getting errors or dead pages.

We blocked that library's PC by IP, and then we found out they were still using Windows XP and kindly "suggested" that they upgrade at least to Windows 7 before we would allow them back onto Evergreen. We do not believe Windows XP was the real problem, but it was a good excuse at the time to get them to upgrade.

I've seen the same effect from other causes too, like the time we "load tested" production by pointing a small script at it that requested the library home page 2000 times, which overloaded all the workers.

I've wondered whether this is also something we can mitigate with better Apache configuration practices, such as adding a reasonable rate limit on requests from the same IP address, so that we don't burn all our Apache resources on any one person or bot.
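
To make the per-IP rate limiting idea concrete, here is a minimal sketch of a token bucket keyed by client IP. It is not Apache configuration and not anything from Evergreen; the class name, the method name, and the capacity and refill numbers are all made up for illustration. It only shows the logic a limiter would need: each IP gets a small burst allowance that refills over time, and requests beyond it are refused instead of tying up a worker.

```python
import time
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class TokenBucket:
    tokens: float       # requests this IP may still make right now
    last_refill: float  # timestamp of the last refill, in seconds


class PerIpRateLimiter:
    """Hypothetical per-IP token bucket; all names and defaults are illustrative."""

    def __init__(self, capacity: int = 20, refill_rate: float = 5.0) -> None:
        self.capacity = float(capacity)  # maximum burst size per IP
        self.refill_rate = refill_rate   # tokens added per second
        self._buckets: Dict[str, TokenBucket] = {}

    def allow(self, ip: str, now: Optional[float] = None) -> bool:
        """Return True if the request from `ip` should be served."""
        if now is None:
            now = time.monotonic()

        bucket = self._buckets.get(ip)
        if bucket is None:
            # First request from this IP: start with a full bucket.
            bucket = TokenBucket(tokens=self.capacity, last_refill=now)
            self._buckets[ip] = bucket

        # Refill in proportion to elapsed time, never above capacity.
        elapsed = max(0.0, now - bucket.last_refill)
        bucket.tokens = min(self.capacity, bucket.tokens + elapsed * self.refill_rate)
        bucket.last_refill = now

        if bucket.tokens >= 1.0:
            bucket.tokens -= 1.0
            return True
        return False
```

With something like this in front of the workers, the PC that sent 1000+ identical search requests would have been refused after its burst allowance ran out, while requests from other IPs would still be served. A real deployment would also need to expire idle buckets and account for many patrons sharing one library IP, which this sketch ignores.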

That said, if there's an Evergreen-related issue, we should find that too...