misbehaving backend services are able to DOS external services

Bug #703873 reported by Tom Haddon
This bug affects 1 person
Affects                    Status    Importance  Assigned to  Milestone
Bazaar Subversion Plugin   Triaged   High        Unassigned
Launchpad CSCVS            Triaged   Low         Unassigned
Launchpad code imports     Triaged   Low         Unassigned

Bug Description

Symptoms
========

Launchpad backend services such as the code importer, bug watchers, and tarball finders are all expected to communicate with external sites at fairly high frequencies.

From time to time we will have bugs in those services that result in an excessive number of requests being made to those sites. When this happens we generally get firewalled, which is unpleasant for everyone.

Solution
========

As part of being good net citizens we need to ensure that when that happens our services are throttled in some fashion. Simple request count throttling won't cover all cases (e.g. asking svn to perform an expensive server-side operation may still be an issue), but it's an improvement over our current catchall (revisions imported in one session are throttled).
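
A minimal sketch of what request count throttling could look like, assuming a per-host token bucket. The class name `HostThrottle`, its parameters, and the host names are illustrative and not taken from any Launchpad code:

```python
import time


class HostThrottle:
    """Token bucket per remote host: roughly `rate` requests/second, bursts up to `burst`."""

    def __init__(self, rate, burst, clock=time.monotonic):
        self.rate = float(rate)
        self.burst = float(burst)
        self.clock = clock  # injectable so the behaviour can be tested without sleeping
        self._buckets = {}  # host -> (tokens available, time of last update)

    def try_acquire(self, host):
        """Return True if a request to `host` may be sent now, False if it should wait."""
        now = self.clock()
        tokens, last = self._buckets.get(host, (self.burst, now))
        # Refill in proportion to elapsed time, capped at the burst size.
        tokens = min(self.burst, tokens + (now - last) * self.rate)
        if tokens >= 1.0:
            self._buckets[host] = (tokens - 1.0, now)
            return True
        self._buckets[host] = (tokens, now)
        return False
```

A caller would check `throttle.try_acquire("svn.example.org")` before each outbound request and back off when it returns False. As noted above, this only counts requests; it says nothing about how expensive each one is for the remote server.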

Implementation
==============

We probably want to use an external catchall - e.g. squid or something - so that many different services can all be throttled in some sensible fashion.

bzr-svn and cscvs also speak svn: and pserver:, which means that we need either a proxy protocol implementation for them or some in-process policy layer (one that doesn't lose state when a specific import worker fails).
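
One way the in-process option could keep its state across worker failures is to store the request log outside the worker's memory. The following is only a sketch under that assumption, using a SQLite file shared by all workers on a machine; `SharedRequestLog` and its parameters are hypothetical names:

```python
import sqlite3
import time


class SharedRequestLog:
    """Sliding-window request counter kept in a SQLite file so it outlives any one worker."""

    def __init__(self, path, max_requests, window_seconds, clock=time.time):
        self.max_requests = max_requests
        self.window = window_seconds
        self.clock = clock
        # isolation_level=None puts the connection in autocommit mode;
        # transactions are started and ended explicitly in try_record().
        self.db = sqlite3.connect(path, timeout=30, isolation_level=None)
        self.db.execute(
            "CREATE TABLE IF NOT EXISTS requests (host TEXT NOT NULL, ts REAL NOT NULL)"
        )
        self.db.execute(
            "CREATE INDEX IF NOT EXISTS requests_host_ts ON requests (host, ts)"
        )

    def try_record(self, host):
        """Record a request to `host` and return True, or return False if over the limit."""
        now = self.clock()
        # BEGIN IMMEDIATE takes the write lock up front, so the
        # check-then-insert below is atomic across worker processes.
        self.db.execute("BEGIN IMMEDIATE")
        try:
            self.db.execute(
                "DELETE FROM requests WHERE host = ? AND ts <= ?",
                (host, now - self.window),
            )
            (count,) = self.db.execute(
                "SELECT COUNT(*) FROM requests WHERE host = ?", (host,)
            ).fetchone()
            allowed = count < self.max_requests
            if allowed:
                self.db.execute(
                    "INSERT INTO requests (host, ts) VALUES (?, ?)", (host, now)
                )
            self.db.execute("COMMIT")
        except Exception:
            self.db.execute("ROLLBACK")
            raise
        return allowed
```

Because the counts live in the file rather than in the process, a worker that crashes and restarts sees the requests its predecessor already made. This only covers workers on one machine; throttling across several import machines would still need a shared service or proxy.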

Tom Haddon (mthaddon)
Changed in launchpad:
importance: Undecided → Critical
tags: added: canonical-losa-lp
description: updated
Robert Collins (lifeless) wrote : Re: no safety net for misbehaved backend services

There is a separate bug (I don't have the number handy, sorry) about the specific behaviour in bzr-svn. The way that DoS occurred was a bzr-import-export triggering a retry of a known-failing import to see if it worked with the updates made to bzr-svn over the last while. (It didn't.)

Our normal protections *did* operate: we didn't run the import too often, and we didn't import too many revisions, but we encountered a failure mode on a different dimension: importing a single revision went berserk. I'm repurposing this bug to cover a more general catchall to protect against unknown bugs like this (was) and other such things.

Changed in launchpad:
importance: Critical → High
status: New → Triaged
summary: - code import servers DoS-ing third party services
+ no safety net for misbehaved backend services
description: updated
Changed in launchpad-cscvs:
status: New → Triaged
importance: Undecided → High
Changed in bzr-svn:
importance: Undecided → High
summary: - no safety net for misbehaved backend services
+ misbehaving backend services are able to DOS external services
Robert Collins (lifeless) wrote :

bzr-import-*expert* - sorry for the confusion.

Martin Pool (mbp) wrote :

I wonder if it would be easy to do this using iptables on an outbound firewall. I think you can fairly easily set a maximum permitted SYN rate per source/destination host pair. If that works, it wouldn't need code changes and it would be hard for outgoing services to accidentally subvert.
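
For illustration, a rule of that kind could use the iptables `hashlimit` match. The sketch below only builds the command line and does not run it; the chain, rate, burst, and bucket name are assumptions picked for the example, not values anyone has proposed:

```python
def syn_rate_limit_rule(chain="FORWARD", max_rate="20/second", burst=40,
                        name="lp_out_syn"):
    """Build (but do not run) an iptables rule that rejects new TCP connections
    above `max_rate` for each source/destination address pair."""
    return [
        "iptables", "-A", chain,
        "-p", "tcp", "--syn",                  # only connection-opening packets
        "-m", "hashlimit",
        "--hashlimit-name", name,              # label for this set of counters
        "--hashlimit-mode", "srcip,dstip",     # one bucket per (source, destination)
        "--hashlimit-above", max_rate,         # match traffic exceeding this rate
        "--hashlimit-burst", str(burst),
        "-j", "REJECT", "--reject-with", "tcp-reset",
    ]
```

Printing `" ".join(syn_rate_limit_rule())` gives a command an operator could review before applying. Like request counting, this limits new connections only; a single long-lived svn connection doing expensive work would pass straight through.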

Jelmer Vernooij (jelmer)
Changed in bzr-svn:
status: New → Triaged
assignee: nobody → Jelmer Vernooij (jelmer)
Jelmer Vernooij (jelmer) wrote :

At the moment the importers already limit the number of revisions they process in each run.

Bazaar has a traffic counter (used for progress reporting mainly) that bzr core, bzr-svn, bzr-git and bzr-hg all feed into. It should be possible to use this to kill the importer if there was too much traffic.
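
A rough sketch of the byte budget idea, independent of Bazaar's actual counter. `TrafficBudget` and `note_traffic` are hypothetical names, and how the hook would be wired into the existing traffic reporting is left open:

```python
class TrafficBudgetExceeded(Exception):
    """Raised when an import has moved more bytes than its budget allows."""


class TrafficBudget:
    """Count bytes transferred by one import and abort it past a fixed limit."""

    def __init__(self, max_bytes):
        self.max_bytes = max_bytes
        self.bytes_seen = 0

    def note_traffic(self, nbytes):
        # Hypothetical hook: call this from wherever transferred bytes are counted.
        self.bytes_seen += nbytes
        if self.bytes_seen > self.max_bytes:
            raise TrafficBudgetExceeded(
                "transferred %d bytes, budget is %d" % (self.bytes_seen, self.max_bytes)
            )
```

Letting the exception propagate to the top of the import worker would end that import without affecting others.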

Similarly, a simple time-based watchdog might also be useful. How CPU vs network intensive some of these operations are varies wildly though, so I'm not sure how useful this would be in practice.
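
A time-based watchdog could be as small as the sketch below, assuming each import runs as a separate child process. `run_with_deadline` is a hypothetical helper, not part of the importer:

```python
import subprocess


def run_with_deadline(argv, max_seconds):
    """Run a worker command under a wall-clock limit.

    Returns the exit code, or None if the worker overran and was killed.
    """
    try:
        # subprocess.run() kills the child and waits for it when the timeout expires.
        return subprocess.run(argv, timeout=max_seconds).returncode
    except subprocess.TimeoutExpired:
        return None
```

Enforcing the limit from the parent means it still fires if the worker is stuck in a tight retry loop. Picking `max_seconds` is the hard part, for the reason given above: a CPU-heavy import and a network-heavy one can take similar wall-clock time.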

Jelmer Vernooij (jelmer)
Changed in bzr-svn:
assignee: Jelmer Vernooij (jelmer) → nobody
Robert Collins (lifeless) wrote :

While this is still relevant to losas, we haven't had an operational issue of this sort for a good long time.

Changed in launchpad-cscvs:
importance: High → Low
Changed in launchpad:
importance: High → Low
Colin Watson (cjwatson)
affects: launchpad → lp-codeimport