[SRU] connection problems under load with hardy dovecot
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
dovecot (Ubuntu) | Fix Released | Medium | Mathias Gug |
Hardy | Fix Released | Medium | Mathias Gug |
Intrepid | Fix Released | Medium | Mathias Gug |
Bug Description
Binary package hint: dovecot
Hi,
We recently upgraded our mail server to a (backported-
dovecot because we wanted SSL certificate chaining support but we had
to revert back to dapper-dovecot the next day because the hardy
version was failing under load. Unfortunately there was no obvious
pattern to the failures and nothing helpful in the server-side logs.
The symptoms on the client side were MUAs timing out or simply not
picking up new mail. Turning on IMAP level debugging on the client
side wasn't useful either. Just before the revert, it got so bad that
even a simple openssl s_client -connect to the dovecot server was
hanging.
Even more unfortunately, I can't easily reproduce this problem. I
can't afford to break our production mail server again, and this
problem wasn't evident in the smaller scale testing we did prior to
the upgrade. I tried battering a test hardy-dovecot instance with
groups of openssl s_client's in a for loop, but that wasn't sufficient
to reproduce the problem.
--
James
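The load test described in the report (groups of openssl s_client connections started in a loop) could be approximated with a small script that opens many TLS connections at once and counts how each attempt ends. This is only a rough sketch, not the reporter's actual test: the function names `probe` and `hammer`, the connection count, and the timeout are all assumptions made for illustration, and certificate verification is switched off on the assumption that the target is a test instance with a self-signed certificate.

```python
import socket
import ssl
import threading
from collections import Counter


def probe(host, port, timeout, results):
    """Open one TLS connection and record how the attempt ended."""
    ctx = ssl.create_default_context()
    # Assumption: a test server with a self-signed certificate,
    # so hostname checking and certificate verification are disabled.
    ctx.check_hostname = False
    ctx.verify_mode = ssl.CERT_NONE
    try:
        with socket.create_connection((host, port), timeout=timeout) as raw:
            with ctx.wrap_socket(raw, server_hostname=host) as tls:
                # Wait for the first bytes from the server (for IMAPS,
                # the greeting sent right after the TLS handshake).
                tls.recv(1024)
        outcome = "ok"
    except socket.timeout:
        # Connect, handshake, or greeting did not finish in time:
        # this is the "hang" symptom from the report.
        outcome = "timeout"
    except OSError:
        # Refused connections, resets, and TLS errors all land here.
        outcome = "error"
    results.append(outcome)


def hammer(host, port, connections=50, timeout=5.0):
    """Start `connections` probes at once and tally the outcomes."""
    results = []
    threads = [
        threading.Thread(target=probe, args=(host, port, timeout, results))
        for _ in range(connections)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return Counter(results)
```

Calling something like `hammer("mail.example.test", 993, connections=200)` against a test instance (the host name here is hypothetical) returns a tally of "ok", "timeout" and "error" outcomes. A growing share of timeouts as the connection count rises would match the hanging s_client symptom, although, as the report notes, this kind of synthetic load may not be enough to trigger the problem.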
Changed in dovecot:
assignee: nobody → mathiaz
Changed in dovecot:
milestone: none → ubuntu-8.04
Changed in dovecot:
milestone: ubuntu-8.04 → ubuntu-8.04.1
Changed in dovecot:
importance: Undecided → Medium
milestone: none → ubuntu-8.04.1
status: New → In Progress
assignee: nobody → mathiaz
Changed in dovecot:
milestone: none → ubuntu-8.04.1
Changed in dovecot:
status: Fix Committed → Confirmed
While using the qa-regression-testing scripts, I saw an SSL connection hang once during a batch of test runs. When it hung, I had to kill -9 the server to get it to release the port. Before seeing this bug, I assumed I had just done something goofy, especially since I never saw the issue again after that first failure.