Sending blocks requests to all Zope instances

Bug #920823 reported by Dylan Jay
This bug affects 1 person
Affects            | Status | Importance | Assigned to | Milestone
Singing & Dancing  | New    | Undecided  | Unassigned  |

Bug Description

We've experienced this in both Plone 3.1.7 and Plone 4.1.3. A send that takes a while blocks all requests to every instance, resulting in downtime of the entire server.

We're currently looking into whether this is caused by an overly long two-phase commit in the custom queue code. See http://old.nabble.com/%22Transaction-blocked-waiting-for-storage%22-td24148875.html

Daniel Nouri (daniel.nouri) wrote :

According to Thomas who implemented the custom queue code, it was added for two reasons:

"1. the singing code does some checks on queue[-1], which is really expensive with compositequeue. 2. len() is also really expensive so I added a size attr - the conflict resolution is simply to keep size sane."

If it makes instances hang, that obviously sucks, and we should look into getting rid of the custom code if that fixes the problem. In that case the S&D code should also stop using queue[-1] and len().
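
For illustration, the size-attribute idea Thomas describes might look roughly like this in plain Python. `SizedQueue` and `resolve_size` are hypothetical names, not the actual custom queue code, and the merge rule (apply both concurrent changes to the common ancestor value) is only an assumption about what "keep size sane" means; it is the usual approach for conflict-resolving counters.

```python
class SizedQueue:
    """Toy queue (hypothetical) that tracks its size in an attribute,
    so callers can read `size` instead of calling an expensive len()."""

    def __init__(self):
        self._items = []
        self.size = 0

    def put(self, item):
        # Append to the back and keep the cached size in step.
        self._items.append(item)
        self.size += 1

    def pull(self):
        # Remove from the front and keep the cached size in step.
        item = self._items.pop(0)
        self.size -= 1
        return item


def resolve_size(old, saved, new):
    """Three-way merge of a size counter (assumed rule, not the real code).

    `old` is the common ancestor value, `saved` the value already
    committed by another transaction, `new` the value this transaction
    wants to write. Both deltas relative to `old` are applied.
    """
    return saved + new - old
```

With an ancestor size of 10, one transaction adding two items (12) and another pulling one (9), `resolve_size(10, 12, 9)` gives 11.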

tmog (mogensen) wrote :

The custom resolution code is based directly on the implementation in zc.queue. It only adds calculation of the queue size. I just noticed that a few fixes have been made to the zc.queue resolution code since zc.queue-1.1. These should be applied to the custom code as well, and hopefully that will fix this issue.

Dylan Jay (t-launchpad-dylanjay-com) wrote :

I'm not sure it is the queue code. Now I'm thinking it's purely zope.sendmail causing the lockup, as it has custom transaction code.

Daniel Nouri (daniel.nouri) wrote :

It seems a bit unnecessary for us to use zope.sendmail in that transaction-aware mode, considering that emails are sent out in a dedicated thread that is file-locked already. I wonder how big the chance is that someone else has written to the queue and we're _not_ able to commit the transaction; because that case is disastrous...

How often are you running the 'tick_and_dispatch'?

You might want to experiment with having tick_and_dispatch process smaller batches of the queue at a time. Instead of doing the whole queue, it would do e.g. 100 emails at once, maybe every 5 minutes.
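
A rough sketch of what batching could look like, in plain Python rather than the actual S&D code. `dispatch_batch`, the `send` callable and the deque are stand-ins made up for this example, not collective.singing or zope.sendmail API. The idea is that each call handles a bounded number of messages, so whatever transaction wraps it stays short.

```python
from collections import deque
from typing import Callable, Deque


def dispatch_batch(
    queue: Deque[str],
    send: Callable[[str], None],
    batch_size: int = 100,
) -> int:
    """Send at most `batch_size` messages from the front of `queue`.

    Returns the number of messages sent. Anything left in the queue is
    picked up by the next call (e.g. the next scheduled run).
    """
    sent = 0
    while queue and sent < batch_size:
        message = queue.popleft()  # oldest message first
        send(message)
        sent += 1
    return sent
```

With 250 queued messages and the default batch size, three calls send 100, 100 and 50 messages, and a fourth call sends nothing.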

tmog (mogensen) wrote :

Are you using a reasonably new version of zope.sendmail? It has had some problems in the past. A really bad one was using an extreme number of file descriptors prior to 3.4.0.
