Errors are only raised on channel close

Bug #956132 reported by Aurélien Bompard
6
This bug affects 1 person
Affects Status Importance Assigned to Milestone
txAMQP
Fix Released
Undecided
Unassigned

Bug Description

When I try to publish to a non-existant exchange, the error is silentely ignored until I close the channel.

Here's a sample code to demonstrate the bug : http://pastebin.com/MNvi9uDG

When run, you'll see it send a message to a non-existent exchange, wait 3 seconds, and close the channel. The traceback will only appear when the channel is closed.

I think the proper behavior would be to raise the exception on the basic_publish call itself.

Revision history for this message
Aurélien Bompard (aurelien-bompard) wrote :

OK, after investigating, it looks like I get the exception on channel_close() because the channel was already closed, due to the send error. The exception is raised with the last error message.

This is rather surprising from a user point of view : my basic_publish() actually fails but the deferred's callback is called, and the channel is silently closed. I think it would be more logical to get and errback on the basic_publish deferred.

Or am I doing something wrong here ?

Revision history for this message
Aurélien Bompard (aurelien-bompard) wrote :

According to the spec, the basic_publish() call is supposed to succeed every time. In case of a missing exchange, the servers closes the channel, which is what happens here. So txAMQP's behavior seems correct.

However, I can't find any way to be notified of such an error. I've attached a patch which adds a channelFailed() callback to the AMQClient class, which can then be subclassed and extended as needed. Please have a look. It changes the way AMQChannels are constructed to add a reference to AMQClient, and it changes the way channels are closed to differentiate between a wanted close and an unexpected close.

It works fine here, but please tell me if it should be reworked, of if it's just plain unacceptable :-)

It's not as nice API-wise as getting an errback on the basic_publish deferred, but I can't find a way to do that.

Revision history for this message
Esteve Fernandez (esteve) wrote :

It's a tricky problem, and as you say, it's not as nice as getting an errback from basic_publish, but the spec is pretty clear about that :-/ Your solutions works, but what do you think of using the self.client.started event to watch for errors?

In src/txamqp/client.py:109 there's this:

        self.client.started.fail_if_not_fired(Closed(reason))

maybe we could use this instead:

        self.client.started.fail(Closed(reason))

Revision history for this message
Aurélien Bompard (aurelien-bompard) wrote :

Hmm, I'm not sure that would be a good solution. The started deferred is waited on by AMQClient.start method, so it must be fired at some point on startup. And when it's fired, it can't be turned into an errback.
Since the basic_publish call can fail any time, I don't think using a "started" deferred would be a good way (looks like hijacking to me ;-)). Actually, from my point of view it really looks like an event that happens in the system and should be handled by a callback. What to do with this callback, however, is pretty unknown to me. In my application I've chosen to completely disconnect and have a ReconnectingClientFactory reconnect for me, but that's just my use case. Other people may prefer opening another channel, or link the "reason" message to the actual sent message, or whatever.
That's why I'm proposing a non-implemented callback.

Revision history for this message
Jurdanas Kriauciunas (jurdanas-kriauciunas) wrote :

I am not sure if this will help but it is possible to check for connection after publishing message by declaring predefined exchange with passive option set.
If publishing failed then declaring exchange afterwards will fail with connection closed exception.
AMQP standard (at least 0.9.1) allows this and overhead is minimum.

Revision history for this message
Aurélien Bompard (aurelien-bompard) wrote :

IIRC, after an exception any operation on the channel will fail, since the server considers it closed.
If you don't send too many messages, doing a no-op call like declaring an existing exchange can indeed be a good way to check.
Unfortunately, that's not my use case, my application sends several hundreds of messages per second, so checking each one in this manner would cause too much overhead.
But thanks for helping.

Revision history for this message
Esteve Fernandez (esteve) wrote :

I just pushed a commit with Aurélien's change, thanks!

Changed in txamqp:
status: New → Fix Committed
Changed in txamqp:
status: Fix Committed → Fix Released
To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.