WORK_EXCEPTION never forwarded to client

Bug #405732 reported by Kim Altintop
6
This bug affects 1 person
Affects Status Importance Assigned to Milestone
Gearman
Invalid
Undecided
Eric Day

Bug Description

Setting the "exceptions" option does not have any effect -- worker exceptions are never forwarded to the client. We figured it's a threading bug (server_client->con->options points to empty address) and just disabled the check so the server forwards exceptions by default. We've tried with (and patched) versions 0.7, 0.8 and 0.9.

Please have a look at http://github.com/xing/gearman-server/commit/8ae1ee84a1e812dd5f6f526ec22e322c9ded4438
Of course, this is just a workaround but illustrates where things go wrong.

Revision history for this message
Eric Day (eday) wrote :

Hi! This is actually not a bug, this is the desired behavior. In order for clients to get exceptions, the client must send a OPTION_REQ packet with the "exceptions" option. This is currently deprecated and provided by the job server only for backwards compatibility with the Perl server. If you would like to send and receive exceptions, consider using a combination of WORK_WARNING and WORK_FAIL messages, since these will be a bit more portable. The protocol document describes this behavior:

http://gearman.org/index.php?id=protocol

Changed in gearmand:
status: New → Invalid
assignee: nobody → Eric Day (eday)
Revision history for this message
Kim Altintop (kim-altintop-deactivatedaccount) wrote :

Hi Eric,

actually, the observed behavior (not propagating the exceptions) is *after* having sent OPTION_REQ with "exceptions". Also, the protocol document doesn't state any deprecation note...

The WORK_WARNING/WORK_FAIL combination feels a bit awkward -- could you explain a bit the reasoning behind deprecating WORK_EXCEPTION?

Revision history for this message
Eric Day (eday) wrote :

Hi Kim,

I'm seeing the exception objects being sent back to the client. For example, with the latest Perl API from the Danga SVN server:

use Gearman::Client;
my $client = Gearman::Client->new(('exceptions' => true));
$client->job_servers('127.0.0.1:4730');
$ref= $client->do_task('reverse', 'test');
print "$$ref\n";

 INFO Accepted connection from ::7042:6300:0:0%3358815836:38061
 INFO [ 0] ::7042:6300:0:0%3358815836:38061 Connected
DEBUG [ 0] ::7042:6300:0:0%3358815836:38061 Received OPTION_REQ
DEBUG [ 0] ::7042:6300:0:0%3358815836:38061 Sent OPTION_RES
DEBUG [ 0] ::7042:6300:0:0%3358815836:38061 Received SUBMIT_JOB
DEBUG [ 0] ::7042:6300:0:0%3358815836:38061 Sent JOB_CREATED
DEBUG [ 0] ::7042:6300:0:0%3358815836:38040 Sent NOOP
DEBUG [ 0] ::7042:6300:0:0%3358815836:38040 Received GRAB_JOB
DEBUG [ 0] ::7042:6300:0:0%3358815836:38040 Sent JOB_ASSIGN
DEBUG [ 0] ::7042:6300:0:0%3358815836:38040 Received WORK_EXCEPTION
DEBUG [ 0] ::7042:6300:0:0%3358815836:38040 Received WORK_FAIL
DEBUG [ 0] ::7042:6300:0:0%3358815836:38061 Sent WORK_EXCEPTION
DEBUG [ 0] ::7042:6300:0:0%3358815836:38061 Sent WORK_FAIL
...
 INFO [ 0] ::7042:6300:0:0%3358815836:38061 Disconnected

As you can see, the C job server both receives and forwards on the WORK_EXCEPTION packet. You can confirm this with tcpdump.

The WORK_EXCEPTION packet was a bit of a hack that was added in specifically for Perl originally, which means it only sends serialized Perl exception objects. This obviously was not that portable, and would just lead to confusion in heterogeneous environments. I did not not this as deprecated in the protocol document, but lately as people have asked me about it, I tend to tell people to shy away from it. I could be wrong though. :)

One option is to re-purpose the command so that it can be used in a heterogeneous environment. Basically, don't let the API do any object serialization, and instead let the developer package and send exceptions back. I would also like to remove the need to OPTION_REQ, since this is an extra round trip just to let the job server know to forward the packets. Instead, I think it makes more sense to always forward the packets on and the API can ignore packets it doesn't understand (drop the EXCEPTION packet, but still terminate the job with a FAIL packet). Thoughts?

We should probably propose such changes to the mailing list to get feedback from a wider audience.

-Eric

Revision history for this message
Kim Altintop (kim-altintop-deactivatedaccount) wrote :

Hi Eric,

thanks for the feedback. Seems as we've completely missed the fact that a WORK_FAIL is expected after a WORK_EXCEPTION and just implemented it like a fail carrying an exception message (string!) for the ruby lib [*]. Your proposal seems more robust and compatible to me now, so I'll change the ruby lib to send the sequence WARNING (with exception message) + FAIL in case of an exception in the worker.

It would, however, be nice to see this implemented for Perl and Python as well (where Ruby and Python can't even handle a WARNING atm). Would the mailing list be the right place to discuss this? I know Dennis, but not the guy maintaining the Python API on github...

:kim

[*] Actually, this would've led to odd behavior with real "legacy" workers as we've copied the re-submit functionality from the original on_fail handler -- with the two packets in sequence, we would've rescheduled the job twice :/

Revision history for this message
Eric Day (eday) wrote :

Hi Kim,

The mailing list is certainly the best place to bring up questions to add support for other APIs. Also, there is always the option of just adding it to the API you use and submit the patch. :) I'd love to see all the APIs support the WARNING and DATA packets.

With Python, there is the current pure python API, but we now just about have a new API that wraps the C library and a twisted API. Since all new protocol support is being added to the C library now, the version that wraps this library is probably a good bet (like Dennis' Perl wrapper).

-Eric

Revision history for this message
Kim Altintop (kim-altintop-deactivatedaccount) wrote :

Hi Eric,

I've pushed WORK_WARNING support, automagic WORK_WARNING/WORK_FAIL if worker throws an exception and WORK_DATA support to our ruby fork at http://github.com/xing/gearman-ruby

While a ruby extension (C-wrapper) would probably nice in terms of catching up with the "reference API", we are not planning to move away from the pure ruby implementation for now, as there seems to be no real advantage worth the effort (e.g. performance). We'll see how things work out in our heterogeneous environment and push our tweaks (if any) to the public. I'll post any further discussion to the mailing list.

:kim

ps: I'm still convinced that the original report descibes an actual bug, though it may be an OS X issue. However, I'd be fine with a "Won't fix" resolution ;)

Revision history for this message
Eric Day (eday) wrote :

Hi Kim,

Awesome, be sure to put the Ruby API details on the gearman.org wiki (just create account/login and edit the download page). The Ruby ext that wraps the C lib is sort of "free" because we're using SWIG (get a few other dynamic languages too). It will be there if you want it, but I definitely see the value in having native language APIs too (no dependencies for cross-platform). The more options the better!

As for this original bug, could you reproduce this and dump the gearmand verbose output like I did in the previous comment? I'd like to see that the packets being sent/received when you run into the bug. I've tested this on a few platforms with the Perl exception code and have not seen an issue.

Thanks!
-Eric

To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.