Gearman Jobserver Failover

Bug #585054 reported by Felix Gorodishter
This bug affects 5 people
Affects: Gearman
Status: Fix Released
Importance: Undecided
Assigned to: Clint Byrum

Bug Description

It appears that libgearman currently does not fail over properly to a second job server.

I run two Gearman job servers, and if I take the first one down, jobs simply fail with this error:
   gearman_connection_flush:could not connect

(For what it's worth, I use the Gearman PHP extension, currently version 0.7.0, to interface with Gearman.)

The error appears to come from line 488 of libgearman/connection.c, where connection->addrinfo_next is NULL.
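To make the expected behaviour concrete, here is a minimal sketch of client-side failover across a list of job servers. It is not libgearman code: `job_server`, `connect_fn`, `connect_with_failover`, and `stub_connect` are hypothetical names invented for this illustration, and the stub connector only simulates a server being down.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical server description for this sketch; not a libgearman type. */
typedef struct {
    const char *host;
    int port;
} job_server;

/* Hypothetical connector callback: returns 0 on success, non-zero on failure. */
typedef int (*connect_fn)(const job_server *server);

/*
 * Try each configured server in order and return the index of the first
 * one that accepts a connection, or -1 if every server failed.
 */
int connect_with_failover(const job_server *servers, size_t count,
                          connect_fn try_connect)
{
    for (size_t i = 0; i < count; i++) {
        if (try_connect(&servers[i]) == 0)
            return (int)i;
    }
    return -1;
}

/* Stand-in connector for demonstration: pretends "job1.example" is down. */
int stub_connect(const job_server *server)
{
    return strcmp(server->host, "job1.example") == 0 ? -1 : 0;
}
```

With two servers configured and the first one down, the call should return the index of the second server instead of reporting a connection error.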

description: updated
Clint Byrum (clint-fewbar) wrote :

I've seen this too, and I think the failover logic is slightly off in some instances, but not all.

Changed in gearmand:
status: New → Confirmed
Clint Byrum (clint-fewbar) wrote :

Merge proposal submitted. I think this is due to the connection code checking for POLLOUT before POLLERR, so it wrongly assumes that a socket has connected when poll() returns both POLLOUT and POLLERR in revents.
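The ordering issue described above can be sketched as follows. This is an illustration of the idea only, not the actual patch: `connect_state` and `classify_connect_revents` are hypothetical names, and the sketch assumes a socket with a non-blocking connect() in progress that has just been polled for POLLOUT.

```c
#include <poll.h>

/* Hypothetical result type for this sketch; not a libgearman type. */
typedef enum {
    CONNECT_PENDING,
    CONNECT_FAILED,
    CONNECT_READY
} connect_state;

/*
 * Classify the revents reported by poll() for a socket whose
 * non-blocking connect() is still in progress.
 */
connect_state classify_connect_revents(short revents)
{
    /* Check error conditions first: a failed connect can report
     * POLLOUT together with POLLERR. */
    if (revents & (POLLERR | POLLHUP | POLLNVAL))
        return CONNECT_FAILED;

    /* Writable with no error flags: treat the connect as complete. */
    if (revents & POLLOUT)
        return CONNECT_READY;

    /* Nothing reported yet: keep waiting. */
    return CONNECT_PENDING;
}
```

Testing POLLOUT first would classify the POLLOUT | POLLERR case as connected, which matches the symptom in this report. Real code would typically also call getsockopt() with SO_ERROR to retrieve the specific connect error before moving on to the next address or server.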

Changed in gearmand:
assignee: nobody → Clint Byrum (clint-fewbar)
Brian Aker (brianaker) wrote :

Clint's fix is in trunk.

Changed in gearmand:
status: Confirmed → Fix Committed
Brian Aker (brianaker)
Changed in gearmand:
status: Fix Committed → Fix Released