Gearman server restart can cause NullPointerException in connected workers

Bug #693477 reported by Omry Yadan on 2010-12-22
Affects: Gearman Java
Importance: Medium
Assigned to: Omry Yadan

Bug Description

I noticed that when the server restarts, connected workers get a NullPointerException and fail to reconnect to the server when it comes back up (admittedly because my work loop got terminated, but still).

Dec 22, 2010 6:56:09 PM org.gearman.worker.GearmanWorkerImpl work
WARNING: Received IOException while driving IO on session GearmanJobServerSession:12:GearmanNIOJobServerConnection:localhost/127.0.0.1:4730
java.io.IOException: Connection to job server severed
 at org.gearman.common.GearmanNIOJobServerConnection.read(GearmanNIOJobServerConnection.java:191)
 at org.gearman.common.GearmanJobServerSession.driveSessionIO(GearmanJobServerSession.java:199)
 at org.gearman.worker.GearmanWorkerImpl.work(GearmanWorkerImpl.java:166)
 at com.face.waxon.WaxOn$1.run(WaxOn.java:106)
Exception in thread "Worker-thread-1" java.lang.NullPointerException
 at org.gearman.common.GearmanJobServerSession.sessionHasDataToWrite(GearmanJobServerSession.java:257)
 at org.gearman.worker.GearmanWorkerImpl.work(GearmanWorkerImpl.java:131)
 at com.face.waxon.WaxOn$1.run(WaxOn.java:106)
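
A worker thread can be shielded from this kind of failure by wrapping the work call in a loop that builds a fresh worker whenever the previous one dies. The following is only a minimal sketch under stated assumptions: `Worker`, `runWithReconnect`, and the factory are hypothetical names made up for illustration and are not part of the gearman-java API.

```java
import java.io.IOException;
import java.util.function.Supplier;

public class ReconnectingWorkLoop {

    /** Hypothetical stand-in for a worker bound to a single server connection. */
    interface Worker {
        /** Blocks until the worker is stopped or its connection fails. */
        void work() throws IOException;
    }

    /**
     * Asks the factory for a new worker each time the previous one fails.
     * Returns the attempt number that ended cleanly, or -1 if every attempt
     * failed or the thread was interrupted while waiting to retry.
     */
    static int runWithReconnect(Supplier<Worker> factory, int maxAttempts, long delayMillis) {
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                factory.get().work();
                return attempt; // work() returned normally: clean shutdown
            } catch (IOException | RuntimeException e) {
                // RuntimeException covers a NullPointerException like the one in the trace
                System.err.println("worker failed (" + e + "), reconnecting");
            }
            try {
                Thread.sleep(delayMillis); // fixed pause before the next attempt
            } catch (InterruptedException ie) {
                Thread.currentThread().interrupt();
                return -1;
            }
        }
        return -1;
    }
}
```

In a setup like the one in the trace, the factory would presumably create a new worker, add the job server, and re-register its functions before each attempt. Catching RuntimeException here only works around the NullPointerException; it does not fix the cause.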

Omry Yadan (omry) wrote :

Eric, I think I have a patch for this problem.
What's the best way to deliver it?
Would you like to have an IM chat about the changes I propose?

Eric Lambert (elambert) wrote :

Hi Omry:

Wow, thanks!

Probably the best way for you to deliver it is to create a branch in Launchpad
and propose it for merge with the gearman-java trunk. If you want to IM
me, you can reach me at <email address hidden> on your favorite
xmpp/jabber/google-talk client.

Cheers!

Eric

On 12/22/2010 10:23 AM, Omry Yadan wrote:
> Eric, I think I have a patch for this problem.
> Whats the best way to deliver it?
> would you like to have an IM chat about the changes I propose?
>

Omry Yadan (omry) wrote :

Created a branch called reconnect-bug, which is a preliminary fix for this.

Eric Lambert (elambert) wrote :

Hi Omry:

Thanks for posting this!

I've pulled down your branch and taken a quick look at it (i.e., made sure it compiled and passed all tests :-) ).
I'll try to take a closer look at it later today. But if I don't get to it by the end of the day, it may be a couple of
days before I have another chance (the holidays are upon us and free time to spend hacking becomes a little difficult to
come by).

Cheers!

Eric

On 12/23/10 12:48 AM, Omry Yadan wrote:
> created a branch called reconnect-bug which is preliminary fix for this.
>
> ** Branch linked: lp:~omry/gearman-java/reconnect-bug
>

Omry Yadan (omry) wrote :

Hey Eric,
no problem - I am already starting to use this. If I bump into other strangeness I'll let you know.

btw: this does not solve the other bug I reported (where the client starts while the server is down).

Eric Lambert (elambert) wrote :

Omry:

I've assigned this bug to you since you've provided the fix. No need to do more than you've done. I have taken a look at your patch, and it looks OK so far. I keep starting to write some tests for this but then get distracted wanting to refactor a lot of the session/connection code :-)

Eric

Changed in gearman-java:
importance: Undecided → Medium
assignee: nobody → Omry Yadan (omry)
Omry Yadan (omry) wrote :

I fixed another important related bug:

If the server restarts while a worker is performing a job, clients will get into a bad state (they will think they are still connected).
The fix is trivial and I have already pushed it into the reconnect-bug branch.

src/org/gearman/common/GearmanNIOJobServerConnection.java:
+                if (isOpen())
+                    close();
                 //TODO do something smarter here
                 throw new IOException("Connection to job server severed");
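
The idea behind the patch can be shown in isolation: when a read reports end-of-stream, close the local end before throwing, so that `isOpen()` stops reporting a live connection. This is a hedged, self-contained sketch and not the actual GearmanNIOJobServerConnection code; the class `SeveredConnectionSketch` and its use of a plain `ReadableByteChannel` are assumptions made for illustration.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.ReadableByteChannel;

public class SeveredConnectionSketch {
    // Hypothetical wrapper; the real connection class manages a socket channel itself.
    private final ReadableByteChannel channel;

    SeveredConnectionSketch(ReadableByteChannel channel) {
        this.channel = channel;
    }

    boolean isOpen() {
        return channel.isOpen();
    }

    void close() throws IOException {
        channel.close();
    }

    /** Reads into buf; on end-of-stream, closes the connection and throws. */
    int read(ByteBuffer buf) throws IOException {
        int n = channel.read(buf);
        if (n < 0) {
            // The peer went away. Close our side first so later isOpen()
            // checks no longer claim the connection is alive.
            if (isOpen()) {
                close();
            }
            throw new IOException("Connection to job server severed");
        }
        return n;
    }
}
```

Without the close, the caller still gets the IOException, but anything that later consults `isOpen()` would see a connection that looks healthy.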

Eric Lambert (elambert) on 2012-05-05
Changed in gearman-java:
status: New → Fix Committed
Eric Lambert (elambert) on 2012-09-25
Changed in gearman-java:
status: Fix Committed → Fix Released