Gearman server restart can cause NullPointerException in connected workers

Bug #693477 reported by Omry Yadan
This bug affects 1 person
Affects: Gearman Java
Status: Fix Released
Importance: Medium
Assigned to: Omry Yadan

Bug Description

I noticed that when the server restarts, connected workers get a NullPointerException and fail to reconnect when the server comes back up (admittedly because my work loop was terminated, but still).

Dec 22, 2010 6:56:09 PM org.gearman.worker.GearmanWorkerImpl work
WARNING: Received IOException while driving IO on session GearmanJobServerSession:12:GearmanNIOJobServerConnection:localhost/127.0.0.1:4730
java.io.IOException: Connection to job server severed
 at org.gearman.common.GearmanNIOJobServerConnection.read(GearmanNIOJobServerConnection.java:191)
 at org.gearman.common.GearmanJobServerSession.driveSessionIO(GearmanJobServerSession.java:199)
 at org.gearman.worker.GearmanWorkerImpl.work(GearmanWorkerImpl.java:166)
 at com.face.waxon.WaxOn$1.run(WaxOn.java:106)
Exception in thread "Worker-thread-1" java.lang.NullPointerException
 at org.gearman.common.GearmanJobServerSession.sessionHasDataToWrite(GearmanJobServerSession.java:257)
 at org.gearman.worker.GearmanWorkerImpl.work(GearmanWorkerImpl.java:131)
 at com.face.waxon.WaxOn$1.run(WaxOn.java:106)
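
For anyone hitting the same trace before picking up a fixed build: the NullPointerException escapes work() and kills the worker thread, so nothing is left to reconnect. A possible stopgap is to supervise the loop from the application side. The sketch below is only an illustration under assumptions: RetryingLoop and runWithRetry are hypothetical names, not part of gearman-java, and the loop body stands in for whatever code builds a fresh worker and calls work().

```java
import java.util.concurrent.Callable;

// Hypothetical helper, not part of gearman-java: reruns a work loop when it dies.
final class RetryingLoop {

    /**
     * Calls loopBody until it returns normally, the thread is interrupted,
     * or maxAttempts is reached. Returns the number of attempts made.
     */
    static int runWithRetry(Callable<Void> loopBody, int maxAttempts, long backoffMillis) {
        int attempts = 0;
        while (attempts < maxAttempts) {
            attempts++;
            try {
                loopBody.call();   // stand-in for "create worker, add server, call work()"
                return attempts;   // loop body finished normally
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return attempts;
            } catch (Exception e) {
                // Covers both the IOException and the NullPointerException seen in the trace.
                System.err.println("work loop died (" + e + "), will retry");
            }
            if (attempts < maxAttempts) {
                try {
                    Thread.sleep(backoffMillis);   // give the server time to come back
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return attempts;
                }
            }
        }
        return attempts;
    }
}
```

This only papers over the symptom from the outside; the session and connection state still has to be rebuilt inside the loop body on every attempt.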

Omry Yadan (omry) wrote :

Eric, I think I have a patch for this problem.
What's the best way to deliver it?
Would you like to have an IM chat about the changes I propose?

Eric Lambert (elambert) wrote : Re: [Bug 693477] Re: Gearman server restart can cause NullPointerException in connected workers

Hi Omry:

Wow, thanks!

Probably the best way for you to deliver it is to create a branch in Launchpad
and propose it for merge with the gearman-java trunk. If you want to IM
me, you can reach me at <email address hidden> on your favorite
xmpp/jabber/google-talk client.

Cheers!

Eric

On 12/22/2010 10:23 AM, Omry Yadan wrote:
> Eric, I think I have a patch for this problem.
> Whats the best way to deliver it?
> would you like to have an IM chat about the changes I propose?
>

Omry Yadan (omry) wrote :

I created a branch called reconnect-bug, which is a preliminary fix for this.

Eric Lambert (elambert) wrote :

Hi Omry:

Thanks for posting this!

I've pulled down your branch and taken a quick look at it (aka made sure it compiled and passed all tests :-) ).
I'll try to take a closer look at it later on today. But if I don't get to it by the end of the day, it may be a couple of
days before I have another chance (the holidays are upon us and free time to spend hacking becomes a little difficult to
come by).

Cheers!

Eric

On 12/23/10 12:48 AM, Omry Yadan wrote:
> created a branch called reconnect-bug which is preliminary fix for this.
>
> ** Branch linked: lp:~omry/gearman-java/reconnect-bug
>

Omry Yadan (omry) wrote :

Hey Eric,
No problem - I am already starting to use this. If I bump into other strangeness I'll let you know.

By the way, this does not solve the other bug I reported (where the client starts while the server is down).

Eric Lambert (elambert) wrote :

Omry:

I've assigned this bug to you since you've provided the fix. No need to do more than you've done. I have taken a look at your patch, and it looks OK so far. I keep starting to write some tests for this but then get distracted wanting to refactor a lot of the session/connection code :-)

Eric

Changed in gearman-java:
importance: Undecided → Medium
assignee: nobody → Omry Yadan (omry)

Omry Yadan (omry) wrote :

I fixed another important, related bug:

If the server restarts while a worker is performing a job, clients get into a bad state (they think they are still connected).
The fix is trivial and I have already pushed it into the reconnect-bug branch.

src/org/gearman/common/GearmanNIOJobServerConnection.java:
+                if (isOpen())
+                    close();
                 //TODO do something smarter here
                 throw new IOException("Connection to job server severed");
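
A self-contained sketch of the pattern in this patch, for readers without the source at hand: when a read hits end-of-stream, close the connection before throwing, so that a later isOpen() check reports the truth. SketchConnection below is a hypothetical stand-in wrapping a plain ReadableByteChannel. It is not the real GearmanNIOJobServerConnection, and the surrounding read logic in gearman-java may well differ.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.ReadableByteChannel;

// Hypothetical stand-in for a job server connection; not the gearman-java class.
final class SketchConnection {
    private final ReadableByteChannel channel;

    SketchConnection(ReadableByteChannel channel) {
        this.channel = channel;
    }

    boolean isOpen() {
        return channel.isOpen();
    }

    void close() throws IOException {
        channel.close();
    }

    /** Reads into buf. On end-of-stream, closes first so isOpen() turns false, then throws. */
    int read(ByteBuffer buf) throws IOException {
        int n = channel.read(buf);
        if (n < 0) {
            // The peer went away: mark ourselves closed before reporting the error,
            // otherwise callers that only check isOpen() still believe they are connected.
            if (isOpen()) {
                close();
            }
            throw new IOException("Connection to job server severed");
        }
        return n;
    }
}
```

Feeding it an empty stream, for example java.nio.channels.Channels.newChannel(new java.io.ByteArrayInputStream(new byte[0])), makes read() throw the IOException, and isOpen() returns false afterwards.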

Eric Lambert (elambert)
Changed in gearman-java:
status: New → Fix Committed
Eric Lambert (elambert)
Changed in gearman-java:
status: Fix Committed → Fix Released