[Dapper] linux-image-server breaks heartbeat/heartbeat-2
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
linux (Ubuntu) | Invalid | High | Unassigned |
Bug Description
Using Ubuntu Dapper 6.06 LTS with all package updates applied.
Two nodes are running heartbeat in an active/passive setup.
After I reboot one machine, heartbeat is started by init again.
Fine so far.
Many seconds later, heartbeat on the rebooted machine no longer sees its own heartbeat
and restarts itself... and of course it then acquires the resources, even though the other node owns
them.
(heartbeat[4098]: ERROR: No local heartbeat. Forcing restart.)
This is a problem with linux-image-server. If I switch over to linux-image-686,
the problem is gone.
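To check how often a node hits this, the error quoted above can be searched for in the heartbeat log. What follows is only a rough sketch: the function name `find_forced_restarts` and the sample lines are made up for illustration, and the pattern assumes the log lines look like the single line quoted in this report (the log location and prefix format depend on the local syslog/heartbeat configuration).

```python
import re

# Pattern built from the one error line quoted in this report.
# Assumption: other occurrences use the same "heartbeat[PID]: ERROR: ..." layout.
FORCED_RESTART = re.compile(
    r"heartbeat\[(?P<pid>\d+)\]: ERROR: No local heartbeat\. Forcing restart\."
)


def find_forced_restarts(lines):
    """Return the PIDs (as ints) from every line that contains the error."""
    pids = []
    for line in lines:
        match = FORCED_RESTART.search(line)
        if match:
            pids.append(int(match.group("pid")))
    return pids


# The first line is the message from this report; the second is a
# hypothetical non-matching line, included only to show it is ignored.
sample = [
    "heartbeat[4098]: ERROR: No local heartbeat. Forcing restart.",
    "heartbeat[4098]: info: hypothetical unrelated message",
]
print(find_forced_restarts(sample))
```

In practice the lines could come from iterating over an open log file instead of the `sample` list; comparing the count between the -server and -686 kernels would show whether the restarts really stop after the switch.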
Look here if you need debug output from heartbeat.
http://
or http://
I can confirm this bug.
It was causing random lockups on my passive heartbeat server using linux-image-server (2.6.15-51). When the server restarted, it was unable to detect the master heartbeat server (running linux-image-686, same version) and would grab the resources. After a few minutes (seemingly at random, anywhere from 2 to 20 minutes), the heartbeat server would lose the connection to itself and the master would take over again. My servers have been doing this dance all weekend long!
Neither of the boxes does anything but handle heartbeat/ldirectord, and the load is not high enough for the systems to be declaring each other dead.
After seeing that the two servers were running different kernels, I switched the offending server to the 686 version, since it was stable, and the issue has now disappeared.
Log files can be provided if necessary.