Critical bug in 3.4.5 can cause a quorum to not come up after leader loss

Bug #1563977 reported by Ivan Kelly
This bug affects 3 people
Affects             Status   Importance  Assigned to  Milestone
zookeeper (Ubuntu)  Expired  Undecided   Unassigned

Bug Description

The bug in question
https://issues.apache.org/jira/browse/ZOOKEEPER-1697

This was seen on Ubuntu 14.04 LTS, which ships ZooKeeper 3.4.5. When a snapshot gets large enough, a cluster will not come up, because it times out while reading the snapshot from a newly elected leader.
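To make the size dependence concrete, here is a rough back-of-envelope sketch in Python. It assumes the transfer is bounded by a tick-based limit from zoo.cfg (tickTime multiplied by a limit such as initLimit or syncLimit); which limit the affected code path actually applies is covered in the upstream issue, and every number and function name below is hypothetical.

```python
def snapshot_transfer_seconds(snapshot_bytes: int, bytes_per_second: float) -> float:
    """Estimate how long streaming a snapshot takes at a steady throughput."""
    if bytes_per_second <= 0:
        raise ValueError("bytes_per_second must be positive")
    return snapshot_bytes / bytes_per_second


def tick_budget_seconds(tick_time_ms: int, limit_ticks: int) -> float:
    """Time allowed by a tick-based limit: tickTime (ms) times the limit (ticks)."""
    return tick_time_ms * limit_ticks / 1000.0


def fits_in_budget(snapshot_bytes: int, bytes_per_second: float,
                   tick_time_ms: int, limit_ticks: int) -> bool:
    """True if the estimated transfer finishes within the tick budget."""
    transfer = snapshot_transfer_seconds(snapshot_bytes, bytes_per_second)
    return transfer <= tick_budget_seconds(tick_time_ms, limit_ticks)


if __name__ == "__main__":
    MIB = 1024 ** 2
    GIB = 1024 ** 3
    # Hypothetical values: 2 GiB snapshot, 50 MiB/s link, tickTime=2000, limit=10.
    print(snapshot_transfer_seconds(2 * GIB, 50 * MIB))  # estimated transfer time
    print(tick_budget_seconds(2000, 10))                 # allowed window
    print(fits_in_budget(2 * GIB, 50 * MIB, 2000, 10))   # does it fit?
```

With these made-up numbers the transfer estimate exceeds the window, which is the shape of the failure described above: once the snapshot passes some size, every attempt to sync from the new leader runs out of time.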

This is critical because ZooKeeper is the basis for many HA systems; if it cannot survive a leader crash, it becomes useless.

Launchpad Janitor (janitor) wrote :

Status changed to 'Confirmed' because the bug affects multiple users.

Changed in zookeeper (Ubuntu):
status: New → Confirmed
Joshua Powers (powersj) wrote :

Hi,

I am cleaning out the zookeeper bugs to see if any are still valid. Trusty has reached its end of life, and I am not sure there is any further work to do there. As a result, I am marking this incomplete. It is unfortunate that this bug did not receive attention earlier, but it does appear to be fixed in versions 3.4.6 and 3.5.0. Therefore, Xenial and newer releases should be good to go.

Thanks!

Changed in zookeeper (Ubuntu):
status: Confirmed → Incomplete
Launchpad Janitor (janitor) wrote :

[Expired for zookeeper (Ubuntu) because there has been no activity for 60 days.]

Changed in zookeeper (Ubuntu):
status: Incomplete → Expired