Comment 7 for bug 1336309

Charles Butler (lazypower) wrote : Re: [Bug 1336309] Re: Scaling works in 1 direction

Thanks for working on this Amir.

So the heartbeat timeout is the limiting factor here, and making it
configurable lets departed nodes drop out faster for the purposes of
demonstrating scaling? Excellent findings.
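
For what it's worth, here is a rough sketch of how the recheck interval
relates to the time before a silent datanode is marked dead. It assumes
the namenode uses 2 x recheck interval + 10 x heartbeat interval, with
defaults of 5 minutes and 3 seconds; please verify that against the
Hadoop release the charm ships, since property names and defaults differ
between versions. The function names are made up for illustration.

```python
# Sketch only: the formula and defaults below are assumptions about the
# namenode's dead-node detection, not values taken from the charm.

def dead_node_timeout_ms(recheck_interval_ms=300_000, heartbeat_interval_s=3):
    """Time (ms) before a silent datanode is considered dead."""
    return 2 * recheck_interval_ms + 10 * 1000 * heartbeat_interval_s


def recheck_interval_for(target_timeout_ms, heartbeat_interval_s=3):
    """Recheck interval (ms) needed to reach a desired dead-node timeout."""
    remaining = target_timeout_ms - 10 * 1000 * heartbeat_interval_s
    if remaining <= 0:
        raise ValueError("target timeout is too short for this heartbeat interval")
    return remaining // 2


if __name__ == "__main__":
    # Assumed defaults: roughly the 10 minutes mentioned in the thread.
    print(dead_node_timeout_ms() / 60_000, "minutes")
    # Recheck interval to have nodes drop out after about one minute.
    print(recheck_interval_for(60_000), "ms")
```

A Juju config option could feed the second function's result into
hdfs-site.xml, so a demo could use a short timeout while production
keeps the default.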

On Mon, Jul 7, 2014 at 9:13 AM, amir sanjar <email address hidden>
wrote:

> Graceful datanode shutdown, as it was implemented above, is an appropriate
> method of shutting down the datanode and nodemanager to reduce HDFS and
> YARN metadata corruption. However, after lengthy discussion with the HDFS
> community, it will not solve the scale-down issue reported by this bug. As
> of now, there are only two ways that the namenode gets notified of a
> datanode shutdown:
> 1) Shut down the datanode directly from the namenode (mark the node as
> discarded)
> 2) Heartbeat timeout for the datanode. The current default value is 10
> minutes. A workaround would be to make the value of
> "dfs.heartbeat.recheck.interval" configurable by Juju.
>
> --
> You received this bug notification because you are subscribed to the bug
> report.
> https://bugs.launchpad.net/bugs/1336309
>
> Title:
> Scaling works in 1 direction
>
> Status in “hadoop” package in Juju Charms Collection:
> New
>
> Bug description:
> Hadoop reconfigures properly for scale-up operations.
>
> When scaling down, the administrative interface still shows the
> maximum number of nodes registered, e.g. scale up to 4 and it displays 4;
> scale back down to 2 and it still shows 4.
>
> We need to handle the cluster reconfiguration in the -broken /
> -departed hooks to properly scale hadoop back down.