Activity log for bug #1826297

Date Who What changed Old value New value Message
2019-04-25 01:26:32 james beedy bug added bug
2019-04-25 01:48:08 james beedy description
  Old value: When multiple units try and pull a resource(s) at once everything seems to lock up, some units are able to get the resource, and some fail pulling it from the controller. I have worked around this in my spark charm to some degree by putting units that can't get the resource in a blocked state and have them naturally retry again when its their time. This ends up working itself out and all of my units end up eventually getting the resource, but its for sure an extreme hack. This can be reproduced by running the following command: juju deploy cs:~omnivector/spark --constraints "instance-type=t3.medium" -n 10 Exhibited in this juju show here https://youtu.be/lirfA5a9Xik?t=1351
  New value: When multiple units try and pull a resource(s) at once everything seems to lock up, some units are able to get the resource, and some fail pulling it from the controller. I have worked around this in my spark charm to some degree by putting units that can't get the resource in a blocked state and have them naturally retry again when its their time. This ends up working itself out e.g. all of my units end up eventually getting the resource, but its for sure an extreme hack. This can be reproduced by running the following command: juju deploy cs:~omnivector/spark --constraints "instance-type=t3.medium" -n 10 Exhibited in this juju show here https://youtu.be/lirfA5a9Xik?t=1351
2019-04-25 02:08:38 james beedy description
  Old value: When multiple units try and pull a resource(s) at once everything seems to lock up, some units are able to get the resource, and some fail pulling it from the controller. I have worked around this in my spark charm to some degree by putting units that can't get the resource in a blocked state and have them naturally retry again when its their time. This ends up working itself out e.g. all of my units end up eventually getting the resource, but its for sure an extreme hack. This can be reproduced by running the following command: juju deploy cs:~omnivector/spark --constraints "instance-type=t3.medium" -n 10 Exhibited in this juju show here https://youtu.be/lirfA5a9Xik?t=1351
  New value: When multiple units try and pull a resource(s) at once everything seems to lock up, some units are able to get the resource, and some fail pulling it from the controller. I have worked around this in my spark charm to some degree by putting units that can't get the resource in a blocked state and have them naturally retry again when its their time. This ends up working itself out e.g. all of my units end up eventually getting the resource, but its for sure an extreme hack. This can be reproduced by running the following command: juju deploy cs:~omnivector/spark --constraints "instance-type=t3.medium" -n 10 Exhibited in this juju show here https://youtu.be/lirfA5a9Xik?t=1351 The charm code that accounts for the block and return demented spin lock mechanism is here https://github.com/omnivector-solutions/layer-spark-base/blob/master/lib/charms/layer/spark_base.py#L27,L30 and here https://github.com/omnivector-solutions/layer-spark-base/blob/master/reactive/spark_base.py#L76,L81 similarly for layer-hadoop-base, https://github.com/omnivector-solutions/layer-hadoop-base/blob/master/lib/charms/layer/hadoop_base.py#L25,L28 and https://github.com/omnivector-solutions/layer-hadoop-base/blob/master/lib/charms/layer/hadoop_base.py#L24,L28
2019-04-25 02:09:38 james beedy description
  Old value: When multiple units try and pull a resource(s) at once everything seems to lock up, some units are able to get the resource, and some fail pulling it from the controller. I have worked around this in my spark charm to some degree by putting units that can't get the resource in a blocked state and have them naturally retry again when its their time. This ends up working itself out e.g. all of my units end up eventually getting the resource, but its for sure an extreme hack. This can be reproduced by running the following command: juju deploy cs:~omnivector/spark --constraints "instance-type=t3.medium" -n 10 Exhibited in this juju show here https://youtu.be/lirfA5a9Xik?t=1351 The charm code that accounts for the block and return demented spin lock mechanism is here https://github.com/omnivector-solutions/layer-spark-base/blob/master/lib/charms/layer/spark_base.py#L27,L30 and here https://github.com/omnivector-solutions/layer-spark-base/blob/master/reactive/spark_base.py#L76,L81 similarly for layer-hadoop-base, https://github.com/omnivector-solutions/layer-hadoop-base/blob/master/lib/charms/layer/hadoop_base.py#L25,L28 and https://github.com/omnivector-solutions/layer-hadoop-base/blob/master/lib/charms/layer/hadoop_base.py#L24,L28
  New value: When multiple units try and pull a resource(s) at once everything seems to lock up, some units are able to get the resource, and some fail pulling it from the controller. I have worked around this in my spark charm to some degree by putting units that can't get the resource in a blocked state and have them naturally retry again when its their time. This ends up working itself out e.g. all of my units end up eventually getting the resource, but its for sure an extreme hack. This can be reproduced by running the following command: juju deploy cs:~omnivector/spark --constraints "instance-type=t3.medium" -n 10 Exhibited in this juju show here https://youtu.be/lirfA5a9Xik?t=1351 The charm code that accounts for this demented block and return spin lock mechanism is here https://github.com/omnivector-solutions/layer-spark-base/blob/master/lib/charms/layer/spark_base.py#L27,L30 and here https://github.com/omnivector-solutions/layer-spark-base/blob/master/reactive/spark_base.py#L76,L81 similarly for layer-hadoop-base, https://github.com/omnivector-solutions/layer-hadoop-base/blob/master/lib/charms/layer/hadoop_base.py#L25,L28 and https://github.com/omnivector-solutions/layer-hadoop-base/blob/master/lib/charms/layer/hadoop_base.py#L24,L28
2019-04-29 20:32:01 Richard Harding juju: status New Triaged
2019-04-29 20:32:17 Richard Harding juju: importance Undecided High
2022-11-03 15:23:25 Canonical Juju QA Bot juju: importance High Low
2022-11-03 15:23:26 Canonical Juju QA Bot tags expirebugs-bot
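
Illustration of the retry workaround described in the bug: the sketch below is not the code from the linked charm layers. It is a minimal, assumed example of retrying the `resource-get` hook tool with jittered backoff, so that units which fail to pull a resource do not all retry at the same moment. The function name `fetch_resource`, its parameters, and the resource name `spark` are hypothetical. The reporter's charm instead sets the unit to a blocked state and retries on a later hook run, which avoids sleeping inside a hook.

```python
import random
import subprocess
import time


def fetch_resource(name, attempts=5, base_delay=2.0,
                   run=subprocess.run, sleep=time.sleep):
    """Call the `resource-get` hook tool, retrying on failure.

    Returns the local path printed by `resource-get`, or None if
    every attempt fails. `run` and `sleep` are injectable so the
    function can be exercised outside a Juju hook context.
    """
    for attempt in range(attempts):
        try:
            result = run(["resource-get", name],
                         capture_output=True, text=True, check=True)
            path = result.stdout.strip()
            if path:
                return path
        except (subprocess.CalledProcessError, OSError):
            # Non-zero exit (e.g. the controller could not serve the
            # resource) or the tool is missing from PATH.
            pass
        if attempt < attempts - 1:
            # Exponential backoff with full jitter: spread retries of
            # many units over time instead of retrying in lockstep.
            delay = base_delay * (2 ** attempt)
            sleep(random.uniform(0, delay))
    return None
```

Inside a hook, a call such as `fetch_resource("spark")` would return the resource path or `None`; on `None` the charm could still fall back to reporting a blocked status, as the reporter's workaround does.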