Bug #1737776 “HAproxy is restarted every 5-10 minutes when using...” : Bugs : Gnocchi Charm

Revision history for this message

Tytus Kurek (tkurek) wrote on 2017-12-12:

#1

logs.tar.gz Edit (254.6 KiB, application/x-tar)

Revision history for this message

Tytus Kurek (tkurek) wrote on 2017-12-12:

#2

This issue was originally reported in bug 1698814 and is claimed to be fixed in designate charm 17.08 release.

Revision history for this message

Tytus Kurek (tkurek) wrote on 2017-12-18:

#3

Perhaps, the "ha-relation-changed" hook should not be run at all. I've just captured the following Juju logs from another environment where the issue is also present:

# cat /var/log/juju/logsink.log | grep "got a relation units change" | grep unit-designate-hacluster-63
a5178681-c521-4519-8826-cfeed5baab69: unit-designate-hacluster-63 2017-12-18 11:22:31 DEBUG juju.worker.uniter.remotestate watcher.go:416 got a relation units change: {507 {map[designate/9:{3833}] []}}
a5178681-c521-4519-8826-cfeed5baab69: unit-designate-hacluster-63 2017-12-18 11:28:16 DEBUG juju.worker.uniter.remotestate watcher.go:416 got a relation units change: {507 {map[designate/9:{3834}] []}}
a5178681-c521-4519-8826-cfeed5baab69: unit-designate-hacluster-63 2017-12-18 11:38:38 DEBUG juju.worker.uniter.remotestate watcher.go:416 got a relation units change: {507 {map[designate/9:{3835}] []}}
a5178681-c521-4519-8826-cfeed5baab69: unit-designate-hacluster-63 2017-12-18 11:44:19 DEBUG juju.worker.uniter.remotestate watcher.go:416 got a relation units change: {507 {map[designate/9:{3836}] []}}
a5178681-c521-4519-8826-cfeed5baab69: unit-designate-hacluster-63 2017-12-18 11:59:35 DEBUG juju.worker.uniter.remotestate watcher.go:416 got a relation units change: {507 {map[designate/9:{3837}] []}}
a5178681-c521-4519-8826-cfeed5baab69: unit-designate-hacluster-63 2017-12-18 12:16:12 DEBUG juju.worker.uniter.remotestate watcher.go:416 got a relation units change: {507 {map[designate/9:{3838}] []}}
a5178681-c521-4519-8826-cfeed5baab69: unit-designate-hacluster-63 2017-12-18 12:18:37 DEBUG juju.worker.uniter.remotestate watcher.go:416 got a relation units change: {507 {map[designate/9:{3839}] []}}
a5178681-c521-4519-8826-cfeed5baab69: unit-designate-hacluster-63 2017-12-18 12:23:23 DEBUG juju.worker.uniter.remotestate watcher.go:416 got a relation units change: {507 {map[designate/9:{3840}] []}}

Revision history for this message

Liam Young (gnuoy) wrote on 2018-01-03:

#4

I'll see if I can reproduce this.

Changed in charm-designate:
assignee:	nobody → Liam Young (gnuoy)

Revision history for this message

Liam Young (gnuoy) wrote on 2018-01-03:

#5

The string representations of the resource_params and resources dictionaries presented by designate are continually changing order which is causing the ha-relation-changed hook to fire. Below is the raw relation data from a hacluster subordinate:

14:08:57
clones: '{''cl_res_designate_haproxy'': ''res_designate_haproxy''}'
corosync_bindiface: eth0
corosync_mcastport: "4440"
groups: '{''grp_designate_vips'': ''res_designate_ens3_vip''}'
init_services: '[''haproxy'', ''haproxy'']'
private-address: 10.5.0.8
res_designate_ens3_vip: ocf:heartbeat:IPaddr2
res_designate_haproxy: lsb:haproxy
resource_params: '{''res_designate_ens3_vip'': '' params ip="10.5.100.10" nic="ens3"
cidr_netmask="255.255.0.0"'', ''res_designate_haproxy'': '' op monitor interval="5s"''}'
resources: '{''res_designate_ens3_vip'': ''ocf:heartbeat:IPaddr2'', ''res_designate_haproxy'':
''lsb:haproxy''}'

14:11:35
clones: '{''cl_res_designate_haproxy'': ''res_designate_haproxy''}'
corosync_bindiface: eth0
corosync_mcastport: "4440"
groups: '{''grp_designate_vips'': ''res_designate_ens3_vip''}'
init_services: '[''haproxy'', ''haproxy'']'
private-address: 10.5.0.8
res_designate_ens3_vip: ocf:heartbeat:IPaddr2
res_designate_haproxy: lsb:haproxy
resource_params: '{''res_designate_ens3_vip'': '' params ip="10.5.100.10" nic="ens3"
cidr_netmask="255.255.0.0"'', ''res_designate_haproxy'': '' op monitor interval="5s"''}'
resources: '{''res_designate_ens3_vip'': ''ocf:heartbeat:IPaddr2'', ''res_designate_haproxy'':
''lsb:haproxy''}'

Changed in charm-designate:
status:	New → Confirmed

Revision history for this message

Liam Young (gnuoy) wrote on 2018-01-03:

#6

Argh, the second entry should have been:

14:11:35

clones: '{''cl_res_designate_haproxy'': ''res_designate_haproxy''}'
corosync_bindiface: eth0
corosync_mcastport: "4440"
groups: '{''grp_designate_vips'': ''res_designate_ens3_vip''}'
init_services: '[''haproxy'']'
private-address: 10.5.0.8
res_designate_ens3_vip: ocf:heartbeat:IPaddr2
res_designate_haproxy: lsb:haproxy
resource_params: '{''res_designate_haproxy'': '' op monitor interval="5s"'', ''res_designate_ens3_vip'':
'' params ip="10.5.100.10" nic="ens3" cidr_netmask="255.255.0.0"''}'
resources: '{''res_designate_haproxy'': ''lsb:haproxy'', ''res_designate_ens3_vip'':
''ocf:heartbeat:IPaddr2''}'

Changed in charm-designate:
importance:	Undecided → High
milestone:	none → 18.02

Revision history for this message

Tytus Kurek (tkurek) wrote on 2018-01-18:

#7

This bug is actually causing more problems than I initially thought. It looks like the Pacemaker resource action (res_designate_haproxy:2_monitor_5000) fails when executed at the same time haproxy is being restarted causing LRMD termination:

Jan 17 20:47:57 juju-a2afdd-25-lxd-16 crmd[12508]: notice: te_rsc_command: Initiating action 13: monitor res_designate_haproxy:2_monitor_5000 on juju-a2afdd-25-lxd-16 (local)
Jan 17 20:47:57 juju-a2afdd-25-lxd-16 lrmd[12507]: error: crm_abort: operation_finished: Triggered fatal assert at services_linux.c:269 : op->pid == pid
Jan 17 20:47:58 juju-a2afdd-25-lxd-16 pacemakerd[13843]: error: child_waitpid: Managed process 12507 (lrmd) dumped core
Jan 17 20:47:58 juju-a2afdd-25-lxd-16 pacemakerd[13843]: notice: pcmk_child_exit: Child process lrmd terminated with signal 6 (pid=12507, core=1)
Jan 17 20:47:58 juju-a2afdd-25-lxd-16 pacemakerd[13843]: error: pcmk_process_exit: Child respawn count exceeded by lrmd
Jan 17 20:47:58 juju-a2afdd-25-lxd-16 crmd[12508]: error: internal_ipc_get_reply: Server disconnected client lrmd while waiting for msg id 9691
Jan 17 20:47:58 juju-a2afdd-25-lxd-16 crmd[12508]: notice: crm_ipc_send: Connection to lrmd closed: Transport endpoint is not connected (-107)
Jan 17 20:47:58 juju-a2afdd-25-lxd-16 crmd[12508]: error: lrmd_send_command: Couldn't perform lrmd_rsc_info operation (timeout=0): -107: Transport endpoint is not connected (107)
Jan 17 20:47:58 juju-a2afdd-25-lxd-16 crmd[12508]: error: lrmd_send_command: LRMD disconnected
Jan 17 20:47:58 juju-a2afdd-25-lxd-16 crmd[12508]: notice: do_lrm_invoke: Not creating resource for a delete event: (null)
Jan 17 20:47:58 juju-a2afdd-25-lxd-16 crmd[12508]: warning: qb_ipcs_event_sendv: new_event_notification (12508-30736-14): Broken pipe (32)
Jan 17 20:47:58 juju-a2afdd-25-lxd-16 crmd[12508]: error: crm_ipc_read: Connection to lrmd failed
Jan 17 20:47:58 juju-a2afdd-25-lxd-16 crmd[12508]: error: mainloop_gio_callback: Connection to lrmd[0x55b22e68bbb0] closed (I/O condition=25)
Jan 17 20:47:58 juju-a2afdd-25-lxd-16 crmd[12508]: crit: lrm_connection_destroy: LRM Connection failed

This results with the node going offline in "crm status".

This bug is actually causing more problems than I initially thought. It looks like the Pacemaker resource action (res_designate_haproxy:2_monitor_5000) fails when executed at the same time haproxy is being restarted causing LRMD termination:

Jan 17 20:47:57 juju-a2afdd-25-lxd-16 crmd[12508]:   notice: te_rsc_command: Initiating action 13: monitor res_designate_haproxy:2_monitor_5000 on juju-a2afdd-25-lxd-16 (local)
Jan 17 20:47:57 juju-a2afdd-25-lxd-16 lrmd[12507]:    error: crm_abort: operation_finished: Triggered fatal assert at services_linux.c:269 : op->pid == pid
Jan 17 20:47:58 juju-a2afdd-25-lxd-16 pacemakerd[13843]:    error: child_waitpid: Managed process 12507 (lrmd) dumped core
Jan 17 20:47:58 juju-a2afdd-25-lxd-16 pacemakerd[13843]:   notice: pcmk_child_exit: Child process lrmd terminated with signal 6 (pid=12507, core=1)
Jan 17 20:47:58 juju-a2afdd-25-lxd-16 pacemakerd[13843]:    error: pcmk_process_exit: Child respawn count exceeded by lrmd
Jan 17 20:47:58 juju-a2afdd-25-lxd-16 crmd[12508]:    error: internal_ipc_get_reply: Server disconnected client lrmd while waiting for msg id 9691
Jan 17 20:47:58 juju-a2afdd-25-lxd-16 crmd[12508]:   notice: crm_ipc_send: Connection to lrmd closed: Transport endpoint is not connected (-107)
Jan 17 20:47:58 juju-a2afdd-25-lxd-16 crmd[12508]:    error: lrmd_send_command: Couldn't perform lrmd_rsc_info operation (timeout=0): -107: Transport endpoint is not connected (107)
Jan 17 20:47:58 juju-a2afdd-25-lxd-16 crmd[12508]:    error: lrmd_send_command: LRMD disconnected
Jan 17 20:47:58 juju-a2afdd-25-lxd-16 crmd[12508]:   notice: do_lrm_invoke: Not creating resource for a delete event: (null)
Jan 17 20:47:58 juju-a2afdd-25-lxd-16 crmd[12508]:  warning: qb_ipcs_event_sendv: new_event_notification (12508-30736-14): Broken pipe (32)
Jan 17 20:47:58 juju-a2afdd-25-lxd-16 crmd[12508]:    error: crm_ipc_read: Connection to lrmd failed
Jan 17 20:47:58 juju-a2afdd-25-lxd-16 crmd[12508]:    error: mainloop_gio_callback: Connection to lrmd[0x55b22e68bbb0] closed (I/O condition=25)
Jan 17 20:47:58 juju-a2afdd-25-lxd-16 crmd[12508]:     crit: lrm_connection_destroy: LRM Connection failed

This results with the node going offline in "crm status".

Revision history for this message

Jason Hobbs (jason-hobbs) wrote on 2018-02-08:

#8

This bug has been subscribed to the Canonical "Field High" list.

Revision history for this message

Chris Gregan (cgregan) wrote on 2018-02-12:

#9

@Jason
Do we see this using next charms? OS team believes they have put fixes in for this for 18.02

Revision history for this message

James Page (james-page) wrote on 2018-02-15:

#10

OK I think I managed to grab a diff from a single unit gnocchi deployment:

--- /home/ubuntu/haproxy.cfg 2018-02-15 18:21:16.449872354 +0000
+++ /etc/haproxy/haproxy.cfg 2018-02-15 18:21:24.301899854 +0000
@@ -29,17 +29,17 @@ listen stats
frontend tcp-in_gnocchi-api_admin
     bind *:8041
     bind :::8041
- acl net_172.20.5.86 dst 172.20.5.86/255.255.255.0
- use_backend gnocchi-api_admin_172.20.5.86 if net_172.20.5.86
     acl net_172.20.2.69 dst 172.20.2.69/255.255.255.0
     use_backend gnocchi-api_admin_172.20.2.69 if net_172.20.2.69
+ acl net_172.20.5.86 dst 172.20.5.86/255.255.255.0
+ use_backend gnocchi-api_admin_172.20.5.86 if net_172.20.5.86
     default_backend gnocchi-api_admin_172.20.2.69

-backend gnocchi-api_admin_172.20.5.86
- balance leastconn
- server gnocchi-0 172.20.5.86:8031 check
-
backend gnocchi-api_admin_172.20.2.69
balance leastconn
server gnocchi-0 172.20.2.69:8031 check

+backend gnocchi-api_admin_172.20.5.86
+ balance leastconn
+ server gnocchi-0 172.20.5.86:8031 check
+

This has the original fix for bug 1698814 but the order of the backends still keeps switching.

Changed in charm-gnocchi:
status:	New → Confirmed
importance:	Undecided → High

Jason Hobbs (jason-hobbs) on 2018-02-15

summary:	HAproxy is restarted every 5-10 minutes when using clustered designate - application from 17.11 charm release + application
summary:	- HAproxy is restarted every 5-10 minutes when using clustered designate - application + HAproxy is restarted every 5-10 minutes when using designate application

Revision history for this message

James Page (james-page) wrote on 2018-02-16: Re: HAproxy is restarted every 5-10 minutes when using designate application

#11

https://review.openstack.org/#/c/545101

Changed in charm-gnocchi:
assignee:	nobody → James Page (james-page)
Changed in charms.openstack:
assignee:	nobody → James Page (james-page)
Changed in charm-gnocchi:
milestone:	none → 18.02
Changed in charms.openstack:
status:	New → In Progress
importance:	Undecided → High

Revision history for this message

James Page (james-page) wrote on 2018-02-16:

#12

Note that the combination of {clustered|single unit} + multiple network spaces causes this issue.

Revision history for this message

OpenStack Infra (hudson-openstack) wrote on 2018-02-16: Fix merged to charms.openstack (master)

#13

Reviewed: https://review.openstack.org/545101
Committed: https://git.openstack.org/cgit/openstack/charms.openstack/commit/?id=af9a92f189e0b5c515d2bcae6bdec60957d7bc73
Submitter: Zuul
Branch: master

commit af9a92f189e0b5c515d2bcae6bdec60957d7bc73
Author: James Page <email address hidden>
Date: Thu Feb 15 19:36:38 2018 +0000

Further use of ordered dicts.

    For consistent rendering of the haproxy template under Python 3,
    ensure that ordered dictionaries are used so that dict iteration
    always results in the same template output.

Change-Id: I119027810e7f7976c794ff7f34a902d256f8963d
Closes-Bug: 1737776

Changed in charms.openstack:
status:	In Progress → Fix Released

Revision history for this message

James Page (james-page) wrote on 2018-02-16: Re: HAproxy is restarted every 5-10 minutes when using designate application

#14

I've pushed rebuilds for gnocchi and designate with the proposed fix to:

cs:~james-page/gnocchi
cs:~james-page/designate

Testing now.

Changed in charm-gnocchi:
status:	Confirmed → In Progress

Revision history for this message

James Page (james-page) wrote on 2018-02-16:

#15

The residual issue with hacluster charm restarting things will be fixed by the latest commit on the hacluster interface which switches any consuming charms to use the new json formatted keys which will always produce the same data on the relation, avoiding hook executions.

summary:

- HAproxy is restarted every 5-10 minutes when using designate application
+ HAproxy is restarted every 5-10 minutes when using reactive charms with
+ multiple network space bindings

Revision history for this message

Ryan Beisner (1chb1n) wrote on 2018-02-23:

#16

Rebuilds for gnocchi and designate charms have been published to cs: next, flagging fix-committed.

Changed in charm-designate:
status:	Confirmed → In Progress
status:	In Progress → Fix Committed
Changed in charm-gnocchi:
status:	In Progress → Fix Committed

Revision history for this message

Tytus Kurek (tkurek) wrote on 2018-02-26:

#17

logs.tar.gz Edit (820.5 KiB, application/x-tar)

To be honest, I don't see the issue being resolved after upgrading affected designate installation to the following version: cs:~openstack-charmers-next/designate-68

root@juju-baab69-29-lxd-8:~# cat /var/log/juju/unit-designate-hacluster-63.log | grep "Stopping haproxy" | tail -n 5
2018-02-26 14:40:09 DEBUG ha-relation-changed * Stopping haproxy haproxy
2018-02-26 14:40:34 DEBUG ha-relation-changed * Stopping haproxy haproxy
2018-02-26 14:43:51 DEBUG ha-relation-changed * Stopping haproxy haproxy
2018-02-26 14:48:51 DEBUG ha-relation-changed * Stopping haproxy haproxy
2018-02-26 14:54:07 DEBUG ha-relation-changed * Stopping haproxy haproxy

Attaching log files from all 3 units.

Revision history for this message

Tytus Kurek (tkurek) wrote on 2018-02-27:

#18

relation-get Edit (58.5 KiB, text/plain)

Download full text (3.6 KiB)

OK, so it looks like there's one more issue which causes the "ha-relation-changed" hook being run: the "json_init_services" keeps being populated with "haproxy" values. Please take a look on the following tries:

TRY 1:

root@juju-baab69-29-lxd-8:/var/lib/juju/agents/unit-designate-hacluster-63/charm# relation-get -r 507 - designate/9
clones: '{''cl_res_designate_haproxy'': ''res_designate_haproxy''}'
corosync_bindiface: eth0
corosync_mcastport: "4440"
groups: '{''grp_designate_vips'': ''res_designate_eth0_vip''}'
init_services: '[''haproxy'']'
json_clones: '{"cl_res_designate_haproxy": "res_designate_haproxy"}'
json_groups: '{"grp_designate_vips": "res_designate_eth0_vip"}'
json_init_services: '["haproxy", "haproxy", "haproxy", "haproxy", "haproxy", "haproxy",
  "haproxy", "haproxy", "haproxy", "haproxy", "haproxy", "haproxy", "haproxy", "haproxy",
...
  "haproxy", "haproxy", "haproxy", "haproxy", "haproxy", "haproxy", "haproxy", "haproxy",
  "haproxy", "haproxy", "haproxy", "haproxy", "haproxy"]'
json_res_designate_eth0_vip: '"ocf:heartbeat:IPaddr2"'
json_res_designate_haproxy: '"lsb:haproxy"'
json_resource_params: '{"res_designate_eth0_vip": " params ip=\"100.86.0.11\" nic=\"eth0\"
  cidr_netmask=\"255.255.240.0\"", "res_designate_haproxy": " op monitor interval=\"5s\""}'
json_resources: '{"res_designate_eth0_vip": "ocf:heartbeat:IPaddr2", "res_designate_haproxy":
  "lsb:haproxy"}'
private-address: 100.107.2.78
res_designate_eth0_vip: ocf:heartbeat:IPaddr2
res_designate_haproxy: lsb:haproxy
resource_params: '{''res_designate_haproxy'': '' op monitor interval="5s"'', ''res_designate_eth0_vip'':
  '' params ip="100.86.0.11" nic="eth0" cidr_netmask="255.255.240.0"''}'
resources: '{''res_designate_haproxy'': ''lsb:haproxy'', ''res_designate_eth0_vip'':
  ''ocf:heartbeat:IPaddr2''}'

TRY 2:

root@juju-baab69-29-lxd-8:/var/lib/juju/agents/unit-designate-hacluster-63/charm# relation-get -r 507 - designate/9
clones: '{''cl_res_designate_haproxy'': ''res_designate_haproxy''}'
corosync_bindiface: eth0
corosync_mcastport: "4440"
groups: '{''grp_designate_vips'': ''res_designate_eth0_vip''}'
init_services: '[''haproxy'']'
json_clones: '{"cl_res_designate_haproxy": "res_designate_haproxy"}'
json_groups: '{"grp_designate_vips": "res_designate_eth0_vip"}'
json_init_services: '["haproxy", "haproxy", "haproxy", "haproxy", "haproxy", "haproxy",
  "haproxy", "haproxy", "haproxy", "haproxy", "haproxy", "haproxy", "haproxy", "haproxy",
...
  "haproxy", "haproxy", "haproxy", "haproxy", "haproxy", "haproxy", "haproxy", "haproxy",
  "haproxy", "haproxy", "haproxy", "haproxy", "haproxy", "haproxy", "haproxy"]'
json_res_designate_eth0_vip: '"ocf:heartbeat:IPaddr2"'
json_res_designate_haproxy: '"lsb:haproxy"'
json_resource_params: '{"res_designate_eth0_vip": " params ip=\"100.86.0.11\" nic=\"eth0\"
  cidr_netmask=\"255.255.240.0\"", "res_designate_haproxy": " op monitor interval=\"5s\""}'
json_resources: '{"res_designate_eth0_vip": "ocf:heartbeat:IPaddr2", "res_designate_haproxy":
  "lsb:haproxy"}'
private-address: 100.107.2.78
res_designate_eth0_vip: ocf:heartbeat:IPaddr2
res_designate_haproxy: lsb:haproxy
resource_params: '{''res_designate_haproxy'': '' op ...

OK, so it looks like there's one more issue which causes the "ha-relation-changed" hook being run: the "json_init_services" keeps being populated with "haproxy" values. Please take a look on the following tries:

TRY 1:

root@juju-baab69-29-lxd-8:/var/lib/juju/agents/unit-designate-hacluster-63/charm# relation-get -r 507 - designate/9
clones: '{''cl_res_designate_haproxy'': ''res_designate_haproxy''}'
corosync_bindiface: eth0
corosync_mcastport: "4440"
groups: '{''grp_designate_vips'': ''res_designate_eth0_vip''}'
init_services: '[''haproxy'']'
json_clones: '{"cl_res_designate_haproxy": "res_designate_haproxy"}'
json_groups: '{"grp_designate_vips": "res_designate_eth0_vip"}'
json_init_services: '["haproxy", "haproxy", "haproxy", "haproxy", "haproxy", "haproxy",
  "haproxy", "haproxy", "haproxy", "haproxy", "haproxy", "haproxy", "haproxy", "haproxy",
...
  "haproxy", "haproxy", "haproxy", "haproxy", "haproxy", "haproxy", "haproxy", "haproxy",
  "haproxy", "haproxy", "haproxy", "haproxy", "haproxy"]'
json_res_designate_eth0_vip: '"ocf:heartbeat:IPaddr2"'
json_res_designate_haproxy: '"lsb:haproxy"'
json_resource_params: '{"res_designate_eth0_vip": "  params ip=\"100.86.0.11\" nic=\"eth0\"
  cidr_netmask=\"255.255.240.0\"", "res_designate_haproxy": "  op monitor interval=\"5s\""}'
json_resources: '{"res_designate_eth0_vip": "ocf:heartbeat:IPaddr2", "res_designate_haproxy":
  "lsb:haproxy"}'
private-address: 100.107.2.78
res_designate_eth0_vip: ocf:heartbeat:IPaddr2
res_designate_haproxy: lsb:haproxy
resource_params: '{''res_designate_haproxy'': ''  op monitor interval="5s"'', ''res_designate_eth0_vip'':
  ''  params ip="100.86.0.11" nic="eth0" cidr_netmask="255.255.240.0"''}'
resources: '{''res_designate_haproxy'': ''lsb:haproxy'', ''res_designate_eth0_vip'':
  ''ocf:heartbeat:IPaddr2''}'

TRY 2:

root@juju-baab69-29-lxd-8:/var/lib/juju/agents/unit-designate-hacluster-63/charm# relation-get -r 507 - designate/9
clones: '{''cl_res_designate_haproxy'': ''res_designate_haproxy''}'
corosync_bindiface: eth0
corosync_mcastport: "4440"
groups: '{''grp_designate_vips'': ''res_designate_eth0_vip''}'
init_services: '[''haproxy'']'
json_clones: '{"cl_res_designate_haproxy": "res_designate_haproxy"}'
json_groups: '{"grp_designate_vips": "res_designate_eth0_vip"}'
json_init_services: '["haproxy", "haproxy", "haproxy", "haproxy", "haproxy", "haproxy",
  "haproxy", "haproxy", "haproxy", "haproxy", "haproxy", "haproxy", "haproxy", "haproxy",
...
  "haproxy", "haproxy", "haproxy", "haproxy", "haproxy", "haproxy", "haproxy", "haproxy",
  "haproxy", "haproxy", "haproxy", "haproxy", "haproxy", "haproxy", "haproxy"]'
json_res_designate_eth0_vip: '"ocf:heartbeat:IPaddr2"'
json_res_designate_haproxy: '"lsb:haproxy"'
json_resource_params: '{"res_designate_eth0_vip": "  params ip=\"100.86.0.11\" nic=\"eth0\"
  cidr_netmask=\"255.255.240.0\"", "res_designate_haproxy": "  op monitor interval=\"5s\""}'
json_resources: '{"res_designate_eth0_vip": "ocf:heartbeat:IPaddr2", "res_designate_haproxy":
  "lsb:haproxy"}'
private-address: 100.107.2.78
res_designate_eth0_vip: ocf:heartbeat:IPaddr2
res_designate_haproxy: lsb:haproxy
resource_params: '{''res_designate_haproxy'': ''  op monitor interval="5s"'', ''res_designate_eth0_vip'':
  ''  params ip="100.86.0.11" nic="eth0" cidr_netmask="255.255.240.0"''}'
resources: '{''res_designate_haproxy'': ''lsb:haproxy'', ''res_designate_eth0_vip'':
  ''ocf:heartbeat:IPaddr2''}'

Dots ("...") were used to indicate additional lines like the following one:

"haproxy", "haproxy", "haproxy", "haproxy", "haproxy", "haproxy", "haproxy", "haproxy",

At the moment in the environment I upgraded there are hundreds of them. Attached is the full output.

Revision history for this message

Tytus Kurek (tkurek) wrote on 2018-02-27:

#19

I can confirm that I see the same on all four environments which I own at the moment. The issue is also present on fresh LXD-based installation. The "json_init_services" parameter keeps being populated with "haproxy" values.

Revision history for this message

James Page (james-page) wrote on 2018-02-27:

#20

I think this is a bug in the hacluster interface

Revision history for this message

Tytus Kurek (tkurek) wrote on 2018-03-01:

#21

James,

Should I raise separate bug report for the issue with "json_init_services"?

Revision history for this message

Tytus Kurek (tkurek) wrote on 2018-03-05:

#22

I have a quick patch for the "json_init_services" parameter being filled in with the "haproxy" values:

$ diff charm-interface-hacluster/common.py charm-interface-hacluster/common_patched.py
320c320,322
< self['init_services'].extend(resources)
---
> for resource in resources:
> if resource not in self['init_services']:
> self['init_services'].extend(resource)

but I still have the following concerns:

1) Should "json_*" parameters be present at all?
2) Should "cluster_connected" function from "charm-designate/reactive/designate_handlers.py" be run continuously which is the actual root cause of the "haproxy" values being populated?

I am happy to continue working on the patch. I am happy to create a new bug report. Just please let me know what the directions should be.

Revision history for this message

Jason Hobbs (jason-hobbs) wrote on 2018-03-07:

#23

I filed bug 1754149, which may be what Tytus is seeing in recent runs; I haven't seen it live, only in a CI run.

Ryan Beisner (1chb1n) on 2018-03-09

Changed in charm-designate:
status:	Fix Committed → Fix Released
Changed in charm-gnocchi:
status:	Fix Committed → Fix Released

Gnocchi Charm

HAproxy is restarted every 5-10 minutes when using reactive charms with multiple network space bindings

Bug Description

Other bug subscribers

Bug attachments

Remote bug watches

	Status	Importance	Assigned to	Milestone
Gnocchi Charm	Fix Released	High	James Page	Gnocchi Charm 18.02
OpenStack Designate Charm	Fix Released	High	Liam Young	OpenStack Designate Charm 18.02
charms.openstack	Fix Released	High	James Page