nova-cloud-controller next charms get wrong IPs in haproxy.cfg in HA configuration

Bug #1375631 reported by Brad Marshall
This bug affects 4 people
Affects: nova-cloud-controller (Juju Charms Collection)
Status: Fix Released
Importance: Undecided
Assigned to: Edward Hope-Morley
Milestone: (none)

Bug Description

Using the next charms for nova-cloud-controller deployed into LXCs in an HA configuration, I've found that the wrong IPs end up in the haproxy config.

It appears to pick from the correct set of IPs, but with duplicates, and in some cases triplicates, of the same address. For a given unit, the same set of IPs appears consistently across every listen stanza.

Here's an extract from one of our deploys:

nova-cloud-controller/0:
listen nova-api-os-compute_ipv4 0.0.0.0:8774
    balance roundrobin
    server nova-cloud-controller-2 x.y.z.103:8764 check
    server nova-cloud-controller-1 x.y.z.103:8764 check
    server nova-cloud-controller-0 x.y.z.103:8764 check

nova-cloud-controller/1:
listen nova-api-os-compute_ipv4 0.0.0.0:8774
    balance roundrobin
    server nova-cloud-controller-2 x.y.z.103:8764 check
    server nova-cloud-controller-1 x.y.z.118:8764 check
    server nova-cloud-controller-0 x.y.z.103:8764 check

nova-cloud-controller/2:
listen nova-api-os-compute_ipv4 0.0.0.0:8774
    balance roundrobin
    server nova-cloud-controller-2 x.y.z.120:8764 check
    server nova-cloud-controller-1 x.y.z.103:8764 check
    server nova-cloud-controller-0 x.y.z.103:8764 check
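
For comparison, a correctly rendered stanza would reference each unit's own address exactly once, something like the following (using the distinct addresses that appear above; the unit-to-address mapping is illustrative):

listen nova-api-os-compute_ipv4 0.0.0.0:8774
    balance roundrobin
    server nova-cloud-controller-2 x.y.z.120:8764 check
    server nova-cloud-controller-1 x.y.z.118:8764 check
    server nova-cloud-controller-0 x.y.z.103:8764 check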

We are using Ubuntu 14.04 with OpenStack Icehouse and r103 of the next nova-cloud-controller charm.

Please let us know if you need any more details.


Revision history for this message
James Page (james-page) wrote :

Brad: we should be landing a significant change to the HAProxy templating in the next few days; I'll test locally to check it all works OK.

tags: added: os-next
JuanJo Ciarlante (jjo)
tags: added: canonical-is
JuanJo Ciarlante (jjo)
tags: added: bootstack
James Troup (elmo)
tags: added: canonical-bootstack
removed: bootstack
Revision history for this message
Brad Marshall (brad-marshall) wrote :

Additional debugging info that may be useful:

root@juju-machine-0-lxc-8:~# corosync-cfgtool -s
Printing ring status.
Local node ID 739553097
RING ID 0
    id = x.y.z.73
    status = ring 0 active with no faults

root@juju-machine-0-lxc-8:~# corosync-cmapctl | grep members
runtime.totem.pg.mrp.srp.members.739553097.config_version (u64) = 0
runtime.totem.pg.mrp.srp.members.739553097.ip (str) = r(0) ip(x.y.z.73)
runtime.totem.pg.mrp.srp.members.739553097.join_count (u32) = 1
runtime.totem.pg.mrp.srp.members.739553097.status (str) = joined
runtime.totem.pg.mrp.srp.members.739553102.config_version (u64) = 0
runtime.totem.pg.mrp.srp.members.739553102.ip (str) = r(0) ip(x.y.z.78)
runtime.totem.pg.mrp.srp.members.739553102.join_count (u32) = 1
runtime.totem.pg.mrp.srp.members.739553102.status (str) = joined
runtime.totem.pg.mrp.srp.members.739553115.config_version (u64) = 0
runtime.totem.pg.mrp.srp.members.739553115.ip (str) = r(0) ip(x.y.z.91)
runtime.totem.pg.mrp.srp.members.739553115.join_count (u32) = 1
runtime.totem.pg.mrp.srp.members.739553115.status (str) = joined

root@juju-machine-0-lxc-8:~# corosync-quorumtool -l

Membership information
----------------------
    Nodeid Votes Name
 739553097 1 x-y-z-73.maas (local)
 739553102 1 x-y-z-78.maas
 739553115 1 x-y-z-91.maas

root@juju-machine-0-lxc-8:~# crm configure show
node $id="739553097" juju-machine-0-lxc-8
node $id="739553102" juju-machine-1-lxc-4
node $id="739553115" juju-machine-2-lxc-5
primitive res_nova_eth0_vip ocf:heartbeat:IPaddr2 \
    params ip="x.y.z.105" cidr_netmask="255.255.248.0" nic="eth0"
primitive res_nova_haproxy lsb:haproxy \
    op monitor interval="5s"
group grp_nova_vips res_nova_eth0_vip
clone cl_nova_haproxy res_nova_haproxy
property $id="cib-bootstrap-options" \
    dc-version="1.1.10-42f2063" \
    cluster-infrastructure="corosync" \
    stonith-enabled="false" \
    no-quorum-policy="ignore" \
    last-lrm-refresh="1412148872"
rsc_defaults $id="rsc-options" \
    resource-stickiness="100"
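
(The pacemaker configuration above looks like the usual charm-generated setup, a single VIP on eth0 plus a cloned haproxy resource, so nothing at that layer accounts for the duplication; this is consistent with the problem being in the charm's haproxy.cfg templating.)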

tags: added: openstack
Revision history for this message
Edward Hope-Morley (hopem) wrote :

This could be another case of peer_echo() incorrectly broadcasting the hostname in the cluster relation.
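
For context, peer_echo() in charmhelpers.contrib.peerstorage reflects settings it sees on the peer (cluster) relation back onto that relation from the unit's own hook, so an unfiltered echo can overwrite each unit's own private-address with a peer's value. A minimal sketch of the kind of guard that avoids this, assuming the includes filter exposed by the peerstorage API (illustrative only, not necessarily the exact change that eventually landed in the charm):

import sys

from charmhelpers.core.hookenv import Hooks
from charmhelpers.contrib.peerstorage import peer_echo

hooks = Hooks()

@hooks.hook('cluster-relation-changed')
def cluster_changed():
    # Without a filter, peer_echo() reflects every setting it sees back onto
    # the cluster relation, including private-address, which is how a single
    # hostname can end up advertised for all three units. Restricting the
    # echo to the keys the peers actually need to share leaves each unit's
    # own private-address intact.
    peer_echo(includes=['dbsync_state'])

if __name__ == '__main__':
    hooks.execute(sys.argv)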

Changed in nova-cloud-controller (Juju Charms Collection):
assignee: nobody → Edward Hope-Morley (hopem)
tags: added: cts
Revision history for this message
James Page (james-page) wrote :

hdl@c60-vm:~/scale-oct2014/scale-test-oct2014/deployments⟫ juju run --unit nova-cloud-controller/0 "relation-get -r cluster:8 - nova-cloud-controller/0"
dbsync_state: complete
private-address: wnf6q.maas
hdl@c60-vm:~/scale-oct2014/scale-test-oct2014/deployments⟫ juju run --unit nova-cloud-controller/0 "relation-get -r cluster:8 - nova-cloud-controller/1"
dbsync_state: complete
private-address: wnf6q.maas
hdl@c60-vm:~/scale-oct2014/scale-test-oct2014/deployments⟫ juju run --unit nova-cloud-controller/0 "relation-get -r cluster:8 - nova-cloud-controller/1"
dbsync_state: complete
private-address: wnf6q.maas
hdl@c60-vm:~/scale-oct2014/scale-test-oct2014/deployments⟫ juju run --unit nova-cloud-controller/0 "relation-get -r cluster:8 - nova-cloud-controller/2"
dbsync_state: complete
private-address: wnf6q.maas
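
That is, every peer is advertising the same private-address on the cluster relation, which would explain the duplicated backend addresses in the haproxy.cfg extracts above.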

Changed in nova-cloud-controller (Juju Charms Collection):
status: New → Confirmed
Revision history for this message
Nobuto Murata (nobuto) wrote :

It looks like the fix has been merged and landed on stable already.
http://bazaar.launchpad.net/~openstack-charmers/charms/trusty/nova-cloud-controller/trunk/revision/119

Can we mark this as Fix Released?

Changed in nova-cloud-controller (Juju Charms Collection):
status: Confirmed → Fix Released