Interface resolution should be more intelligent

Bug #1424524 reported by Alex Kang
This bug affects 3 people
Affects                             Status   Importance  Assigned to  Milestone
OpenStack HA Cluster Charm          Triaged  High        James Page
hacluster (Juju Charms Collection)  Invalid  High        James Page

Bug Description

Hi

When I add a relation between openstack-dashboard and hacluster, the ha-relation-joined hook fails with the following error:

======================================================================================================

2015-02-23 02:15:57 DEBUG worker.uniter.jujuc server.go:111 hook context id "hacluster-horizon/0-ha-relation-joined-4072439655114028081"; dir "/var/lib/juju/agents/unit-hacluster-horizon-0/charm"
2015-02-23 02:15:57 INFO worker.uniter.jujuc server.go:110 running hook tool "relation-get" ["--format=json" "-r" "ha:26" "corosync_bindiface" "openstack-dashboard/0"]
2015-02-23 02:15:57 DEBUG worker.uniter.jujuc server.go:111 hook context id "hacluster-horizon/0-ha-relation-joined-4072439655114028081"; dir "/var/lib/juju/agents/unit-hacluster-horizon-0/charm"
2015-02-23 02:15:57 INFO unit.hacluster-horizon/0.ha-relation-joined logger.go:40 Traceback (most recent call last):
2015-02-23 02:15:57 INFO unit.hacluster-horizon/0.ha-relation-joined logger.go:40 File "/var/lib/juju/agents/unit-hacluster-horizon-0/charm/hooks/ha-relation-joined", line 587, in <module>
2015-02-23 02:15:57 INFO unit.hacluster-horizon/0.ha-relation-joined logger.go:40 hooks.execute(sys.argv)
2015-02-23 02:15:57 INFO unit.hacluster-horizon/0.ha-relation-joined logger.go:40 File "/var/lib/juju/agents/unit-hacluster-horizon-0/charm/hooks/charmhelpers/core/hookenv.py", line 544, in execute
2015-02-23 02:15:57 INFO unit.hacluster-horizon/0.ha-relation-joined logger.go:40 self._hooks[hook_name]()
2015-02-23 02:15:57 INFO unit.hacluster-horizon/0.ha-relation-joined logger.go:40 File "/var/lib/juju/agents/unit-hacluster-horizon-0/charm/hooks/ha-relation-joined", line 315, in configure_principle_cluster_resources
2015-02-23 02:15:57 INFO unit.hacluster-horizon/0.ha-relation-joined logger.go:40 if not get_corosync_conf():
2015-02-23 02:15:57 INFO unit.hacluster-horizon/0.ha-relation-joined logger.go:40 File "/var/lib/juju/agents/unit-hacluster-horizon-0/charm/hooks/ha-relation-joined", line 144, in get_corosync_conf
2015-02-23 02:15:57 INFO unit.hacluster-horizon/0.ha-relation-joined logger.go:40 'corosync_bindnetaddr': bindnetaddr(bindiface),
2015-02-23 02:15:57 INFO unit.hacluster-horizon/0.ha-relation-joined logger.go:40 File "/var/lib/juju/agents/unit-hacluster-horizon-0/charm/hooks/hacluster.py", line 84, in get_network_address
2015-02-23 02:15:57 INFO unit.hacluster-horizon/0.ha-relation-joined logger.go:40 network = "{}/{}".format(get_iface_ipaddr(iface),
2015-02-23 02:15:57 INFO unit.hacluster-horizon/0.ha-relation-joined logger.go:40 File "/var/lib/juju/agents/unit-hacluster-horizon-0/charm/hooks/hacluster.py", line 60, in get_iface_ipaddr
2015-02-23 02:15:57 INFO unit.hacluster-horizon/0.ha-relation-joined logger.go:40 struct.pack('256s', iface[:15])
2015-02-23 02:15:57 INFO unit.hacluster-horizon/0.ha-relation-joined logger.go:40 IOError: [Errno 19] No such device
======================================================================================================

It seems that the charm cannot find the HA bind network interface on the machine.
I added some debugging code and found that it was trying to use an interface that the machine does not have
(the machine's only non-loopback interface is eth0).

The deployment history is as follows:

1. I set ha-bindiface='juju-br0' and added a relation between openstack-dashboard and hacluster.
2. I got the error above, which means that the machine has no such device.
3. I set ha-bindiface='eth0' and tried to resolve the unit.
4. The charm still looks for juju-br0, even though I changed the device from juju-br0 to eth0.

ubuntu@juju-machine-6-lxc-7:/var/lib/juju/agents/unit-hacluster-horizon-0/charm/hooks$ ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
23: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
    link/ether 00:16:3e:6d:23:5f brd ff:ff:ff:ff:ff:ff
    inet 10.100.1.82/24 brd 10.100.1.255 scope global eth0
       valid_lft forever preferred_lft forever
    inet6 fe80::216:3eff:fe6d:235f/64 scope link
       valid_lft forever preferred_lft forever

Additionally, the error occurs in this function (hacluster.py):
def get_network_address(iface):
    if iface:
        iface = str(iface)
        network = "{}/{}".format(get_iface_ipaddr(iface),
                                 get_netmask_cidr(get_iface_netmask(iface)))
        ip = IPNetwork(network)
        return str(ip.network)
    else:
        return None

I think it can be easily reproduced as follows:
1. Set a wrong interface for ha-bindiface in the YAML file.
2. Deploy an OpenStack component with hacluster.
3. Change ha-bindiface to an existing interface on the machine.
4. Check the result.

Thanks

Jonathan Davies (jpds)
Changed in hacluster (Juju Charms Collection):
status: New → Confirmed
James Page (james-page) wrote :

I suspect that the change in configuration is not getting down to the hacluster subordinate charm, so it's probably not possible to resolve it this way.

That said, the hacluster charm has its own corosync_bindiface configuration option, which overrides anything coming down from the principal charm. Try setting that instead and then run resolved --retry on the units:

    juju set hacluster corosync_bindiface=eth0
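
The precedence described here can be sketched as below. This is an assumption about the shape of the logic, not the charm's actual code: resolve_bindiface is a hypothetical name, and the two dictionaries stand in for the charm's own config and the data received over the ha relation.

```python
def resolve_bindiface(config, relation_data):
    """Prefer the charm's own corosync_bindiface over the relation value."""
    # An unset or empty config value falls through to the relation data.
    return config.get("corosync_bindiface") or relation_data.get("corosync_bindiface")


# Stale value from the principal charm, overridden by hacluster's own config.
print(resolve_bindiface({"corosync_bindiface": "eth0"},
                        {"corosync_bindiface": "juju-br0"}))
```

Under this assumption, setting the option directly on hacluster sidesteps whatever stale value is still held in the relation data.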

....

James Page (james-page)
summary: - ha relation falied to get the interface
+ Interface resolution should be more intelligent
Changed in hacluster (Juju Charms Collection):
importance: Undecided → Medium
status: Confirmed → Triaged
James Page (james-page) wrote :

I think this comes down to the fact that using interface names in configuration is just a really bad idea, especially since the 16.04 release, where interface naming will not be consistent in a lot of deployments.

I think the default behaviour of the charm should be to use the network interface to which the unit's private-address is bound; this would map to eth0/juju-br0/ens2 or whatever automatically.

This could then be overridden with a network space binding for juju 2.0, resolving down to an interface as need be, or a configuration option for pre juju 2.0.
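
As a rough sketch of the proposed default, the interface could be found by matching the unit's private-address against the addresses the kernel reports. The function name iface_for_address is hypothetical, and the sketch assumes two things not stated in this report: that the private-address is an IP address rather than a hostname, and that the input text is in the one-line-per-address format printed by `ip -o addr show`.

```python
def iface_for_address(ip_output, address):
    """Return the name of the interface carrying `address`, or None.

    `ip_output` is text in the one-line-per-address format printed by
    `ip -o addr show`.
    """
    for line in ip_output.splitlines():
        fields = line.split()
        # Expected layout: index, name, family, address/prefix, ...
        if len(fields) < 4 or fields[2] not in ("inet", "inet6"):
            continue
        if fields[3].split("/")[0] == address:
            return fields[1]
    return None


# Sample input modelled on the reporter's `ip a` output.
sample = (
    "1: lo    inet 127.0.0.1/8 scope host lo\n"
    "23: eth0    inet 10.100.1.82/24 brd 10.100.1.255 scope global eth0\n"
)
print(iface_for_address(sample, "10.100.1.82"))  # prints "eth0"
```

Because the lookup starts from an address rather than a name, it would give the right answer whether the device is called eth0, juju-br0 or ens2.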

Changed in hacluster (Juju Charms Collection):
milestone: none → 16.10
importance: Medium → High
assignee: nobody → James Page (james-page)
James Page (james-page)
Changed in hacluster (Juju Charms Collection):
milestone: 16.10 → 17.01
James Page (james-page)
Changed in charm-hacluster:
assignee: nobody → James Page (james-page)
importance: Undecided → High
status: New → Triaged
Changed in hacluster (Juju Charms Collection):
status: Triaged → Invalid