operations fails on rackspace because of ipv6 address in dns-name

Bug #1624495 reported by Curtis Hovey
This bug affects 9 people
Affects         Status        Importance  Assigned to     Milestone
Canonical Juju  Fix Released  Critical    Michael Foord
juju-ci-tools   Won't Fix     Medium      Unassigned

Bug Description

While testing assess_recovery against rackspace, delete_controller_members() failed. The call to wait_for_state_server_to_shutdown() returned immediately because it was checking the ipv6 address. The unroutable address came from
    host = machine.info.get('dns-name')

The machine does have an ipv6 address, and I do not doubt that some networks will have that as the dns-name. As rackspace is a public cloud, I (and the test) expect a routable ipv4 address.

The test can be updated to look for a routable address by checking the api-endpoints listed by show-controller (a possible approach is sketched below).
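
A minimal sketch of that workaround (not the actual juju-ci-tools code; the helper name and the assumed show-controller YAML layout are illustrative only):

    import ipaddress
    import subprocess

    import yaml


    def routable_ipv4_endpoints(controller_name):
        """Return IPv4 api-endpoints of a controller, skipping IPv6 entries."""
        # Assumes `juju show-controller --format yaml` nests the endpoints
        # under <controller>/details/api-endpoints; adjust if the layout differs.
        out = subprocess.check_output(
            ['juju', 'show-controller', '--format', 'yaml', controller_name])
        info = yaml.safe_load(out)[controller_name]
        ipv4 = []
        for endpoint in info['details']['api-endpoints']:
            host = endpoint.rsplit(':', 1)[0]  # strip the port
            try:
                address = ipaddress.ip_address(host)
            except ValueError:
                continue  # a hostname or bracketed IPv6 literal, skip it
            if address.version == 4 and not address.is_link_local:
                ipv4.append(host)
        return ipv4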

Maybe juju should not have shown the ipv4 address.

Curtis Hovey (sinzui)
summary: - delete_controller_members() fails on racksapce because of ipv6 address
+ delete_controller_members() fails on rackspace because of ipv6 address
Curtis Hovey (sinzui)
summary: - delete_controller_members() fails on rackspace because of ipv6 address
+ operations fails on rackspace because of ipv6 address in dns-name
Curtis Hovey (sinzui)
tags: added: rackspace status
Changed in juju:
status: New → Triaged
importance: Undecided → High
assignee: nobody → Richard Harding (rharding)
milestone: none → 2.0-rc2
Revision history for this message
Michael Foord (mfoord) wrote :

Where you say "Maybe juju should not have shown the ipv4 address" I assume you mean "should not have shown the ipv6 address"?

Michael Foord (mfoord)
Changed in juju:
status: Triaged → In Progress
assignee: Richard Harding (rharding) → Michael Foord (mfoord)
Changed in juju:
importance: High → Critical
Revision history for this message
Michael Foord (mfoord) wrote :

dns-name comes from the PreferredPublicAddress in the machine record. The logic for choosing a preferred address needs to prefer ipv4 over ipv6.

Revision history for this message
Michael Foord (mfoord) wrote :

Confirmed that network.SelectPublicAddress will return an ipv6 address if it appears in preference to an ipv4 one. In the sort order logic there is code *meant* to weight ipv4 more heavily, but it obviously isn't working.
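
For illustration only, a rough Python analogue of the intended weighting (the real logic is Go code in juju's network package; the ranks below are assumptions, not the actual implementation):

    import ipaddress

    def address_rank(value):
        """Lower rank is preferred; public IPv4 must outrank public IPv6."""
        address = ipaddress.ip_address(value)
        if address.is_link_local or address.is_loopback:
            return 3  # never usable as a public address
        if address.is_private:
            return 2  # cloud-local, last resort
        return 0 if address.version == 4 else 1  # prefer public IPv4 over IPv6

    def select_public_address(addresses):
        """Return the most preferred address, or None if nothing is usable."""
        usable = [a for a in addresses if address_rank(a) < 3]
        return min(usable, key=address_rank) if usable else None

    # With the weighting applied, the routable IPv4 address wins even when an
    # IPv6 address appears first in the machine record.
    assert select_public_address(
        ['2a00:1a48::3', '104.130.22.9', 'fe80::1']) == '104.130.22.9'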

Michael Foord (mfoord)
Changed in juju:
status: In Progress → Fix Committed
Revision history for this message
Curtis Hovey (sinzui) wrote :

Thank you. I can confirm that we are seeing the correct address.

Revision history for this message
Larry Michel (lmic) wrote :

I have tested the rc2 snap and the Xenial VMs are still showing IPV6 addresses:

$ /snap/bin/juju status
MODEL CONTROLLER CLOUD/REGION VERSION
default vspherecontroller-rc2 vsphere/dc0 2.0-rc2.1

APP VERSION STATUS SCALE CHARM STORE REV OS NOTES
easyrsa 3.0.1 active 1 easyrsa jujucharms 2 ubuntu
elasticsearch active 2 elasticsearch jujucharms 19 ubuntu
etcd 2.2.5 error 3 etcd jujucharms 13 ubuntu
filebeat active 4 filebeat jujucharms 5 ubuntu
flannel 0.6.1 active 4 flannel jujucharms 2 ubuntu
kibana active 1 kibana jujucharms 15 ubuntu
kubeapi-load-balancer 1.10.0 active 1 kubeapi-load-balancer jujucharms 2 ubuntu exposed
kubernetes-master 1.4.0 active 1 kubernetes-master jujucharms 2 ubuntu
kubernetes-worker 1.4.0 active 3 kubernetes-worker jujucharms 2 ubuntu exposed
topbeat active 3 topbeat jujucharms 5 ubuntu

UNIT WORKLOAD AGENT MACHINE PUBLIC-ADDRESS PORTS MESSAGE
easyrsa/0 active idle 0 fe80::1 Certificate Authority connected.
elasticsearch/0 active idle 1 10.245.62.35 9200/tcp Ready
elasticsearch/1 active idle 2 10.245.62.36 9200/tcp Ready
etcd/0 error idle 3 fe80::1 2379/tcp hook failed: "certificates-relation-changed"
etcd/1 active idle 4 fe80::1 2379/tcp Healthy with 2 known peers.
etcd/2 maintenance idle 5 fe80::1 Installing etcd from apt.
kibana/0 active idle 6 10.245.62.40 80/tcp,9200/tcp ready
kubeapi-load-balancer/0 active idle 7 fe80::1 443/tcp Loadbalancer ready.
kubernetes-master/0 active idle 8 fe80::1 6443/tcp Kubernetes master running.
  filebeat/0 active idle fe80::1 Filebeat ready.
  flannel/0 active idle fe80::1 Flannel subnet 10.1.59.1/24
kubernetes-worker/0 active idle 9 fe80::1 80/tcp,443/tcp Kubernetes worker running.
  filebeat/3 active idle fe80::1 Filebeat ready.
  flannel/3 active idle fe80::1 Flannel subnet 10.1.3.1/24
  topbeat/2 active idle fe80::1 Topbeat ready.
kubernetes-worker/1 active idle 10 fe80::1 80/tcp,443/tcp Kubernetes worker running.
  filebeat/1 active idle fe80::1 ...


Changed in juju:
status: Fix Committed → New
Changed in juju:
status: New → Triaged
Revision history for this message
Larry Michel (lmic) wrote :

The previous recreate was with the wrong version of the snap, 173. However, I am still able to recreate with version 176 of the snap, which has the fix:

juju 2.0-edge 176 juju devmode

MACHINE STATE DNS INS-ID SERIES AZ
0 started fe80::1 juju-33e7b5-0 xenial
1 started 10.245.62.49 juju-33e7b5-1 trusty
2 started 10.245.62.50 juju-33e7b5-2 trusty
3 started fe80::1 juju-33e7b5-3 xenial
4 started fe80::1 juju-33e7b5-4 xenial
5 started fe80::1 juju-33e7b5-5 xenial
6 started 10.245.62.56 juju-33e7b5-6 trusty
7 started fe80::1 juju-33e7b5-7 xenial
8 started fe80::1 juju-33e7b5-8 xenial
9 started fe80::1 juju-33e7b5-9 xenial
10 started fe80::1 juju-33e7b5-10 xenial
11 started fe80::1 juju-33e7b5-11 xenial

Revision history for this message
Larry Michel (lmic) wrote :

What's interesting about this case is that the Xenial VMs are actually showing multiple IPV6 addresses when I show all addresses in vCenter. The list contains a loopback address: fe80::1.

Trusty only shows one IPv4 and one IPv6 address, and juju always selects the IPv4 address.

For Xenial VMs, it's the loopback address that's being selected for each VM. So this may be different or a side effect that's not yet unaccounted for.

Revision history for this message
Larry Michel (lmic) wrote :

correction in previous comment: that's not yet accounted* for

Curtis Hovey (sinzui)
Changed in juju:
milestone: 2.0-rc2 → none
Revision history for this message
Michael Foord (mfoord) wrote :

I think this is a different issue - perhaps we're incorrectly recognising fe80 as public instead of cloud local.

Revision history for this message
Michael Foord (mfoord) wrote :

Nope, we correctly recognise that fe80 is a link-local address. I wonder if those machines don't have any public addresses. This bug should go back to fix-committed and a new bug should be opened.
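
For reference, Python's ipaddress module agrees that fe80::1 is link-local (and not loopback), so it should never win selection as a public address:

    import ipaddress

    addr = ipaddress.ip_address('fe80::1')
    print(addr.is_link_local)  # True: fe80::/10 is link-local
    print(addr.is_loopback)    # False: fe80::1 is not a loopback address
    print(addr.is_global)      # False: never a public address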

Curtis Hovey (sinzui)
Changed in juju:
status: Triaged → Fix Committed
Changed in juju-ci-tools:
status: Triaged → Won't Fix
Curtis Hovey (sinzui)
Changed in juju:
status: Fix Committed → Fix Released