[2.1 beta4] LXD container assigned IP address on LXD IP range instead of IP from MAAS DHCP

Bug #1657831 reported by Larry Michel
This bug affects 3 people
Affects: Canonical Juju
Status: Incomplete
Importance: High
Assigned to: Unassigned
Milestone: none

Bug Description

I deployed openstack-base on 3 arm servers and an amd64 machine. On one of the arm nodes, one of its 3 LXD containers was configured with an IP address from the LXD default IP range. The other 2 containers were assigned MAAS DHCP IP addresses as expected, as were all other containers on the remaining hosts in the deployment:

$ juju status
Model Controller Cloud/Region Version
default mycontroller larry 2.1-beta4

App Version Status Scale Charm Store Rev OS Notes
ceph-mon 10.2.3 active 3 ceph-mon jujucharms 6 ubuntu
ceph-osd 10.2.3 active 3 ceph-osd jujucharms 238 ubuntu
ceph-radosgw 10.2.3 active 1 ceph-radosgw jujucharms 245 ubuntu
cinder 9.0.0 active 1 cinder jujucharms 257 ubuntu
cinder-ceph 9.0.0 active 1 cinder-ceph jujucharms 221 ubuntu
glance 13.0.0 active 1 glance jujucharms 253 ubuntu
keystone 10.0.0 active 1 keystone jujucharms 258 ubuntu
mysql 5.6.21-25.8 active 1 percona-cluster jujucharms 246 ubuntu
neutron-api 9.0.0 active 1 neutron-api jujucharms 246 ubuntu
neutron-gateway 9.0.0 active 1 neutron-gateway jujucharms 232 ubuntu
neutron-openvswitch 9.0.0 active 3 neutron-openvswitch jujucharms 238 ubuntu
nova-cloud-controller 14.0.1 maintenance 1 nova-cloud-controller jujucharms 292 ubuntu
nova-compute 14.0.1 active 3 nova-compute jujucharms 259 ubuntu
ntp unknown 4 ntp jujucharms 0 ubuntu
openstack-dashboard 10.0.0 active 1 openstack-dashboard jujucharms 243 ubuntu
rabbitmq-server 3.5.7 active 1 rabbitmq-server jujucharms 54 ubuntu

Unit Workload Agent Machine Public address Ports Message
ceph-mon/0* active idle 1/lxd/0 10.245.32.93 Unit is ready and clustered
ceph-mon/1 active idle 2/lxd/0 10.245.32.39 Unit is ready and clustered
ceph-mon/2 active idle 3/lxd/0 10.245.32.236 Unit is ready and clustered
ceph-osd/0* active idle 1 10.245.31.66 Unit is ready (1 OSD)
ceph-osd/1 active idle 2 10.245.33.153 Unit is ready (1 OSD)
ceph-osd/2 active idle 3 10.245.31.223 Unit is ready (1 OSD)
ceph-radosgw/0* active idle 0/lxd/0 10.245.33.112 80/tcp Unit is ready
cinder/0* active idle 1/lxd/1 10.245.32.46 8776/tcp Unit is ready
  cinder-ceph/0* active idle 10.245.32.46 Unit is ready
glance/0* active idle 2/lxd/1 10.245.31.219 9292/tcp Unit is ready
keystone/0* active idle 3/lxd/1 10.245.32.173 5000/tcp Unit is ready
mysql/0* active idle 0/lxd/1 10.245.32.45 Unit is ready
neutron-api/0* active idle 1/lxd/2 10.245.32.230 9696/tcp Unit is ready
neutron-gateway/0* active idle 0 10.245.32.187 Unit is ready
  ntp/0* unknown idle 10.245.32.187
nova-cloud-controller/0* maintenance executing 2/lxd/2 10.0.0.9 8774/tcp Running nova db migration
nova-compute/0* active idle 1 10.245.31.66 Unit is ready
  neutron-openvswitch/0* active idle 10.245.31.66 Unit is ready
  ntp/1 unknown idle 10.245.31.66
nova-compute/1 active idle 2 10.245.33.153 Unit is ready
  neutron-openvswitch/1 active idle 10.245.33.153 Unit is ready
  ntp/2 unknown idle 10.245.33.153
nova-compute/2 active idle 3 10.245.31.223 Unit is ready
  neutron-openvswitch/2 active idle 10.245.31.223 Unit is ready
  ntp/3 unknown idle 10.245.31.223
openstack-dashboard/0* active idle 3/lxd/2 10.245.32.194 80/tcp,443/tcp Unit is ready
rabbitmq-server/0* active idle 0/lxd/2 10.245.32.74 5672/tcp Unit is ready

Machine State DNS Inst id Series AZ
0 started 10.245.32.187 84cb43 xenial Production
0/lxd/0 started 10.245.33.112 juju-fea521-0-lxd-0 xenial
0/lxd/1 started 10.245.32.45 juju-fea521-0-lxd-1 xenial
0/lxd/2 started 10.245.32.74 juju-fea521-0-lxd-2 xenial
1 started 10.245.31.66 ffppyr xenial Production
1/lxd/0 started 10.245.32.93 juju-fea521-1-lxd-0 xenial
1/lxd/1 started 10.245.32.46 juju-fea521-1-lxd-1 xenial
1/lxd/2 started 10.245.32.230 juju-fea521-1-lxd-2 xenial
2 started 10.245.33.153 deftrk xenial Production
2/lxd/0 started 10.245.32.39 juju-fea521-2-lxd-0 xenial
2/lxd/1 started 10.245.31.219 juju-fea521-2-lxd-1 xenial
2/lxd/2 started 10.0.0.9 juju-fea521-2-lxd-2 xenial
3 started 10.245.31.223 4y3xan xenial Production
3/lxd/0 started 10.245.32.236 juju-fea521-3-lxd-0 xenial
3/lxd/1 started 10.245.32.173 juju-fea521-3-lxd-1 xenial
3/lxd/2 started 10.245.32.194 juju-fea521-3-lxd-2 xenial

Relation Provides Consumes Type
mon ceph-mon ceph-mon peer
mon ceph-mon ceph-osd regular
mon ceph-mon ceph-radosgw regular
ceph ceph-mon cinder-ceph regular
ceph ceph-mon glance regular
ceph ceph-mon nova-compute regular
cluster ceph-radosgw ceph-radosgw peer
identity-service ceph-radosgw keystone regular
cluster cinder cinder peer
storage-backend cinder cinder-ceph subordinate
image-service cinder glance regular
identity-service cinder keystone regular
shared-db cinder mysql regular
cinder-volume-service cinder nova-cloud-controller regular
amqp cinder rabbitmq-server regular
cluster glance glance peer
identity-service glance keystone regular
shared-db glance mysql regular
image-service glance nova-cloud-controller regular
image-service glance nova-compute regular
amqp glance rabbitmq-server regular
cluster keystone keystone peer
shared-db keystone mysql regular
identity-service keystone neutron-api regular
identity-service keystone nova-cloud-controller regular
identity-service keystone openstack-dashboard regular
cluster mysql mysql peer
shared-db mysql neutron-api regular
shared-db mysql nova-cloud-controller regular
cluster neutron-api neutron-api peer
neutron-plugin-api neutron-api neutron-gateway regular
neutron-plugin-api neutron-api neutron-openvswitch regular
neutron-api neutron-api nova-cloud-controller regular
amqp neutron-api rabbitmq-server regular
cluster neutron-gateway neutron-gateway peer
quantum-network-service neutron-gateway nova-cloud-controller regular
juju-info neutron-gateway ntp subordinate
amqp neutron-gateway rabbitmq-server regular
neutron-plugin neutron-openvswitch nova-compute regular
amqp neutron-openvswitch rabbitmq-server regular
cluster nova-cloud-controller nova-cloud-controller peer
cloud-compute nova-cloud-controller nova-compute regular
amqp nova-cloud-controller rabbitmq-server regular
neutron-plugin nova-compute neutron-openvswitch subordinate
compute-peer nova-compute nova-compute peer
juju-info nova-compute ntp subordinate
amqp nova-compute rabbitmq-server regular
ntp-peers ntp ntp peer
cluster openstack-dashboard openstack-dashboard peer
cluster rabbitmq-server rabbitmq-server peer

This is with Juju 2.1 beta4:
$ juju --version
2.1-beta4-xenial-amd64

Attached are logs from both the container and the host. This is the list of logs included: https://pastebin.canonical.com/176494/

Revision history for this message
Larry Michel (lmic) wrote :
description: updated
Revision history for this message
Andrew McDermott (frobware) wrote :

We are investigating similar bugs; it would be helpful if you could attach the juju logs from the controller machine, and from the container and the host where the container came up on lxdbr0. For access to the container you'll probably have to go via `lxc exec juju-xxx-lxd-x bash` to get the logs off the machine. On top of that, could you please attach the MAAS logs from the MAAS server; all of /var/log/maas/*.log would be useful.
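One possible sequence for collecting those logs, as a sketch (the container name is taken from this report; /var/log/juju and /var/log/maas are the usual default paths, so treat them as assumptions):

# On the host running the affected container: stream its /var/log out.
$ lxc exec juju-fea521-2-lxd-2 -- tar czf - /var/log > container-var-log.tar.gz
# On the controller machine and the host: the Juju agent logs.
$ sudo tar czf juju-logs.tar.gz /var/log/juju
# On the MAAS server: the region logs requested above.
$ sudo tar czf maas-logs.tar.gz /var/log/maas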

Revision history for this message
Larry Michel (lmic) wrote :

@frobware juju logs are attached and included in logs2.tar.gz, which is attached to the bug. You'll find the contents of /var/log for the host and containers in the machine2/ and juju-fea521-2-lxd-2/ directories. Attaching /var/log/maas/*.log now.

Revision history for this message
Larry Michel (lmic) wrote :

Attaching logs from the controller. Note that there was an attempted deployment today where the servers did not deploy because of an unrelated issue (MAAS not able to power on the servers). You can ignore that deployment. The deployment for this bug is from 1/18.

Changed in juju:
status: New → Triaged
importance: Undecided → High
milestone: none → 2.1.0
Revision history for this message
Anastasia (anastasia-macmood) wrote :

@Larry,

Could you please re-try with Juju 2.1-rc2?

Changed in juju:
status: Triaged → Incomplete
milestone: 2.1-rc2 → none
Revision history for this message
Larry Michel (lmic) wrote :

@Anastasia,

I have only seen it once, and @alai saw it with beta5. I don't think this is easily recreatable, but we'll monitor for a recreate with 2.1.x.

Revision history for this message
Jason Hobbs (jason-hobbs) wrote :

I have recreated this with a 2.1.0.1 controller.

Changed in juju:
status: Incomplete → New
Changed in juju:
status: New → Triaged
tags: added: lxd maas maas-provider
Revision history for this message
Anastasia (anastasia-macmood) wrote :

We believe that this is related to bug #1670873 as well as bug #1656326.

Both have been addressed in later releases of Juju: 2.1.2 and 2.2-alpha1.

Could you please re-try with a later version? A daily snap would be great \o/
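For reference, a daily build can be installed from the snap's edge channel; a minimal sketch, assuming snapd is available on the machine:

$ sudo snap install juju --classic --channel=edge
$ juju --version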

Changed in juju:
status: Triaged → Incomplete
Revision history for this message
John A Meinel (jameinel) wrote :

So I think there are several things going on:

If you look at machine-2.log from 'logs2.tar.gz' you see these lines:
2017-01-18 21:03:36 WARNING juju.provisioner lxd-broker.go:62 failed to prepare container "2/lxd/2" network config: linking device interface "eth0" to subnet "10.245.0.0/18" failed: unexpected: ServerError: 504 GATEWAY TIMEOUT (Unexpected exception: TimeoutError. See /var/log/maas/regiond.log on the region server for more information.)
2017-01-18 21:03:36 WARNING juju.provisioner broker.go:97 incomplete DNS config found, discovering host's DNS config

I don't know why MAAS gave a 504 Gateway Timeout.
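The error message points at the region log, so something along these lines on the MAAS server might narrow it down (a sketch; the timestamp comes from the log excerpt above, and the path is the usual default):

$ grep -n 'TimeoutError' /var/log/maas/regiond.log
$ grep -n '2017-01-18 21:03' /var/log/maas/regiond.log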

Juju will attempt to retry these types of failures, but can run into bug #1670873 (fixed in 2.1.2, whenever that gets released). The specific behavior of falling back to lxdbr0-based addresses is bug #1656326, which should already be fixed in the Juju 2.1.1 release.
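One way to confirm that a container fell back to lxdbr0 rather than a MAAS lease is to compare its address with the bridge's subnet; a sketch using the names from this report (the lxc network subcommands assume LXD 2.3 or later):

$ lxc network get lxdbr0 ipv4.address   # e.g. 10.0.0.1/24, matching the 10.0.0.9 above
$ lxc list juju-fea521-2-lxd-2 -c n4    # the container's name and IPv4 address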

So I'm reasonably OK treating this as just a dupe of bug #1656326, with the caveat that whatever specific error is giving us the 504 when talking to MAAS is an unknown bug. (But that's probably not worth keeping *this* bug open for.)
