lxc containers created, juju can't see or communicate with them

Bug #1357552 reported by Fernando Correa Neto
This bug affects 1 person
Affects      Status         Importance   Assigned to   Milestone
juju-core    Fix Released   High         Ian Booth
1.20         Fix Released   High         Ian Booth

Bug Description

Some time after deploying 10 units (juju deploy ubuntu -n 10) on the local provider, I could verify that juju was in the following state:

fcorrea@beast:~⟫ juju status
environment: local
machines:
  "0":
    agent-state: started
    agent-version: 1.20.4.1
    dns-name: localhost
    instance-id: localhost
    series: trusty
    state-server-member-status: has-vote
  "1":
    agent-state: started
    agent-version: 1.20.4.1
    dns-name: 10.0.3.148
    instance-id: fcorrea-local-machine-1
    series: trusty
    hardware: arch=amd64
  "2":
    agent-state: pending
    instance-id: fcorrea-local-machine-2
    series: trusty
    hardware: arch=amd64
  "3":
    agent-state: pending
    instance-id: fcorrea-local-machine-3
    series: trusty
    hardware: arch=amd64
  "4":
    agent-state: pending
    instance-id: fcorrea-local-machine-4
    series: trusty
    hardware: arch=amd64
  "5":
    agent-state: pending
    instance-id: fcorrea-local-machine-5
    series: trusty
    hardware: arch=amd64
  "6":
    agent-state: pending
    instance-id: fcorrea-local-machine-6
    series: trusty
    hardware: arch=amd64
  "7":
    agent-state: started
    agent-version: 1.20.4.1
    dns-name: 10.0.3.211
    instance-id: fcorrea-local-machine-7
    series: trusty
    hardware: arch=amd64
  "8":
    agent-state: started
    agent-version: 1.20.4.1
    dns-name: 10.0.3.224
    instance-id: fcorrea-local-machine-8
    series: trusty
    hardware: arch=amd64
  "9":
    agent-state: pending
    instance-id: fcorrea-local-machine-9
    series: trusty
    hardware: arch=amd64
  "10":
    agent-state: pending
    instance-id: fcorrea-local-machine-10
    series: trusty
    hardware: arch=amd64
services:
  ubuntu:
    charm: cs:trusty/ubuntu-0
    exposed: false
    units:
      ubuntu/0:
        agent-state: started
        agent-version: 1.20.4.1
        machine: "1"
        public-address: 10.0.3.148
      ubuntu/1:
        agent-state: pending
        machine: "2"
      ubuntu/2:
        agent-state: pending
        machine: "3"
      ubuntu/3:
        agent-state: pending
        machine: "4"
      ubuntu/4:
        agent-state: pending
        machine: "5"
      ubuntu/5:
        agent-state: pending
        machine: "6"
      ubuntu/6:
        agent-state: started
        agent-version: 1.20.4.1
        machine: "7"
        public-address: 10.0.3.211
      ubuntu/7:
        agent-state: started
        agent-version: 1.20.4.1
        machine: "8"
        public-address: 10.0.3.224
      ubuntu/8:
        agent-state: pending
        machine: "9"
      ubuntu/9:
        agent-state: pending
        machine: "10"

I waited a bit longer but the state didn't change.
After checking lxc-ls -f, I could verify that the containers were indeed created and had IPs assigned:

fcorrea@beast:~⟫ sudo lxc-ls -f | grep fcorrea-local
fcorrea-local-machine-1 RUNNING 10.0.3.148 - YES
fcorrea-local-machine-10 RUNNING 10.0.3.117 - YES
fcorrea-local-machine-2 RUNNING 10.0.3.126 - YES
fcorrea-local-machine-3 RUNNING 10.0.3.116 - YES
fcorrea-local-machine-4 RUNNING 10.0.3.198 - YES
fcorrea-local-machine-5 RUNNING 10.0.3.3 - YES
fcorrea-local-machine-6 RUNNING 10.0.3.150 - YES
fcorrea-local-machine-7 RUNNING 10.0.3.211 - YES
fcorrea-local-machine-8 RUNNING 10.0.3.224 - YES
fcorrea-local-machine-9 RUNNING 10.0.3.70 - YES
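
The mismatch between the two listings above (containers RUNNING in lxc-ls, machines pending in juju status) can be cross-checked mechanically. The following is only a minimal sketch: the function names are made up for illustration, the status is assumed to be already parsed into a dict with the same keys as the YAML output above, and the lxc-ls rows are assumed to have the shape "NAME STATE IPV4 IPV6 AUTOSTART" as shown.

```python
def running_containers(lxc_ls_output):
    """Map container name -> IPv4 address for rows whose state is RUNNING."""
    found = {}
    for line in lxc_ls_output.splitlines():
        fields = line.split()
        # Assumed row shape: NAME STATE IPV4 IPV6 AUTOSTART
        if len(fields) >= 3 and fields[1] == "RUNNING":
            found[fields[0]] = fields[2]
    return found


def stuck_machines(status, containers):
    """Return (machine id, instance id, ip) for every machine that juju
    reports as pending although its container is running."""
    stuck = []
    for machine_id, info in status.get("machines", {}).items():
        instance = info.get("instance-id")
        if info.get("agent-state") == "pending" and instance in containers:
            stuck.append((machine_id, instance, containers[instance]))
    return sorted(stuck, key=lambda row: int(row[0]))


# Two machines taken from the report above: one started, one pending.
sample_status = {
    "machines": {
        "1": {"agent-state": "started",
              "instance-id": "fcorrea-local-machine-1"},
        "6": {"agent-state": "pending",
              "instance-id": "fcorrea-local-machine-6"},
    }
}
sample_lxc = (
    "fcorrea-local-machine-1 RUNNING 10.0.3.148 - YES\n"
    "fcorrea-local-machine-6 RUNNING 10.0.3.150 - YES\n"
)
print(stuck_machines(sample_status, running_containers(sample_lxc)))
```

Each row returned is a machine that is reachable by IP but that juju still considers pending, which is where the cloud-init log would be worth collecting.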

Trying to juju ssh into one of the machines that didn't come up does not work:

fcorrea@beast:~⟫ juju ssh 6
ERROR machine "6" has no public address

However, it's possible to access it by its IP:

fcorrea@beast:~⟫ ssh ubuntu@10.0.3.150
Welcome to Ubuntu 14.04.1 LTS (GNU/Linux 3.16.0-031600rc6-lowlatency x86_64)

 * Documentation: https://help.ubuntu.com/

  System information as of Fri Aug 15 17:23:34 BRT 2014

  System load: 0.7 Processes: 24
  Usage of /: 40.8% of 102.58GB Users logged in: 0
  Memory usage: 22% IP address for eth0: 10.0.3.150
  Swap usage: 0%

  Graph this data and manage this system at:
    https://landscape.canonical.com/

  Get cloud support with Ubuntu Advantage Cloud Guest:
    http://www.ubuntu.com/business/services/cloud

0 packages can be updated.
0 updates are security updates.

Last login: Fri Aug 15 17:23:34 2014 from 10.0.3.1
ubuntu@fcorrea-local-machine-6:~$

Tags: landscape
Fernando Correa Neto (fcorrea) wrote :
Fernando Correa Neto (fcorrea) wrote :
description: updated
Ian Booth (wallyworld) wrote :

Could you please also ssh into one of the pending containers and grab the cloud-init log, as well as the machine-X.log directly?

Fernando Correa Neto (fcorrea) wrote :
Fernando Correa Neto (fcorrea) wrote :
Fernando Correa Neto (fcorrea) wrote :

In this case, I grabbed the cloud-init logs from machine 6, but there's no machine-6.log. The only log I could see is unit-ubuntu-6.log, which I'm attaching.

Fernando Correa Neto (fcorrea) wrote :
Ian Booth (wallyworld) wrote :

Thanks for attaching the log files. The cloud init log contained some interesting information:

+ curl -sSfw tools from %{url_effective} downloaded: HTTP %{http_code}; time %{time_total}s; size %{size_download} bytes; speed %{speed_download} bytes/s --retry 10 -o /var/lib/juju/tools/1.20.4.1-trusty-amd64/tools.tar.gz http://10.0.3.1:8040/tools/releases/juju-1.20.4.1-trusty-amd64.tgz
curl: (7) Couldn't connect to server
tools from http://10.0.3.1:8040/tools

So, some of the lxc containers failed to connect to the HTTP service started on the host in order to get the tools. The reason for this connection failure, which does not affect all of the containers, needs to be determined.
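
One way to narrow this down from inside an affected container is a plain TCP probe against the host's tools server, retried a few times in the same spirit as curl's --retry. This is only a sketch under stated assumptions: the function names are hypothetical, and the only value taken from the log above is the address 10.0.3.1:8040. It checks that the port accepts connections, not that the tools download itself succeeds.

```python
import socket
import time


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, unreachable hosts and timeouts.
        return False


def wait_for_port(host, port, attempts=10, delay=1.0, timeout=3.0):
    """Retry can_connect up to `attempts` times.

    Returns the number of the attempt that succeeded, or 0 if none did.
    """
    for attempt in range(1, attempts + 1):
        if can_connect(host, port, timeout):
            return attempt
        if attempt < attempts:
            time.sleep(delay)
    return 0


# Example, run from inside a pending container (address from the log above):
#   wait_for_port("10.0.3.1", 8040)
```

A result of 0 on a pending container, compared with 1 on a started one, would point at networking between that container and the host rather than at the tools server itself.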

Changed in juju-core:
status: New → Triaged
importance: Undecided → High
Ian Booth (wallyworld)
Changed in juju-core:
milestone: none → 1.21-alpha1
Ian Booth (wallyworld) wrote :

I have tried to reproduce this but thus far, after several attempts, have not been successful.
How often does it occur for you?

Fernando Correa Neto (fcorrea) wrote : Re: [Bug 1357552] Re: lxc containers created, juju can't see or communicate with them

I just tried a few rounds (5 deployments) and I could not see it
happening. Although this is the same juju-core version, it's a
different machine.
I'll try again later on the machine where it happened.

On Sun, Aug 17, 2014 at 10:33 PM, Ian Booth <email address hidden> wrote:
> I have tried to reproduce this but thus far, after several attempts, have not been successful.
> How often does it occur for you?

Ian Booth (wallyworld)
Changed in juju-core:
assignee: nobody → Ian Booth (wallyworld)
status: Triaged → Fix Committed
Curtis Hovey (sinzui)
Changed in juju-core:
status: Fix Committed → Fix Released