Local provider fails when an unrelated /home/ubuntu directory exists

Bug #1328958 reported by Max Brustkern
This bug affects 5 people
Affects              Status         Importance   Assigned to   Milestone
juju-core            Fix Released   Medium       Ian Booth
  1.21               Fix Released   Medium       Ian Booth
juju-core (Ubuntu)   Fix Released   High         Unassigned

Bug Description

I'm trying to init and bootstrap the juju local provider on utopic. I keep getting a chown: invalid user error:
max@eden:~$ rm -rf .juju
max@eden:~$ juju init
A boilerplate environment configuration file has been written to /home/max/.juju/environments.yaml.
Edit the file to configure your juju environment and run bootstrap.
max@eden:~$ juju switch local
amazon -> local
max@eden:~$ juju bootstrap
uploading tools for series [utopic precise trusty]
Logging to /home/max/.juju/local/cloud-init-output.log on remote host
chown: invalid user: ‘ubuntu:ubuntu’
Bootstrap failed, destroying environment
ERROR exit status 1
max@eden:~$

Revision history for this message
Robie Basak (racb) wrote :

I came across this on Trusty using ppa:juju/stable :

$ dpkg-query -W juju-core juju-local
juju-core 1.18.4-0ubuntu1~14.04.1~juju1
juju-local 1.18.4-0ubuntu1~14.04.1~juju1

So on Trusty, an ordinary user's desktop will not have an ubuntu user, and the local environment is completely broken on it.

It seems to me that for proper test coverage, the test environment also needs to be changed to not have an ubuntu user. I can understand why Juju must make assumptions about such a user (since cloud images ship with one), but this demonstrates why it is essential that the local environment is tested without one.

summary: - Local provider fails to bootstrap on utopic
+ Local provider assumes a local "ubuntu" user exists
Changed in juju-core (Ubuntu):
status: New → Triaged
importance: Undecided → High
Revision history for this message
Robie Basak (racb) wrote : Re: Local provider assumes a local "ubuntu" user exists

My steps to reproduce on a Trusty cloud image (off the top of my head - let me know if you have any problems):

1. sudo adduser --disabled-password --gecos '' otheruser
2. echo 'otheruser ALL=(ALL) NOPASSWD:ALL'|sudo tee --append /etc/sudoers
3. sudo add-apt-repository ppa:juju/stable
4. sudo apt-get install -y juju-core juju-local
5. Log in as otheruser (arrange authorized_keys first, etc).
6. sudo deluser ubuntu
7. juju generate-config
8. juju switch local
9. juju bootstrap --upload-tools --series trusty

uploading tools for series [trusty]
Logging to /home/otheruser/.juju/local/cloud-init-output.log on remote host
chown: invalid user: ‘ubuntu:ubuntu’
Bootstrap failed, destroying environment
ERROR exit status 1

Revision history for this message
Robie Basak (racb) wrote :

Filed bug 1332820 for test coverage.

Curtis Hovey (sinzui)
Changed in juju-core:
status: New → Incomplete
Revision history for this message
Curtis Hovey (sinzui) wrote :

I do not see this issue. None of our own machines have an ubuntu user and juju works. Juju requires the machines it provisions to have an ubuntu user.

Juju uses the login user as the identity for the local db and containers. The user account that is doing the deploy must export USER=<user>. This is the default behaviour in Ubuntu, but an unusual user or shell configuration can skip this step, and then juju doesn't know who owns the container.

Revision history for this message
Curtis Hovey (sinzui) wrote :

The Juju CI tests do two non-obvious things to test the local provider.
1. Never su to the testing account. Always log directly in.
2. Always export USER=jenkins so that juju knows which user owns the containers.

Revision history for this message
Robie Basak (racb) wrote :

I have reproduced the issue again on a fresh Trusty cloud image. It reproduces reliably just by removing the ubuntu user first. Exact reproduction steps below.

> Juju requires the machines it provisions to have an ubuntu user.

That's fine, but Juju must not assume that the local machine has an ubuntu user to start with when bootstrapping a local provider environment. Juju is expected to run on an Ubuntu desktop machine. Ubuntu desktop machines do not typically have an ubuntu user.

So, to test this common case in an easily reproducible environment, I start by removing the ubuntu user, to bring a cloud image environment closer to that of a desktop user environment.

I first came across this bug on my laptop, which has never been a cloud environment and has never had an ubuntu user. Apparently Juju has the "ubuntu" user hardcoded somewhere in a code path that runs locally, which is a false assumption.

Exact steps to reproduce on a Trusty cloud image:

ubuntu@foo:~$ cat /etc/cloud/build.info
build_name: server
serial: 20140607.1

sudo -i

apt-get update && sudo apt-get -y dist-upgrade
reboot

sudo -i
adduser --disabled-password --gecos foo foo
cp -a ~ubuntu/.ssh ~foo/
chown -R foo. ~foo/.ssh
echo 'foo ALL=(ALL) NOPASSWD:ALL' > /etc/sudoers.d/foo

logout

# Log back in as foo here

sudo deluser ubuntu

sudo add-apt-repository -y ppa:juju/stable
sudo apt-get update
sudo apt-get -y install juju-core juju-local

apt-cache policy juju-core juju-local
juju-core:
  Installed: 1.18.4-0ubuntu1~14.04.1~juju1
  Candidate: 1.18.4-0ubuntu1~14.04.1~juju1
  Version table:
 *** 1.18.4-0ubuntu1~14.04.1~juju1 0
        500 http://ppa.launchpad.net/juju/stable/ubuntu/ trusty/main amd64 Packages
        100 /var/lib/dpkg/status
     1.18.1-0ubuntu1 0
        500 http://archive.ubuntu.com/ubuntu/ trusty/universe amd64 Packages
juju-local:
  Installed: 1.18.4-0ubuntu1~14.04.1~juju1
  Candidate: 1.18.4-0ubuntu1~14.04.1~juju1
  Version table:
 *** 1.18.4-0ubuntu1~14.04.1~juju1 0
        500 http://ppa.launchpad.net/juju/stable/ubuntu/ trusty/main amd64 Packages
        100 /var/lib/dpkg/status
     1.18.1-0ubuntu1 0
        500 http://archive.ubuntu.com/ubuntu/ trusty/universe amd64 Packages

foo@foo:~$ getent passwd ubuntu
foo@foo:~$ getent passwd foo
foo:x:1001:1001:foo,,,:/home/foo:/bin/bash
foo@foo:~$ echo $USER
foo

juju generate-config
juju switch local
juju bootstrap --series trusty --upload-tools

Actual results:

uploading tools for series [trusty]
Logging to /home/foo/.juju/local/cloud-init-output.log on remote host
chown: invalid user: ‘ubuntu:ubuntu’
Bootstrap failed, destroying environment
ERROR exit status 1

Expected results: successful local environment bootstrap.

Changed in juju-core:
status: Incomplete → New
Curtis Hovey (sinzui)
Changed in juju-core:
status: New → Triaged
importance: Undecided → Medium
tags: added: local-provider
summary: - Local provider assumes a local "ubuntu" user exists
+ Local provider run on cloud images want ubuntu user
Revision history for this message
Max Brustkern (nuclearbob) wrote : Re: Local provider run on cloud images want ubuntu user

I still have the same problem on my workstation running utopic after a long series of upgrades, so I don't think it's exclusive to cloud images.

Curtis Hovey (sinzui)
summary: - Local provider run on cloud images want ubuntu user
+ Local provider run on server images requires an ubuntu user
Revision history for this message
Curtis Hovey (sinzui) wrote : Re: Local provider run on server images requires an ubuntu user

@robie
> cat /etc/cloud/build.info

^ this is a server image.
I can reproduce the error in the extraordinary condition of a server image without an ubuntu user.

juju local-provider just works with desktops and servers. Desktops don't have an ubuntu user; servers have a default ubuntu user.

My own local lxc bootstrap
ls -l /home/curtis/.juju/local/cloud-init-output.log
-rw-r--r-- 1 root root 4218 Jun 24 11:23 /home/curtis/.juju/local/cloud-init-output.log

Test of local lxc bootstrap on a trusty server as another user (which I expect ubuntu to exist on)
$ ls -l $JUJU_HOME/local-trusty/cloud-init-output.log
-rw-r--r-- 1 root root 160425 Jun 24 15:34 /var/lib/jenkins/cloud-city/local-trusty/cloud-init-output.log

When I do remove the ubuntu user from a server, local bootstrap fails trying to make ubuntu the owner of the cloud-init-output.log. This is wrong because even when there is an ubuntu user, the log is owned by root. Desktops are not affected, though maybe they can be if some server packages were added. I imagine the issue relates to what juju looked at to decide the localhost was a server.

The issue is present in all versions of juju 1.18.x and 1.19.x. So while this is a real problem, it happens under special conditions. This doesn't block the release of 1.20.0.

Revision history for this message
Robie Basak (racb) wrote :

I think I've found the problem:

environs/cloudinit/cloudinit.go:280

    fmt.Sprintf(
            `[ -e /home/ubuntu ] && (printf '%%s\n' %s > /home/ubuntu/.juju-proxy && chown ubuntu:ubuntu /home/ubuntu/.juju-proxy)`,
            shquote(cfg.ProxySettings.AsScriptEnvironment())))

It looks like Juju assumes that the ubuntu user and group exist if /home/ubuntu exists. I had a /home/ubuntu symlink on my laptop for debugging elsewhere, but no actual ubuntu user (that only existed in a schroot environment). Removing the symlink worked around the problem.

Max, please could you check your system for /home/ubuntu?

The correct way to do this is to also check for the ubuntu user and group, e.g. using "getent passwd ubuntu" and "getent group ubuntu".
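
A minimal shell sketch of that extra check, assuming it would guard the same generated command quoted above. The function names account_exists and write_proxy_file, and their arguments, are hypothetical and not taken from the Juju source; only getent, printf and chown are real commands.

```shell
#!/bin/sh
# Hypothetical helper (not from the Juju source): succeeds only when both
# the named user and the named group are known to the system.
account_exists() {
    getent passwd "$1" >/dev/null 2>&1 && getent group "$2" >/dev/null 2>&1
}

# Hypothetical wrapper around the generated command: write the proxy
# settings only when the home directory AND the account both exist.
# Arguments: home directory, user, group, settings text.
write_proxy_file() {
    [ -e "$1" ] || return 0                 # no home directory: nothing to do
    account_exists "$2" "$3" || return 0    # stray directory, no account: skip
    printf '%s\n' "$4" > "$1/.juju-proxy" &&
        chown "$2:$3" "$1/.juju-proxy"
}
```

With this guard, a call such as write_proxy_file /home/ubuntu ubuntu ubuntu "$settings" would be a no-op on a machine that has a leftover /home/ubuntu but no ubuntu account, instead of failing in chown.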

Revision history for this message
Robie Basak (racb) wrote :

A better way might be to copy the ownership of /home/ubuntu into /home/ubuntu/.juju-proxy.
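
A minimal sketch of that alternative, assuming GNU coreutils, whose chown accepts --reference. The function name write_proxy_file_like_home is hypothetical. The ownership is taken from the directory itself, so the command does not need to name an ubuntu user at all.

```shell
#!/bin/sh
# Hypothetical helper (not from the Juju source): write the settings file
# and give it the same owner and group as the directory that contains it.
# Arguments: home directory, settings text.
write_proxy_file_like_home() {
    [ -d "$1" ] || return 0                 # no such directory: nothing to do
    printf '%s\n' "$2" > "$1/.juju-proxy" &&
        chown --reference="$1" "$1/.juju-proxy"
}
```

For example, write_proxy_file_like_home /home/ubuntu "$settings" would leave .juju-proxy owned by whoever owns /home/ubuntu.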

But why was the code trying to touch my system's /home/ubuntu in the first place? My HOME is /home/robie, and my USER is robie. I think this bug goes deeper than this.

Revision history for this message
Max Brustkern (nuclearbob) wrote :

There we go! I have /home/ubuntu left over from an old setup simulating the conditions in the CI lab, but I have no ubuntu user. I'll delete /home/ubuntu and try again.

Revision history for this message
Robie Basak (racb) wrote :

Max: any joy?

My conclusion for now is:

When running a local environment, Juju uses the existence of /home/ubuntu to determine if the ubuntu user is special. Instead, it should pay attention to what user my original user was (before sudo-ing, etc).

I understand that when deploying to any other environment the ubuntu user is special; Juju doesn't seem to correctly differentiate between these two cases. I guess the same code path is followed in both cases, or something?
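
One way to find the original user could look like the sketch below. It assumes the privileged step is reached through sudo, which sets SUDO_USER to the invoking user's name. The function name original_user is hypothetical, and this is not a description of how Juju resolves the user.

```shell
#!/bin/sh
# Hypothetical helper (not from the Juju source): print the name of the
# user who started the command, preferring the pre-sudo identity.
original_user() {
    if [ -n "${SUDO_USER:-}" ]; then
        printf '%s\n' "$SUDO_USER"          # set by sudo for the invoking user
    elif [ -n "${USER:-}" ]; then
        printf '%s\n' "$USER"               # normal login environment
    else
        id -un                              # fall back to the effective user
    fi
}
```

The result could then be used in place of a hardcoded ubuntu, for example chown "$(original_user)" on files created under that user's home directory.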

Revision history for this message
Max Brustkern (nuclearbob) wrote : Re: [Bug 1328958] Re: Local provider run on server images requires an ubuntu user

The deployments I've been doing all need swift, so I had to get canonistack
working anyway. I'll retry this later this week.

Revision history for this message
Max Brustkern (nuclearbob) wrote :

So mine is working if I remove /home/ubuntu. Is there something else I
should check?

Revision history for this message
Robie Basak (racb) wrote : Re: Local provider run on server images requires an ubuntu user

No that's fine - thank you for confirming.

summary: - Local provider run on server images requires an ubuntu user
+ Local provider fails when an unrelated /home/ubuntu directory exists
Revision history for this message
Dustin Kirkland  (kirkland) wrote :

I just hit this bug, helping a friend set up a local provider on his Digital Ocean account.

Wow, this is pretty nasty. Can we please get this fixed?

Ian Booth (wallyworld)
Changed in juju-core:
milestone: none → 1.22
Ian Booth (wallyworld)
Changed in juju-core:
milestone: 1.22 → 1.21-beta4
assignee: nobody → Ian Booth (wallyworld)
status: Triaged → In Progress
Tim Penhey (thumper)
Changed in juju-core:
milestone: 1.21-beta4 → 1.22
Ian Booth (wallyworld)
Changed in juju-core:
status: In Progress → Fix Committed
Curtis Hovey (sinzui)
Changed in juju-core:
status: Fix Committed → Fix Released
Changed in juju-core (Ubuntu):
status: Triaged → Fix Released