lxc_hosts "create base container" with proxy settings are failing

Bug #1598324 reported by eil397
This bug affects 1 person
Affects            Status        Importance  Assigned to    Milestone
OpenStack-Ansible  Fix Released  Low         Travis Truman

Bug Description

Hi

I was trying to use the AIO configuration with proxy settings on my VM, following:
http://docs.openstack.org/developer/openstack-ansible/install-guide/app-no-internet-connectivity.html#practice-b-proxy-access-to-internet-resources

Thank you for this page.

I have one issue:
- As part of the container_create task [1], gpg has to be executed to download and verify the key.
 The lxc_container module [2] launches lxc-create (package lxc), and lxc-create launches lxc-download [3] (a script in package lxc-templates).

[1] https://github.com/openstack/openstack-ansible-lxc_container_create/blob/master/tasks/container_create.yml
[2] https://github.com/ansible/ansible-modules-extras/blob/devel/cloud/lxc/lxc_container.py
[3] https://github.com/lxc/lxc/blob/master/templates/lxc-download.in

My setup:
 Ubuntu image from Canonical: Ubuntu 14.04.3 LTS (also tested with 14.04.2)
 polipo launched locally with socks5
 root@ephimeralnodejune240103:/opt/openstack-ansible/playbooks# git branch -v
 * master 2e7c5ad Merge "Fix HAProxy config and install version when ssl is disabled"
 bootstrap-ansible.sh && bootstrap-aio.sh
 added to /etc/openstack_deploy/user_variables.yml:
  global_environment_variables:
    HTTP_PROXY: "http://127.0.0.1:8123"
    HTTPS_PROXY: "http://127.0.0.1:8123"
    NO_PROXY: "localhost,127.0.0.1"
    http_proxy: "http://127.0.0.1:8123"
    no_proxy: "localhost,127.0.0.1"
 then executed run-playbooks.sh.

Error output:
........
........
........
TASK [lxc_hosts : Create base container] ***************************************
task path: /etc/ansible/roles/lxc_hosts/tasks/lxc_cache.yml:16
<172.29.236.100> ESTABLISH SSH CONNECTION FOR USER: root
<172.29.236.100> SSH: EXEC ssh -C -vvv -o ControlMaster=auto -o ControlPersist=60s -o StrictHostKeyChecking=no -o KbdInteractiveAuthentication=no -o PreferredAuthentications=gssapi-with-mic,gssapi-keyex,hostbased,publickey -o PasswordAuthentication=no -o User=root -o ConnectTimeout=120 -o ControlPath=/root/.ansible/cp/ansible-ssh-%h-%p-%r 172.29.236.100 '/bin/sh -c '"'"'LANG=en_US.UTF-8 LC_ALL=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8 /usr/bin/python && sleep 0'"'"''
FAILED - RETRYING: TASK: lxc_hosts : Create base container (2 retries left).Result was: {"attempts": 1, "command": "/usr/bin/lxc-create --name LXC_NAME --quiet --template download --bdev dir -- --dist ubuntu --release trusty --arch amd64 --force-cache --server images.linuxcontainers.org\n", "err": "", "failed": true, "invocation": {"module_args": {"archive": false, "archive_compression": "gzip", "archive_path": null, "backing_store": "dir", "clone_name": null, "clone_snapshot": false, "config": null, "container_command": null, "container_config": null, "container_log": false, "container_log_level": "INFO", "directory": null, "fs_size": "5G", "fs_type": "ext4", "lv_name": "LXC_NAME", "lxc_path": null, "name": "LXC_NAME", "state": "stopped", "template": "download", "template_options": "--dist ubuntu --release trusty --arch amd64 --force-cache --server images.linuxcontainers.org\n", "thinpool": null, "vg_name": "lxc", "zfs_root": null}, "module_name": "lxc_container"}, "msg": "Failed executing lxc-create.", "rc": 1, "retries": 3}
<172.29.236.100> ESTABLISH SSH CONNECTION FOR USER: root
<172.29.236.100> SSH: EXEC ssh -C -vvv -o ControlMaster=auto -o ControlPersist=60s -o StrictHostKeyChecking=no -o KbdInteractiveAuthentication=no -o PreferredAuthentications=gssapi-with-mic,gssapi-keyex,hostbased,publickey -o PasswordAuthentication=no -o User=root -o ConnectTimeout=120 -o ControlPath=/root/.ansible/cp/ansible-ssh-%h-%p-%r 172.29.236.100 '/bin/sh -c '"'"'LANG=en_US.UTF-8 LC_ALL=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8 /usr/bin/python && sleep 0'"'"''
FAILED - RETRYING: TASK: lxc_hosts : Create base container (1 retries left).Result was: {"attempts": 2, "command": "/usr/bin/lxc-create --name LXC_NAME --quiet --template download --bdev dir -- --dist ubuntu --release trusty --arch amd64 --force-cache --server images.linuxcontainers.org\n", "err": "", "failed": true, "invocation": {"module_args": {"archive": false, "archive_compression": "gzip", "archive_path": null, "backing_store": "dir", "clone_name": null, "clone_snapshot": false, "config": null, "container_command": null, "container_config": null, "container_log": false, "container_log_level": "INFO", "directory": null, "fs_size": "5G", "fs_type": "ext4", "lv_name": "LXC_NAME", "lxc_path": null, "name": "LXC_NAME", "state": "stopped", "template": "download", "template_options": "--dist ubuntu --release trusty --arch amd64 --force-cache --server images.linuxcontainers.org\n", "thinpool": null, "vg_name": "lxc", "zfs_root": null}, "module_name": "lxc_container"}, "msg": "Failed executing lxc-create.", "rc": 1, "retries": 3}
<172.29.236.100> ESTABLISH SSH CONNECTION FOR USER: root
<172.29.236.100> SSH: EXEC ssh -C -vvv -o ControlMaster=auto -o ControlPersist=60s -o StrictHostKeyChecking=no -o KbdInteractiveAuthentication=no -o PreferredAuthentications=gssapi-with-mic,gssapi-keyex,hostbased,publickey -o PasswordAuthentication=no -o User=root -o ConnectTimeout=120 -o ControlPath=/root/.ansible/cp/ansible-ssh-%h-%p-%r 172.29.236.100 '/bin/sh -c '"'"'LANG=en_US.UTF-8 LC_ALL=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8 /usr/bin/python && sleep 0'"'"''
fatal: [aio1]: FAILED! => {"changed": false, "command": "/usr/bin/lxc-create --name LXC_NAME --quiet --template download --bdev dir -- --dist ubuntu --release trusty --arch amd64 --force-cache --server images.linuxcontainers.org\n", "err": "", "failed": true, "invocation": {"module_args": {"archive": false, "archive_compression": "gzip", "archive_path": null, "backing_store": "dir", "clone_name": null, "clone_snapshot": false, "config": null, "container_command": null, "container_config": null, "container_log": false, "container_log_level": "INFO", "directory": null, "fs_size": "5G", "fs_type": "ext4", "lv_name": "LXC_NAME", "lxc_path": null, "name": "LXC_NAME", "state": "stopped", "template": "download", "template_options": "--dist ubuntu --release trusty --arch amd64 --force-cache --server images.linuxcontainers.org\n", "thinpool": null, "vg_name": "lxc", "zfs_root": null}, "module_name": "lxc_container"}, "msg": "Failed executing lxc-create.", "rc": 1}

command: /usr/bin/lxc-create --name LXC_NAME --quiet --template download --bdev dir -- --dist ubuntu --release trusty --arch amd64 --force-cache --server images.linuxcontainers.org

msg: Failed executing lxc-create.

command: /usr/bin/lxc-create --name LXC_NAME --quiet --template download --bdev dir -- --dist ubuntu --release trusty --arch amd64 --force-cache --server images.linuxcontainers.org

msg: Failed executing lxc-create.
 to retry, use: --limit @lxc-hosts-setup.retry

PLAY RECAP *********************************************************************
aio1 : ok=37 changed=5 unreachable=0 failed=1

TASK: lxc_hosts : Create base container ------------------------------ 1100.75s
TASK: pip_install : Install PIP ----------------------------------------- 2.42s
TASK: lxc_hosts : Create base directories ------------------------------- 1.57s
TASK: lxc_hosts : Drop base config file(s) ------------------------------ 1.48s
TASK: lxc_hosts : Install pip packages ---------------------------------- 1.14s
TASK: lxc_hosts : Install apt packages ---------------------------------- 0.83s
TASK: pip_install : Create pip config directory ------------------------- 0.77s
TASK: pip_install : Get Modern PIP -------------------------------------- 0.51s
TASK: pip_install : Drop pip file(s) ------------------------------------ 0.46s
TASK: lxc_hosts : Remove dnsmasq file(s) -------------------------------- 0.45s

Hit maximum number of retries, giving up...

+ cat /openstack/log/instance-info/host_info_1467417041.log

----------------------------------------------------------------------
I've checked that my variables are present in /etc/environment, but they are missing from the environment of the running processes.
pstree -p
....
        ├─sshd(3673)───sh(30309)───python(30310)───python(30311)───lxc-create(30312)───lxc-download(30314)───gpg(30325)───gpgkeys_hkp(30326)
....
strings /proc/30325/environ | grep -i http
The variables are not set in any of the parent processes either, starting from sshd.

If I execute "ssh localhost", I can see these variables in my shell environment.
----------------------------------------------------------------------
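The check above can also be scripted. The following is a minimal sketch, not part of openstack-ansible: the helper names `proxy_vars_from_environ` and `proxy_vars_of_pid` are made up for illustration. It relies only on the fact that /proc/<pid>/environ holds NUL-separated KEY=VALUE entries.

```python
# Hypothetical helpers for illustration; not part of any OpenStack-Ansible role.
PROXY_NAMES = {"http_proxy", "https_proxy", "no_proxy"}


def proxy_vars_from_environ(raw: bytes) -> dict:
    """Return the proxy variables found in a /proc/<pid>/environ dump.

    Entries in that file are KEY=VALUE pairs separated by NUL bytes.
    Names are matched case-insensitively (HTTP_PROXY and http_proxy both count).
    """
    found = {}
    for entry in raw.split(b"\0"):
        key, sep, value = entry.partition(b"=")
        name = key.decode(errors="replace")
        if sep and name.lower() in PROXY_NAMES:
            found[name] = value.decode(errors="replace")
    return found


def proxy_vars_of_pid(pid: int) -> dict:
    """Read the environment of a running process (requires permission to do so)."""
    with open(f"/proc/{pid}/environ", "rb") as handle:
        return proxy_vars_from_environ(handle.read())
```

For the gpg process shown in the pstree output, `proxy_vars_of_pid(30325)` would return an empty dict if none of the proxy variables reached it.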
Temporary workaround that worked for me:
 add the proxy variables to .bashrc; the next attempt then succeeds.
 I would like to add these variables to /etc/profile in https://github.com/openstack/openstack-ansible-lxc_container_create/blob/master/tasks/container_create.yml#L183

There was an earlier bug about proxies (https://bugs.launchpad.net/openstack-ansible/+bug/1493570), but I think it is outdated. This one is related to gpg and to the current implementation of exporting variables through the /etc/environment file.

A few things that were asked in the IRC chat:
   - UsePAM is set to yes in sshd_config.
   - pam_env.conf does not reference /etc/environment (it has no non-comment lines): grep -c -v "^#.*$" /etc/security/pam_env.conf = 0

Maybe this is somehow specific to 14.04.3 LTS. I will repeat the test on 14.04.4.

eil397 (anton-haldin)
description: updated
Jean-Philippe Evrard (jean-philippe-evrard) wrote :

Thank you for your bug submission. We'll wait for your next input.

eil397 (anton-haldin) wrote :

I've checked the issue with the latest Ubuntu 14.04 LTS image:
$ md5 Downloads/trusty-server-cloudimg-amd64-disk1.img
MD5 (Downloads/trusty-server-cloudimg-amd64-disk1.img) = e4bb4500b34beb4419c74c7d3abc771a

root@ephimeralnodejul502-anton:/opt/openstack-ansible/playbooks# cat /etc/lsb-release
DISTRIB_ID=Ubuntu
DISTRIB_RELEASE=14.04
DISTRIB_CODENAME=trusty
DISTRIB_DESCRIPTION="Ubuntu 14.04.4 LTS"

eil397 (anton-haldin) wrote :

I need to check the issue with lxc_cache_validate_certs: "no". https://review.openstack.org/#/c/284931/
Since the process gets stuck on the gpg step, this variable should affect it.

eil397 (anton-haldin) wrote :

It looks like the issue is caused by sudo.

For example, when I test with openstack-ansible-os_nova and launch ./run-test.sh

the process hangs on the task:
TASK: [lxc_hosts | Create base container]

$ pstree -p | grep ansible returns the following line:

       ├─sshd(1216)───sshd(1270)───bash(1271)───sudo(1289)───su(1290)───bash(1291)───bash(1381)───tox(1875)───ansible-playboo(2092)───sh(6684)───sh(6685+

$ pstree -p 2092 returns:
ansible-playboo(2092)───sh(6684)───sh(6685)───sudo(6687)───sh(6688)───python(6689)───lxc-create(6690)───lxc-download(6692)───gpg(6709)───gpgkeys_hkp(6710+

Next attempt:
$pstree -p 2092
ansible-playboo(2092)───sh(6721)───sh(6722)───sudo(6724)───sh(6725)───python(6726)───lxc-create(6727)───lxc-download(6729)───gpg(6773)───gpgkeys_hkp(6774+

$ strings /proc/6725/environ | grep http
$ strings /proc/6724/environ | grep http
HTTPS_PROXY=http://xx.xx.x.xx:8123
https_proxy=http://xx.xx.xx.xx:8123
http_proxy=http://xx.xx.xx.xx:8123
HTTP_PROXY=http://xx.xx.xx.xx:8123

$ ps ax | grep 6724
 6724 pts/0 S+ 0:00 sudo -H -S -p [sudo via ansible, key=wjafxzugogwfwinqmrkobraxivvcftbi] password: -u root /bin/sh -c echo BECOME-SUCCESS-wjafxzugogwfwinqmrkobraxivvcftbi; LC_CTYPE=en_US.UTF-8 LANG=en_US.UTF-8 /usr/bin/python /root/.ansible/tmp/ansible-tmp-1468018262.67-201069123544133/lxc_container; rm -rf /root/.ansible/tmp/ansible-tmp-1468018262.67-201069123544133/ >/dev/null 2>&1

$ ps ax |grep 6726
 6726 pts/0 S+ 0:00 /usr/bin/python /root/.ansible/tmp/ansible-tmp-1468018262.67-201069123544133/lxc_container
 6795 pts/1 S+ 0:00 grep --color=auto 6726
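The manual walk through the process tree above (comparing the environment of sudo with that of its child sh) can be sketched in code. This is an illustrative helper only; the names `parse_ppid`, `has_proxy` and `trace_proxy` are hypothetical. It assumes a Linux /proc layout where /proc/<pid>/status contains a "PPid:" line and /proc/<pid>/environ contains NUL-separated entries.

```python
# Hypothetical helpers for illustration; they only read from /proc.

def parse_ppid(status_text: str) -> int:
    """Extract the parent PID from the text of /proc/<pid>/status."""
    for line in status_text.splitlines():
        if line.startswith("PPid:"):
            return int(line.split()[1])
    raise ValueError("no PPid line found")


def has_proxy(environ_raw: bytes) -> bool:
    """True if any http_proxy/https_proxy entry (any case) is present."""
    names = (b"http_proxy=", b"https_proxy=")
    return any(entry.lower().startswith(names)
               for entry in environ_raw.split(b"\0"))


def read_status(pid: int) -> str:
    with open(f"/proc/{pid}/status") as handle:
        return handle.read()


def read_environ(pid: int) -> bytes:
    with open(f"/proc/{pid}/environ", "rb") as handle:
        return handle.read()


def trace_proxy(pid, read_status=read_status, read_environ=read_environ):
    """Return [(pid, has_proxy), ...] from pid up through its ancestors.

    The walk stops once a parent PID of 0 is reached. The reader
    functions are parameters so the logic can be tried on canned data.
    """
    chain = []
    while pid > 0:
        chain.append((pid, has_proxy(read_environ(pid))))
        pid = parse_ppid(read_status(pid))
    return chain
```

Called as `trace_proxy(6726)` by a user allowed to read those /proc entries, the first ancestor reported with `True` marks the point below which the proxy variables were dropped.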

This is well-known and recommended behavior for sudo:
https://help.ubuntu.com/community/EnvironmentVariables

As a local workaround for dev and testing environments, I think the sudo configuration can (and probably should) be changed with a setting like this:
 Defaults env_keep += "http_proxy no_proxy https_proxy NO_PROXY HTTPS_PROXY HTTP_PROXY"

eil397 (anton-haldin)
summary: - OSA in aio configuration with proxy settings failed
+ lxc_hosts "create base container" with proxy settings are failing
eil397 (anton-haldin) wrote :

I've tested the theory that execution through sudo strips the environment variables (on master).
After a few tests, it does not look like the cause of the issue.

At this moment, the most likely explanation is that after the global environment variables were set, the shell environment on the host was not refreshed, so the new variables had not been loaded yet.

In that case, this looks like something that can be improved in the docs. Something like: you should set the proxy variables in /etc/environment before launching scripts/run-playbooks.sh (or gate-check-commit.sh, or tox with tox.ini).
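A pre-flight check along these lines could catch the stale-environment case before the playbooks run. This is a sketch under assumptions, not something shipped with openstack-ansible: the function names are hypothetical, and the parser handles only simple KEY=VALUE lines (optionally quoted or prefixed with "export"), which is a simplification of what pam_env accepts.

```python
# Hypothetical pre-flight check; names and parsing rules are assumptions.
PROXY_NAMES = ("http_proxy", "https_proxy", "no_proxy",
               "HTTP_PROXY", "HTTPS_PROXY", "NO_PROXY")


def parse_etc_environment(text: str) -> dict:
    """Parse simple KEY=VALUE lines as found in /etc/environment."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blanks and comments
        if line.startswith("export "):
            line = line[len("export "):]
        key, sep, value = line.partition("=")
        if sep:
            # Drop surrounding single or double quotes from the value.
            values[key.strip()] = value.strip().strip("\"'")
    return values


def stale_proxy_vars(file_text: str, env: dict) -> list:
    """Return proxy variables configured in the file but absent from,
    or different in, the given process environment."""
    configured = parse_etc_environment(file_text)
    return sorted(name for name in PROXY_NAMES
                  if name in configured and env.get(name) != configured[name])
```

Running `stale_proxy_vars(open("/etc/environment").read(), os.environ)` (after `import os`) in the shell session that will launch the playbooks returns an empty list when that session has already picked up the proxy settings; a non-empty list suggests logging in again first.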

Alexandra Settle (alexandra-settle) wrote :

Agreed on docimpact. Will take

Changed in openstack-ansible:
status: New → Triaged
importance: Undecided → Low
assignee: nobody → Alexandra Settle (alexandra-settle)
Changed in openstack-ansible:
assignee: Alexandra Settle (alexandra-settle) → Travis Truman (travis-truman)
status: Triaged → In Progress
OpenStack Infra (hudson-openstack) wrote : Fix merged to openstack-ansible (master)

Reviewed: https://review.openstack.org/345357
Committed: https://git.openstack.org/cgit/openstack/openstack-ansible/commit/?id=5c4d8b234facfc4e7da740c15e7f04171a1e8249
Submitter: Jenkins
Branch: master

commit 5c4d8b234facfc4e7da740c15e7f04171a1e8249
Author: Alexandra <email address hidden>
Date: Thu Jul 21 12:34:04 2016 +0100

    [DOCS] Update 'Practice B' with note

    Note is recommended warning about failing
    proxy settings

    Change-Id: I4d6d5df196c00b439a4d9e19aa098bb6246d03d1
    Closes-Bug: #1598324

Changed in openstack-ansible:
status: In Progress → Fix Released
eil397 (anton-haldin) wrote :

Thank you, Alexandra.
I think it would be very helpful to highlight in the documentation that proxy settings should be configured and verified before launching the playbooks.

There is already the sentence "Configure target and deployment hosts to reach public internet resources via HTTP or SOCKS proxy server(s)." in section "Practice B" [1].

But since proxy settings can be very specific, the deployer should spend some time checking them up front. This can simplify troubleshooting and reduce the time needed to localize a potential cause of failure.

I've found that in Mitaka it worked without additional preparation of the deployment/target host (the environment.j2 template should be what sets these variables; role: openstack-ansible-openstack_hosts, playbook: openstack-hosts-setup.yml).
That is why I assumed it should also work on master. But it looks like there is some difference, somehow related to the changes around the lxc cache logic. The openstack-ansible-lxc_container_create role also has an environment.j2.

I will create a more specific bug if I find additional details about this case.

[1] - http://docs.openstack.org/developer/openstack-ansible/install-guide/app-no-internet-connectivity.html#practice-b-proxy-access-to-internet-resources

Doug Hellmann (doug-hellmann) wrote : Fix included in openstack/openstack-ansible 14.0.0.0b2

This issue was fixed in the openstack/openstack-ansible 14.0.0.0b2 development milestone.
