containers have stale keys and need the ability to update on deployment

Bug #1464771 reported by Kevin Carter
This bug affects 2 people
Affects            Status        Importance  Assigned to   Milestone
OpenStack-Ansible  Fix Released  Medium      Kevin Carter
  Kilo             Fix Released  Medium      Kevin Carter
  Trunk            Fix Released  Medium      Kevin Carter

Bug Description

At deployment time the LXC containers need the ability to update the apt cache and fetch new keys. The lxc-container-create role should use the raw module for some of these basic operations, because we cannot assume that Ansible can communicate with the container through its normal means while Python 2.7 is not yet installed.
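A minimal sketch of the idea, not the change that was actually merged: the raw module runs a command over the connection without needing Python on the target, so it can install Python before any regular module is used. The host group name `all_containers` and the task names are assumptions made for illustration.

```yaml
# Hypothetical play; group and task names are illustrative only.
- name: Bootstrap Python inside freshly built containers
  hosts: all_containers
  gather_facts: false   # fact gathering itself needs Python on the target
  tasks:
    - name: Install Python 2.7 without relying on Python being present
      raw: apt-get install -y python2.7

    - name: Confirm that regular modules can now run
      ping:
```

Once the `ping` task succeeds, later tasks in the play can use ordinary modules such as `apt`.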

Tags: in-kilo

OpenStack Infra (hudson-openstack) wrote : Fix proposed to os-ansible-deployment (master)

Fix proposed to branch: master
Review: https://review.openstack.org/191215

Changed in openstack-ansible:
status: Confirmed → In Progress

Kevin Carter (kevin-carter) wrote :

One of the things found here was that in the HP region of the OpenStack CI system the images have stale apt keys, which requires running apt-key update as well as apt-get update. Without running these tasks manually, package installs within the affected containers fail intermittently.

Kevin Carter (kevin-carter) wrote :

After much mucking about with all of this I found that old keys, a stale apt cache, and speed were all the enemy here. The container build process runs fast enough that networking is not guaranteed to be online within a container before the apt-get update/install commands run. While it "should" work, there are cases where a slow host cannot get a container and its network up fast enough for the online commands that follow. This causes random failures, and in some cases apt simply hangs for a very long time before eventually giving up.
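One way to tolerate a container whose network comes up late is to retry the online command instead of running it once. The sketch below is an assumption about how that could look, not the merged fix; the variable name `apt_update_result` and the retry counts are made up, and it assumes a failed update is reported through a non-zero exit status, which may not hold for every apt version.

```yaml
# Hypothetical task; variable name and retry values are illustrative only.
- name: Update the apt cache, retrying while the network comes up
  raw: apt-get update
  register: apt_update_result
  until: apt_update_result.rc == 0   # stop retrying on the first success
  retries: 5                         # give up after five attempts
  delay: 10                          # wait ten seconds between attempts
```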

So to resolve this issue I broke up the container build task, which contained a blob command to "do all the things", into smaller chunks. The change makes the role more idiomatic Ansible: it's now easier to debug and read, and it gives deployers tags on each task to isolate retry efforts.
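As a rough illustration of that split (the real task files may differ), each step becomes its own task with its own tag, so a failure points at one step and that step can be rerun alone. The task names and tag names below are hypothetical.

```yaml
# Hypothetical task file; task and tag names are illustrative only.
- name: Update apt keys
  raw: apt-key update
  tags:
    - container-apt-keys

- name: Update apt sources
  raw: apt-get update
  tags:
    - container-apt-update

- name: Upgrade container packages
  raw: apt-get -y upgrade
  tags:
    - container-upgrade
```

A deployer could then rerun a single step by passing that tag to `ansible-playbook` with `--tags`, for example `--tags container-apt-update`.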

OpenStack Infra (hudson-openstack) wrote : Fix proposed to os-ansible-deployment (kilo)

Fix proposed to branch: kilo
Review: https://review.openstack.org/191517

Kevin Carter (kevin-carter) wrote :

Upon further review I've found that the apt module does not reliably update the apt cache when update_cache is specified together with an apt package list. This issue is directly related to a bug found in the Ansible core modules here: [ https://github.com/ansible/ansible-modules-core/issues/1497 ]. To resolve this we need to move our cache update calls into their own tasks, which will ensure that the apt cache is always updated as expected.
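The workaround can be sketched as two tasks: one that only refreshes the cache, followed by one that only installs packages. The variable name `example_apt_packages` and the task names are assumptions for illustration, not names taken from the affected roles.

```yaml
# Hypothetical tasks; the package list variable is illustrative only.
- name: Update apt cache
  apt:
    update_cache: yes          # no package list in this task

- name: Install apt packages
  apt:
    name: "{{ item }}"
    state: present
  with_items: "{{ example_apt_packages }}"
```

Keeping the cache refresh in a task without a package list sidesteps the upstream behaviour described above, at the cost of one extra task per install file.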

The following locations in the roles/playbooks will need to be updated to fix this:

playbooks/utility-install.yml:29: update_cache: yes
playbooks/roles/openstack_hosts/tasks/openstack_host_packages.yml:20: update_cache: yes
playbooks/roles/lxc_hosts/tasks/lxc_install.yml:20: update_cache: yes
playbooks/roles/os_horizon/tasks/horizon_install.yml:20: update_cache: yes
playbooks/roles/rabbitmq_server/tasks/rabbitmq_install.yml:20: update_cache: yes
playbooks/roles/os_glance/tasks/glance_install.yml:20: update_cache: yes
playbooks/roles/galera_client/tasks/galera_client_install.yml:20: update_cache: yes
playbooks/roles/rsyslog_client/tasks/rsyslog_client_install.yml:32: update_cache: yes
playbooks/roles/os_cinder/tasks/cinder_install.yml:20: update_cache: yes
playbooks/roles/os_swift/tasks/swift_install.yml:20: update_cache: yes
playbooks/roles/os_nova/tasks/nova_install.yml:20: update_cache: yes
playbooks/roles/os_nova/tasks/nova_spice_console_install.yml:20: update_cache: yes
playbooks/roles/os_nova/tasks/nova_compute_kvm_install.yml:20: update_cache: yes
playbooks/roles/os_keystone/tasks/keystone_install.yml:20: update_cache: yes
playbooks/roles/os_neutron/tasks/neutron_install.yml:20: update_cache: yes
playbooks/roles/rsyslog_server/tasks/rsyslog_server_install.yml:32: update_cache: yes
playbooks/roles/haproxy_server/tasks/haproxy_install.yml:20: update_cache: yes
playbooks/roles/repo_server/tasks/repo_install.yml:20: update_cache: yes
playbooks/roles/os_heat/tasks/heat_install.yml:20: update_cache: yes
playbooks/roles/memcached_server/tasks/memcached_install.yml:20: update_cache: yes
playbooks/roles/galera_server/tasks/galera_pre_install.yml:20: update_cache: yes
playbooks/roles/galera_server/tasks/galera_install.yml:30: update_cache: yes

OpenStack Infra (hudson-openstack) wrote : Related fix proposed to os-ansible-deployment (master)

Related fix proposed to branch: master
Review: https://review.openstack.org/191528

OpenStack Infra (hudson-openstack) wrote : Fix merged to os-ansible-deployment (master)

Reviewed: https://review.openstack.org/191215
Committed: https://git.openstack.org/cgit/stackforge/os-ansible-deployment/commit/?id=86b6e8e6409f76c46551becdac783a76bb853980
Submitter: Jenkins
Branch: master

commit 86b6e8e6409f76c46551becdac783a76bb853980
Author: kevin <email address hidden>
Date: Fri Jun 12 15:59:05 2015 -0500

    Updates the container build process

    The container build process needs to make sure that the service
    sources are correctly setup and updated prior to running any other
    playbooks.

    The modification here is necessary to break out the process for the
    proxy create, apt sources deployment, the update of those sources
    and keys, container upgrades and the installation of python2.7 for
    use with Ansible. This also allows for better debugging of a failure
    in container create as we'll now be able to tell where in the process
    a failure happens and be able to use tags to resolve it.

    Change-Id: I36be437303a73bbc98a1cd5297f6c65591653cd7
    Closes-Bug: 1464771

Changed in openstack-ansible:
status: In Progress → Fix Committed

OpenStack Infra (hudson-openstack) wrote : Fix merged to os-ansible-deployment (kilo)

Reviewed: https://review.openstack.org/191517
Committed: https://git.openstack.org/cgit/stackforge/os-ansible-deployment/commit/?id=32344a6650ba0331b44041d91f2c594bed1499c3
Submitter: Jenkins
Branch: kilo

commit 32344a6650ba0331b44041d91f2c594bed1499c3
Author: kevin <email address hidden>
Date: Fri Jun 12 15:59:05 2015 -0500

    Updates the container build process

    The container build process needs to make sure that the service
    sources are correctly setup and updated prior to running any other
    playbooks.

    The modification here is necessary to break out the process for the
    proxy create, apt sources deployment, the update of those sources
    and keys, container upgrades and the installation of python2.7 for
    use with Ansible. This also allows for better debugging of a failure
    in container create as we'll now be able to tell where in the process
    a failure happens and be able to use tags to resolve it.

    Change-Id: I36be437303a73bbc98a1cd5297f6c65591653cd7
    Closes-Bug: 1464771
    (cherry picked from commit 86b6e8e6409f76c46551becdac783a76bb853980)

OpenStack Infra (hudson-openstack) wrote : Related fix merged to os-ansible-deployment (master)

Reviewed: https://review.openstack.org/191528
Committed: https://git.openstack.org/cgit/stackforge/os-ansible-deployment/commit/?id=59381b51ffb83bd67e13d5f04ff731f1b70a67bf
Submitter: Jenkins
Branch: master

commit 59381b51ffb83bd67e13d5f04ff731f1b70a67bf
Author: kevin <email address hidden>
Date: Sat Jun 13 16:34:05 2015 -0500

    Added apt update tasks to everything using apt

    This change adds a specific update task to all tasks that call the
    apt Ansible module. This change was done to ensure that the cache
    is updated as expected when instructed to do so. The reason that
    the cache update is being removed from the grouping is because
    there is an upstream bug that is affecting the process by which
    the apt cache is updated when there is a package list to process
    within the same task. The workaround to make this function as
    expected is to move the update into its own task without a package
    list.

    Upstream Ansible bug:
      - https://github.com/ansible/ansible-modules-core/issues/1497

    Change-Id: Ic06d89a76d772c12888b4bc4bbf147be58b0c150
    Related-Bug: 1464771

OpenStack Infra (hudson-openstack) wrote : Related fix proposed to os-ansible-deployment (kilo)

Related fix proposed to branch: kilo
Review: https://review.openstack.org/192484

OpenStack Infra (hudson-openstack) wrote : Related fix merged to os-ansible-deployment (kilo)

Reviewed: https://review.openstack.org/192484
Committed: https://git.openstack.org/cgit/stackforge/os-ansible-deployment/commit/?id=ea2681280ae4885e5cb333d573ed43ba92fdd08f
Submitter: Jenkins
Branch: kilo

commit ea2681280ae4885e5cb333d573ed43ba92fdd08f
Author: kevin <email address hidden>
Date: Sat Jun 13 16:34:05 2015 -0500

    Added apt update tasks to everything using apt

    This change adds a specific update task to all tasks that call the
    apt Ansible module. This change was done to ensure that the cache
    is updated as expected when instructed to do so. The reason that
    the cache update is being removed from the grouping is because
    there is an upstream bug that is affecting the process by which
    the apt cache is updated when there is a package list to process
    within the same task. The workaround to make this function as
    expected is to move the update into its own task without a package
    list.

    Upstream Ansible bug:
      - https://github.com/ansible/ansible-modules-core/issues/1497

    Change-Id: Ic06d89a76d772c12888b4bc4bbf147be58b0c150
    Related-Bug: 1464771
    (cherry picked from commit 59381b51ffb83bd67e13d5f04ff731f1b70a67bf)

tags: added: in-kilo

Davanum Srinivas (DIMS) (dims-v) wrote : Fix included in openstack/openstack-ansible 11.2.11

This issue was fixed in the openstack/openstack-ansible 11.2.11 release.

Doug Hellmann (doug-hellmann) wrote : Fix included in openstack/openstack-ansible 11.2.12

This issue was fixed in the openstack/openstack-ansible 11.2.12 release.

Davanum Srinivas (DIMS) (dims-v) wrote : Fix included in openstack/openstack-ansible 11.2.14

This issue was fixed in the openstack/openstack-ansible 11.2.14 release.
