kargo-deploy.sh pgrep sleep

Bug #1655414 reported by Robert Duncan
This bug affects 1 person
Affects: fuel-ccp
Status: Fix Released
Importance: Undecided
Assigned to: Unassigned
Milestone: (none)

Bug Description

I am trying to follow the quick start guide, but kargo-deploy.sh checks for an apt lock on line 213:

 while admin_node_command pgrep -a -f apt; do echo 'Waiting for apt lock...'; sleep 30; done

This check seems to always report a lock. When I take the line out, the deployment works.

Changed in fuel-ccp:
status: New → Confirmed
Matthew Mosesohn (raytrac3r) wrote :

Robert, this means some apt process is running on your host. kargo-deploy.sh waits for apt processes to finish because the apt lock must be released before the required packages can be installed. Can you ssh into your host and see which apt process is running using `pgrep -a -f apt`?

Robert Duncan (rduncan-t) wrote :

Hi Matthew, yes, I tried that before editing the script. When I run "pgrep -a -f apt" directly on the remote host, it returns no running process. However, when I run it from the admin node over ssh, as the script does, it returns a process ID that belongs to the ssh command carrying the pgrep call itself.

In this case I am using the admin node as a kube node as well.

ubuntu@node1:/$ ssh -A -o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null vagrant@192.168.111.250 pgrep -a -f apt
Warning: Permanently added '192.168.111.250' (ECDSA) to the list of known hosts.
Password:
2658 ssh -A -o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null vagrant@192.168.111.250 pgrep -a -f apt
ubuntu@node1:/$
ubuntu@node1:/$
ubuntu@node1:/$ ps -p 2658 -o comm=
ubuntu@node1:/$
ubuntu@node1:/$

So the admin node performs an ssh into itself and finds its own pgrep invocation, waits 30 seconds, runs pgrep again, gets a process ID again, and so on.

It doesn't happen on the nodes which are purely kube nodes:

ubuntu@node1:/$ ssh -A -o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null vagrant@192.168.111.254 pgrep -a -f apt
Warning: Permanently added '192.168.111.254' (ECDSA) to the list of known hosts.
vagrant@192.168.111.254's password:
ubuntu@node1:/$
ubuntu@node1:/$
ubuntu@node1:/$
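
The self-match shown in the transcripts above can be worked around by ignoring matches whose command is the ssh or pgrep invocation itself. The following is only a minimal sketch under that assumption: `apt_busy` is a hypothetical helper name, it is not taken from kargo-deploy.sh, and it is not necessarily what the merged fix does. It expects text in the `pgrep -a` output format (PID followed by the full command line).

```shell
#!/bin/sh
# Hypothetical helper, not part of kargo-deploy.sh.
# $1: output of `pgrep -a -f apt` ("PID full-command-line", one per line).
# Succeeds (exit 0) only if a matching process exists whose command name
# is something other than ssh or pgrep.
apt_busy() {
    printf '%s\n' "$1" | awk '
        # $2 is the command name as invoked; skip ssh/pgrep wrappers.
        $2 !~ /(^|\/)(ssh|pgrep)$/ && /apt/ { found = 1 }
        END { exit !found }
    '
}

# Possible use in the wait loop (admin_node_command is the script's own
# wrapper and is assumed to be defined elsewhere):
#
# while apt_busy "$(admin_node_command pgrep -a -f apt)"; do
#     echo 'Waiting for apt lock...'; sleep 30
# done
```

The filter is deliberately narrow and not exhaustive: any other wrapper process that carries the string "apt" in its arguments would still count as a match.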

As the example in the quick start guide is the same as my scenario, maybe the pgrep check shouldn't run on a dual-purpose node:

Deploy k8s cluster

Clone fuel-ccp-installer repository:

git clone https://review.openstack.org/openstack/fuel-ccp-installer

Create deployment script:

cat > ./deploy-k8s.sh << 'EOF'
#!/bin/bash
set -ex

# CHANGE ADMIN_IP AND SLAVE_IPS TO MATCH YOUR ENVIRONMENT
export ADMIN_IP="10.90.0.2"
export SLAVE_IPS="10.90.0.2 10.90.0.3 10.90.0.4"
export DEPLOY_METHOD="kargo"
export WORKSPACE="${HOME}/workspace"

mkdir -p $WORKSPACE
cd ./fuel-ccp-installer
bash -x "./utils/jenkins/run_k8s_deploy_test.sh"
EOF

Matthew Mosesohn (raytrac3r) wrote :

Are you saying that you're running this script from 10.90.0.2 and you set ADMIN_IP=10.90.0.2? Try changing ADMIN_IP="local".

Robert Duncan (rduncan-t) wrote :

I have deployed it now, but yes I ran it as shown in the example.

Read together with the example, this part of the docs did not make any sense to me:

ADMIN_IP - IP of the node which will run ansible. When the $ADMIN_IP refers to a remote node, like a VM, it should take an IP address. Otherwise, it should take the local value.

The local value of what? It should read: "Otherwise, it should take the value 'local'".
Also, localhost has always been perfectly serviceable, and it resolves to the IP address 127.0.0.1. The choice of "local" seems very arbitrary; it might as well be ADMIN_IP="jeff", so sad :-)

Matthew Mosesohn (raytrac3r) wrote :

Sorry that the docs aren't clear. I proposed a fix, https://review.openstack.org/#/c/418905/, that should cover the case where you specify your own IP in ADMIN_IP.

Matthew Mosesohn (raytrac3r) wrote :

"should take the local value" should be 'should be set to the value "local"'
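
To illustrate why the value "local" matters, here is a rough sketch of how a wrapper such as admin_node_command might dispatch on ADMIN_IP. This is an assumption for illustration only; the actual function in kargo-deploy.sh may differ, and `ADMIN_USER` and `SSH_OPTIONS` are assumed variable names (the `vagrant` default is taken from the transcripts in this report).

```shell
#!/bin/sh
# Hypothetical sketch; the real admin_node_command may be implemented
# differently. ADMIN_USER and SSH_OPTIONS are assumed names.
admin_node_command() {
    if [ "${ADMIN_IP:?ADMIN_IP must be set}" = "local" ]; then
        # Run directly on this machine: no ssh process exists, so a
        # `pgrep -f apt` check cannot match an ssh command line.
        "$@"
    else
        # Run on the remote admin node. SSH_OPTIONS is left unquoted on
        # purpose so that several options split into separate words.
        ssh ${SSH_OPTIONS:-} "${ADMIN_USER:-vagrant}@${ADMIN_IP}" "$@"
    fi
}
```

With `ADMIN_IP="local"`, a call such as `admin_node_command pgrep -a -f apt` runs pgrep directly, and pgrep does not report itself as a match.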

OpenStack Infra (hudson-openstack) wrote : Fix merged to fuel-ccp-installer (master)

Reviewed: https://review.openstack.org/418905
Committed: https://git.openstack.org/cgit/openstack/fuel-ccp-installer/commit/?id=f28413491de542a9a4fe7f01e0a87ddf30871d21
Submitter: Jenkins
Branch: master

commit f28413491de542a9a4fe7f01e0a87ddf30871d21
Author: Matthew Mosesohn <email address hidden>
Date: Wed Jan 11 15:36:59 2017 +0300

    Avoid catching pgrep apt from ssh source to self

    Change-Id: I94957d362d520f9bc9bcd90c22d05c88a6f05435
    Closes-Bug: #1655414

Changed in fuel-ccp:
status: Confirmed → Fix Released