Initial bootstrap failed at applying puppet manifest

Bug #1849671 reported by Yang Liu
Affects: StarlingX
Status: Fix Released
Importance: High
Assigned to: Al Bailey
Milestone: none listed

Bug Description

Brief Description
-----------------
Initial bootstrap failed at puppet-manifest-apply.sh, with the following error:
"changed": true, "cmd": ["/usr/local/bin/puppet-manifest-apply.sh", "/tmp/hieradata", "face::3", "controller", "ansible_bootstrap", ">", "/tmp/apply_manifest.log"], "delta": "0:03:24.648305", "end": "2019-10-24 06:37:15.595899", "msg": "non-zero return code", "rc": 1, "start": "2019-10-24 06:33:50.947594", "stderr": "cp: cannot stat ‘/tmp/hieradata/face::3.yaml’: No such file or directory\ncp: cannot stat ‘/tmp/hieradata/system.yaml’

Severity
--------
Critical

Steps to Reproduce
------------------
Install a standard system and bootstrap from controller-0

Expected Behavior
------------------
The ansible-playbook command runs successfully.

Actual Behavior
----------------
Initial bootstrap failed at applying the puppet manifest, with the following error:
"changed": true, "cmd": ["/usr/local/bin/puppet-manifest-apply.sh", "/tmp/hieradata", "face::3", "controller", "ansible_bootstrap", ">", "/tmp/apply_manifest.log"], "delta": "0:03:24.648305", "end": "2019-10-24 06:37:15.595899", "msg": "non-zero return code", "rc": 1, "start": "2019-10-24 06:33:50.947594", "stderr": "cp: cannot stat ‘/tmp/hieradata/face::3.yaml’: No such file or directory\ncp: cannot stat ‘/tmp/hieradata/system.yaml’

Reproducibility
---------------
Not sure. Seen on a regular system; have not tried to reproduce it yet.
A simplex system with the same load did not have this issue.

System Configuration
--------------------
Multi-node system
Lab-name:
WCP71-75

Branch/Pull Time/Commit
-----------------------
2019-10-23_20-00-00

Last Pass
---------
2019-10-22_20-00-00 on same system

Timestamp/Logs
--------------

E TASK [bootstrap/apply-bootstrap-manifest : Applying puppet bootstrap manifest] ***
E fatal: [localhost]: FAILED! => {"changed": true, "cmd": ["/usr/local/bin/puppet-manifest-apply.sh", "/tmp/hieradata", "face::3", "controller", "ansible_bootstrap", ">", "/tmp/apply_manifest.log"], "delta": "0:03:24.648305", "end": "2019-10-24 06:37:15.595899", "msg": "non-zero return code", "rc": 1, "start": "2019-10-24 06:33:50.947594", "stderr": "cp: cannot stat ‘/tmp/hieradata/face::3.yaml’: No such file or directory\ncp: cannot stat ‘/tmp/hieradata/system.yaml’: No such file or directory\ncp: cannot stat ‘/tmp/hieradata/secure_system.yaml’: No such file or directory\ncp: cannot stat ‘>’: No such file or directory", "stderr_lines": ["cp: cannot stat ‘/tmp/hieradata/face::3.yaml’: No such file or directory", "cp: cannot stat ‘/tmp/hieradata/system.yaml’: No such file or directory", "cp: cannot stat ‘/tmp/hieradata/secure_system.yaml’: No such file or directory", "cp: cannot stat ‘>’: No such file or directory"], "stdout": "Applying puppet ansible_bootstrap manifest...\n[WARNING]\nWarnings found. See /var/log/puppet/2019-10-24-06-33-50_controller/puppet.log for details", "stdout_lines": ["Applying puppet ansible_bootstrap manifest...", "[WARNING]", "Warnings found. See /var/log/puppet/2019-10-24-06-33-50_controller/puppet.log for details"]}
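
Side note on the `cp: cannot stat ‘>’` line: the `>` appears as its own element in the "cmd" list, which suggests it reached the script as an ordinary argument rather than acting as a shell redirection. That is the general behavior when a command is started from an argument list without a shell. A minimal Python sketch of that behavior follows; the helper name `run_without_shell` is illustrative only, and this does not reproduce the Ansible task itself.

```python
import subprocess
import sys

# Child program that simply reports the arguments it received.
PRINT_ARGS = "import sys; print(sys.argv[1:])"


def run_without_shell(args):
    """Start a child from an argv list (no shell) and return what it saw."""
    result = subprocess.run(
        [sys.executable, "-c", PRINT_ARGS, *args],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()


# Same trailing arguments as in the failing task's "cmd" list.
seen = run_without_shell(["ansible_bootstrap", ">", "/tmp/apply_manifest.log"])
print(seen)  # the ">" shows up as a plain argument, nothing is redirected
```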

Test Activity
-------------
Sanity

Al Bailey (albailey1974) wrote :

The puppet warning is:
2019-10-24T06:36:29.149 Warning: 2019-10-24 06:36:29 +0000 /Stage[main]/Barbican::Deps/Anchor[barbican::service::end]: Skipping because of failed dependencies

Yang Liu (yliu12) wrote :

This issue was also seen on another regular system; so far it has happened on 2 of 2 regular systems.
It has not been seen on simplex or AIO-DX systems.

Yang Liu (yliu12) wrote :
Al Bailey (albailey1974) wrote :

/var/log/daemon.log:2019-10-24T06:36:10.783 localhost systemd[1]: notice Unit openstack-barbican-api.service entered failed state.
/var/log/daemon.log:2019-10-24T06:36:10.783 localhost systemd[1]: warning openstack-barbican-api.service failed.

controller-0:/var/log/barbican# systemctl status openstack-barbican-api.service
● openstack-barbican-api.service - Openstack Barbican API server
   Loaded: loaded (/usr/lib/systemd/system/openstack-barbican-api.service; disabled; vendor preset: disabled)
   Active: failed (Result: start-limit) since Thu 2019-10-24 06:36:10 UTC; 7h ago
 Main PID: 92132 (code=exited, status=217/USER)

controller-0:/var/log/barbican# systemctl status openstack-barbican-api.service
● openstack-barbican-api.service - Openstack Barbican API server
   Loaded: loaded (/usr/lib/systemd/system/openstack-barbican-api.service; disabled; vendor preset: disabled)
   Active: failed (Result: start-limit) since Thu 2019-10-24 13:58:20 UTC; 2s ago
  Process: 128245 ExecStop=/usr/bin/kill -s TERM $MAINPID (code=exited, status=217/USER)
  Process: 128243 ExecStart=/usr/bin/gunicorn --pid /run/barbican/pid -c /etc/barbican/gunicorn-config.py --paste /etc/barbican/barbican-api-paste.ini (code=exited, status=217/USER)
 Main PID: 128243 (code=exited, status=217/USER)

controller-0:/var/log/barbican# /usr/bin/gunicorn --pid /run/barbican/pid -c /etc/barbican/gunicorn-config.py --paste /etc/barbican/barbican-api-paste.ini
Invalid value for group: barbican

Error: No such group: 'barbican'

Al Bailey (albailey1974) wrote :

The service file is expecting a "barbican" user and group.

 systemctl cat openstack-barbican-api.service
# /usr/lib/systemd/system/openstack-barbican-api.service
[Unit]
Description=Openstack Barbican API server
After=syslog.target network.target
Before=httpd.service

[Service]
PIDFile=/run/barbican/pid
User=barbican
Group=barbican
RuntimeDirectory=barbican
RuntimeDirectoryMode=770
ExecStart=/usr/bin/gunicorn --pid /run/barbican/pid -c /etc/barbican/gunicorn-config.py --paste /etc/barbican/barbican-api-paste.ini
ExecReload=/usr/bin/kill -s HUP $MAINPID
ExecStop=/usr/bin/kill -s TERM $MAINPID
StandardError=syslog
Restart=on-failure

[Install]
WantedBy=multi-user.target
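
To pull the expected account out of a unit file programmatically, something like the following could be used. It is only a sketch: it assumes the unit text is simple enough for Python's `configparser` to read (real unit files can use syntax it does not handle), and the function name `service_identity` is made up for this example. The unit text is an abbreviated copy of the output above.

```python
import configparser

# Abbreviated copy of the unit file shown above.
UNIT_TEXT = """\
[Unit]
Description=Openstack Barbican API server

[Service]
PIDFile=/run/barbican/pid
User=barbican
Group=barbican
ExecStop=/usr/bin/kill -s TERM $MAINPID

[Install]
WantedBy=multi-user.target
"""


def service_identity(unit_text):
    """Return the (User, Group) pair from the [Service] section, or None values."""
    parser = configparser.ConfigParser(
        interpolation=None,   # do not treat % as interpolation syntax
        strict=False,         # tolerate repeated keys
        delimiters=("=",),    # unit files use key=value only
    )
    parser.optionxform = str  # keep the original key case (User, Group)
    parser.read_string(unit_text)
    service = parser["Service"]
    return service.get("User"), service.get("Group")


print(service_identity(UNIT_TEXT))
```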

Al Bailey (albailey1974) wrote :

Neither the barbican user nor the barbican group is set up.

Puppet shows:

2019-10-24T06:34:29.885 Error: 2019-10-24 06:34:29 +0000 Could not set 'directory' on ensure: Could not find user barbican at /usr/share/puppet/modules/openstack/manifests/barbican.pp:29
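
A quick way to confirm this on a node is to look the names up in the account databases. The sketch below uses Python's standard `pwd` and `grp` modules (Unix only); the helper name `account_status` is just for illustration.

```python
import grp
import pwd


def account_status(name):
    """Return (user_exists, group_exists) for the given account name."""
    try:
        pwd.getpwnam(name)
        user_exists = True
    except KeyError:
        user_exists = False
    try:
        grp.getgrnam(name)
        group_exists = True
    except KeyError:
        group_exists = False
    return user_exists, group_exists


# On the affected controller this would be expected to report both as missing.
print(account_status("barbican"))
```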

Don Penney (dpenney) wrote :

The user ID and group are set up by a pre-install scriptlet in the openstack-barbican-common RPM. However, this package does not have a Requires against shadow-utils, which provides the useradd and groupadd utilities; instead, the openstack-barbican RPM has this Requires. In this installation, openstack-barbican-common was installed before shadow-utils, so the useradd and groupadd commands failed.
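
The ordering problem can be modeled as a dependency graph. The sketch below is a simplified model and not how rpm actually orders a transaction: only the Requires relationships mentioned in this comment are included, and the `with_requires` graph is a hypothetical variant shown purely for comparison. It needs Python 3.9+ for `graphlib`.

```python
from graphlib import TopologicalSorter


def is_ordered_before(requires, first, second):
    """True if `second` (transitively) requires `first`.

    Only then does every valid install order place `first` earlier.
    """
    stack, seen = list(requires.get(second, ())), set()
    while stack:
        pkg = stack.pop()
        if pkg == first:
            return True
        if pkg not in seen:
            seen.add(pkg)
            stack.extend(requires.get(pkg, ()))
    return False


# As described: only openstack-barbican requires shadow-utils.
declared = {
    "openstack-barbican": {"shadow-utils"},
    "openstack-barbican-common": set(),
}

# Hypothetical variant in which the -common package also declares the Requires.
with_requires = {
    "openstack-barbican": {"shadow-utils"},
    "openstack-barbican-common": {"shadow-utils"},
}

# Nothing forces shadow-utils ahead of -common in the declared graph.
print(is_ordered_before(declared, "shadow-utils", "openstack-barbican-common"))
# With the extra edge, any topological order installs shadow-utils first.
print(is_ordered_before(with_requires, "shadow-utils", "openstack-barbican-common"))
print(list(TopologicalSorter(with_requires).static_order()))
```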

Ghada Khalil (gkhalil)
Changed in starlingx:
assignee: nobody → Al Bailey (albailey1974)
importance: Undecided → High
status: New → Triaged
Al Bailey (albailey1974) wrote :

For other openstack components, this was resolved by the "setup" package.
https://opendev.org/starlingx/integ/src/branch/master/base/setup/centos/patches/0001-Change-group-passwd-and-uidgid.patch

barbican is missing from that file.
It will be added.

OpenStack Infra (hudson-openstack) wrote : Fix proposed to integ (master)

Fix proposed to branch: master
Review: https://review.opendev.org/691046

Changed in starlingx:
status: Triaged → In Progress
Ghada Khalil (gkhalil)
tags: added: stx.3.0
tags: added: stx.config stx.sanity
OpenStack Infra (hudson-openstack) wrote : Fix merged to integ (master)

Reviewed: https://review.opendev.org/691046
Committed: https://git.openstack.org/cgit/starlingx/integ/commit/?id=434159142363f319735caa87b0cc7fd6a1f71be1
Submitter: Zuul
Branch: master

commit 434159142363f319735caa87b0cc7fd6a1f71be1
Author: Al Bailey <email address hidden>
Date: Thu Oct 24 11:57:02 2019 -0500

    Ensure barbican user and group exist during installation

    The barbican user and group were missing from the setup files.

    Adding it ensures consistent uid/gid values across nodes, where
    filesystems may be shared.

    Adding it also ensures uid/gid exists when barbican is installed.
    This will fix sanity issues due to arbitrary rpm ordering during
    initial system installation.

    openstack-barbican-common has a scriptlet that sets up
    barbican user and group if they do not exist, through
    shadow-utils.

    The shadow-utils requirement is set for openstack-barbican
    rather than openstack-barbican-common or python-barbican.

    Alternatively the src rpm could be patched, but this would add
    source code patching debt, and still not resolve the filesystem
    consistency issue.

    Change-Id: I67b7c292e4a3356335df6619648284e028625fe6
    Closes-Bug: 1849671
    Signed-off-by: Al Bailey <email address hidden>
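
The uid/gid consistency point in the commit message can be checked with a small script. This is a sketch only: the node names and the numeric IDs in the sample data are made up for illustration and are not the values used by the actual patch, and the helper names are hypothetical.

```python
def parse_passwd(text):
    """Map user name -> (uid, gid) from passwd(5)-format text."""
    entries = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        fields = line.split(":")
        if len(fields) < 4:
            continue  # skip malformed lines
        entries[fields[0]] = (int(fields[2]), int(fields[3]))
    return entries


def ids_consistent(name, passwd_by_node):
    """True if `name` exists on every node with the same (uid, gid)."""
    ids = set()
    for text in passwd_by_node.values():
        entry = parse_passwd(text).get(name)
        if entry is None:
            return False
        ids.add(entry)
    return len(ids) == 1


# Made-up sample data: IDs assigned ad hoc by a scriptlet may differ per node.
ad_hoc = {
    "controller-0": "barbican:x:990:990::/var/lib/barbican:/sbin/nologin",
    "controller-1": "barbican:x:987:985::/var/lib/barbican:/sbin/nologin",
}
# Made-up sample data: IDs taken from one shared, predefined passwd file.
predefined = {
    "controller-0": "barbican:x:990:990::/var/lib/barbican:/sbin/nologin",
    "controller-1": "barbican:x:990:990::/var/lib/barbican:/sbin/nologin",
}

print(ids_consistent("barbican", ad_hoc))
print(ids_consistent("barbican", predefined))
```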

Changed in starlingx:
status: In Progress → Fix Released
Yang Liu (yliu12) wrote :

Verified: passed in a recent sanity run.
