Zun: Error on running privsep helper command

Bug #1787760 reported by hongbin
This bug affects 6 people
Affects        Status        Importance  Assigned to  Milestone
kolla-ansible  Fix Released  Medium      hongbin
Rocky          Fix Released  Medium      hongbin
Stein          Fix Released  Medium      hongbin

Bug Description

I deployed Zun using kolla-ansible master. The zun-compute container fails when running the privsep helper command:

$ sudo docker exec zun_compute tail -n 30 /var/log/kolla/zun/zun-compute.log
2018-08-18 19:56:39.680 8 ERROR oslo_service.periodic_task channel = daemon.RootwrapClientChannel(context=self)
2018-08-18 19:56:39.680 8 ERROR oslo_service.periodic_task File "/var/lib/kolla/venv/local/lib/python2.7/site-packages/oslo_privsep/daemon.py", line 327, in __init__
2018-08-18 19:56:39.680 8 ERROR oslo_service.periodic_task raise FailedToDropPrivileges(msg)
2018-08-18 19:56:39.680 8 ERROR oslo_service.periodic_task FailedToDropPrivileges: privsep helper command exited non-zero (96)
2018-08-18 19:56:39.680 8 ERROR oslo_service.periodic_task
2018-08-18 19:57:39.480 8 INFO oslo.privsep.daemon [-] Running privsep helper: ['sudo', 'zun-rootwrap', '/etc/zun/rootwrap.conf', 'privsep-helper', '--config-file', '/etc/zun/zun.conf', '--privsep_context', 'zun.common.privileged.default', '--privsep_sock_path', '/tmp/tmprt1XgU/privsep.sock']
2018-08-18 19:57:39.683 8 WARNING oslo.privsep.daemon [-] privsep log: /var/lib/kolla/venv/bin/zun-rootwrap: Executable not found: privsep-helper (filter match = privsep-helper)
2018-08-18 19:57:39.696 8 CRITICAL oslo.privsep.daemon [-] privsep helper command exited non-zero (96)
2018-08-18 19:57:39.698 8 ERROR oslo_service.periodic_task [-] Error during Manager.inventory_host: FailedToDropPrivileges: privsep helper command exited non-zero (96)
2018-08-18 19:57:39.698 8 ERROR oslo_service.periodic_task Traceback (most recent call last):
2018-08-18 19:57:39.698 8 ERROR oslo_service.periodic_task File "/var/lib/kolla/venv/local/lib/python2.7/site-packages/oslo_service/periodic_task.py", line 220, in run_periodic_tasks
2018-08-18 19:57:39.698 8 ERROR oslo_service.periodic_task task(self, context)
2018-08-18 19:57:39.698 8 ERROR oslo_service.periodic_task File "/var/lib/kolla/venv/local/lib/python2.7/site-packages/zun/compute/manager.py", line 1059, in inventory_host
2018-08-18 19:57:39.698 8 ERROR oslo_service.periodic_task rt.update_available_resources(context)
2018-08-18 19:57:39.698 8 ERROR oslo_service.periodic_task File "/var/lib/kolla/venv/local/lib/python2.7/site-packages/zun/compute/compute_node_tracker.py", line 65, in update_available_resources
2018-08-18 19:57:39.698 8 ERROR oslo_service.periodic_task self.container_driver.get_available_resources(node)
2018-08-18 19:57:39.698 8 ERROR oslo_service.periodic_task File "/var/lib/kolla/venv/local/lib/python2.7/site-packages/zun/container/driver.py", line 251, in get_available_resources
2018-08-18 19:57:39.698 8 ERROR oslo_service.periodic_task disk_total = self.get_total_disk_for_container()
2018-08-18 19:57:39.698 8 ERROR oslo_service.periodic_task File "/var/lib/kolla/venv/local/lib/python2.7/site-packages/zun/container/docker/driver.py", line 1095, in get_total_disk_for_container
2018-08-18 19:57:39.698 8 ERROR oslo_service.periodic_task run_as_root=True)
2018-08-18 19:57:39.698 8 ERROR oslo_service.periodic_task File "/var/lib/kolla/venv/local/lib/python2.7/site-packages/zun/common/utils.py", line 352, in execute
2018-08-18 19:57:39.698 8 ERROR oslo_service.periodic_task return execute_root(*cmd, **kwargs)
2018-08-18 19:57:39.698 8 ERROR oslo_service.periodic_task File "/var/lib/kolla/venv/local/lib/python2.7/site-packages/oslo_privsep/priv_context.py", line 206, in _wrap
2018-08-18 19:57:39.698 8 ERROR oslo_service.periodic_task self.start()
2018-08-18 19:57:39.698 8 ERROR oslo_service.periodic_task File "/var/lib/kolla/venv/local/lib/python2.7/site-packages/oslo_privsep/priv_context.py", line 217, in start
2018-08-18 19:57:39.698 8 ERROR oslo_service.periodic_task channel = daemon.RootwrapClientChannel(context=self)
2018-08-18 19:57:39.698 8 ERROR oslo_service.periodic_task File "/var/lib/kolla/venv/local/lib/python2.7/site-packages/oslo_privsep/daemon.py", line 327, in __init__
2018-08-18 19:57:39.698 8 ERROR oslo_service.periodic_task raise FailedToDropPrivileges(msg)
2018-08-18 19:57:39.698 8 ERROR oslo_service.periodic_task FailedToDropPrivileges: privsep helper command exited non-zero (96)
2018-08-18 19:57:39.698 8 ERROR oslo_service.periodic_task

hongbin (hongbin034) wrote :

Note: Starting from Rocky, Zun has switched to privsep to execute privileged commands: https://bugs.launchpad.net/zun/+bug/1749342

Changed in kolla-ansible:
assignee: nobody → hongbin (hongbin034)
Andreas Merk (amerk) wrote :

Compared with the rootwrap.conf from nova, the venv's bin directory is missing from exec_dirs here.

zun-compute:/etc/zun/rootwrap.conf
exec_dirs=/sbin,/usr/sbin,/bin,/usr/bin,/usr/local/bin,/usr/local/sbin
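
To check this from inside the zun_compute container, a rough sketch like the one below can help. It only approximates what rootwrap does (look for the command in each exec_dirs entry); the function name find_in_exec_dirs is made up for illustration and is not part of oslo.rootwrap. It assumes exec_dirs is a plain comma-separated list with no spaces around the entries.

```shell
#!/bin/sh
# Hypothetical helper (not part of oslo.rootwrap): print the first path in
# the exec_dirs of a rootwrap.conf where the given command is an executable
# file. Returns non-zero if the command is not found in any listed directory.
find_in_exec_dirs() {
    conf=$1
    cmd=$2
    # Take the value of the first exec_dirs= line in the config file.
    dirs=$(sed -n 's/^exec_dirs[[:space:]]*=[[:space:]]*//p' "$conf" | head -n 1)
    old_ifs=$IFS
    IFS=,
    for dir in $dirs; do
        if [ -f "$dir/$cmd" ] && [ -x "$dir/$cmd" ]; then
            IFS=$old_ifs
            printf '%s\n' "$dir/$cmd"
            return 0
        fi
    done
    IFS=$old_ifs
    return 1
}

# Example (run inside the container):
#   find_in_exec_dirs /etc/zun/rootwrap.conf privsep-helper \
#       || echo "privsep-helper is not in any exec_dirs entry"
```

With the exec_dirs line quoted above, the lookup would be expected to fail, since privsep-helper lives under /var/lib/kolla/venv/bin according to the log.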

Andreas Merk (amerk) wrote :

This should fix it. In kolla/docker/zun/zun-base/Dockerfile.j2, add the line:
sed -i 's|^exec_dirs.*|exec_dirs=/var/lib/kolla/venv/bin,/sbin,/usr/sbin,/bin,/usr/bin,/usr/local/bin,/usr/local/sbin|g' /etc/zun/rootwrap.conf
after the following line (and don't forget the trailing "\"):
&& chown -R zun: /etc/zun /var/www/cgi-bin/zun \
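
Before rebuilding the image, the substitution can be tried on a throwaway file. The sketch below reuses the sed expression from this comment unchanged; the wrapper function patch_exec_dirs and the scratch file are illustrative only, and `sed -i` without a suffix argument assumes GNU sed.

```shell
#!/bin/sh
# Hypothetical wrapper around the sed expression proposed above: rewrite the
# exec_dirs line of the given file so the Kolla venv bin directory comes first.
patch_exec_dirs() {
    sed -i 's|^exec_dirs.*|exec_dirs=/var/lib/kolla/venv/bin,/sbin,/usr/sbin,/bin,/usr/bin,/usr/local/bin,/usr/local/sbin|g' "$1"
}

# Build a scratch file that mirrors the exec_dirs line reported earlier.
scratch=$(mktemp)
printf '%s\n' '[DEFAULT]' \
    'exec_dirs=/sbin,/usr/sbin,/bin,/usr/bin,/usr/local/bin,/usr/local/sbin' > "$scratch"

# Apply the substitution and show the resulting line.
patch_exec_dirs "$scratch"
grep '^exec_dirs' "$scratch"
```

Because the expression replaces the whole exec_dirs line, running it a second time leaves the file unchanged.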

hongbin (hongbin034) wrote :
Changed in kolla-ansible:
status: New → Fix Committed
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/kolla 7.0.0.0rc3

This issue was fixed in the openstack/kolla 7.0.0.0rc3 release candidate.

OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/kolla 8.0.0.0b1

This issue was fixed in the openstack/kolla 8.0.0.0b1 development milestone.

Mark Goddard (mgoddard)
Changed in kolla-ansible:
status: Fix Committed → Fix Released
alpha23 (alpha23) wrote :

Will this issue be fixed in Rocky? As of 6/15/20, this was still failing in Rocky per https://bugs.launchpad.net/kolla-ansible/+bug/1883604

Radosław Piliszek (yoctozepto) wrote :

It's in https://review.opendev.org/737198

I guess the CI is unhappy.

alpha23 (alpha23) wrote :

When is the CI expected to be happy again?

Radosław Piliszek (yoctozepto) wrote :

We have been hit by lots of issues we didn't have control over. We are slowly recovering. EM (Extended Maintenance) branches are not the main focus, hence why it takes the longest. External help is appreciated.

Radosław Piliszek (yoctozepto) wrote :

It's getting merged.

alpha23 (alpha23) wrote :

What is kolla-ansible's EM branch support policy? Ocata and Pike are still in EM (e.g. https://releases.openstack.org/ocata/), but kolla-ansible has both releases marked as 'Obsolete'.

alpha23 (alpha23) wrote :

Also, it appears that the releases specified in the timeline, https://launchpad.net/kolla-ansible/+series, do not correspond to what is available in pip, e.g.

ERROR: Could not find a version that satisfies the requirement kolla-ansible==8.2.1 (from versions: 4.0.0.0b2, 4.0.0.0b3, 4.0.0.0rc1, 4.0.0.0rc2, 4.0.0, 4.0.1, 4.0.2, 4.0.3, 4.0.4, 4.0.5, 5.0.0.0b2, 5.0.0.0b3, 5.0.0.0rc1, 5.0.0.0rc2, 5.0.0, 5.0.1, 5.0.2, 5.0.3, 5.0.4, 5.0.5, 5.0.6, 6.0.0.0b2, 6.0.0.0b3, 6.0.0.0rc1, 6.0.0.0rc2, 6.0.0, 6.1.0, 6.1.1, 6.2.0, 6.2.1, 6.2.2, 6.2.3, 7.0.0.0b2, 7.0.0.0b3, 7.0.0.0rc1, 7.0.0.0rc2, 7.0.0.0rc3, 7.0.0, 7.0.1, 7.1.0, 7.1.1, 7.1.2, 7.2.0, 7.2.1, 8.0.0.0b1, 8.0.0.0rc1, 8.0.0.0rc2, 8.0.0, 8.0.1, 8.1.0, 8.1.1, 8.2.0, 8.3.0, 9.0.0.0rc1, 9.0.0.0rc2, 9.0.0.0rc3, 9.0.0, 9.0.1, 9.1.0, 9.2.0, 9.3.0, 9.3.1)
ERROR: No matching distribution found for kolla-ansible==8.2.1

Please advise.

Mark Goddard (mgoddard) wrote :

Hi. Typically what we aim for is to have one release in EM at any time. Currently that is Stein. In that phase we will accept patches but will not actively maintain the code.

We make no promises upstream - maintenance (as opposed to support) is provided based on the resources available in the team at any time. We are happy to accept help.

As for the releases, the final release was 8.3.0 rather than 8.2.1. LP requires us to enter a version number but when the release is cut SemVer may require us to change the number. On this occasion I forgot to update LP to match. Please share if there are other discrepancies.

alpha23 (alpha23) wrote :

Per the timeline, https://launchpad.net/kolla-ansible/+series, Queens' status is 'Current Stable Release' and Rocky's status is 'Supported'. Also, Ussuri's status is 'Pre-release Freeze'. Is this information correct?

If there is a way to extend support for those releases that have defects, that would alleviate upgrade concerns. For example, Rocky has 12 new and 4 in-progress defects which would be a concern if I were to be upgrading to Rocky. These defects would not necessarily be an issue if a test environment was being upgraded but upgrading a production environment is a different story.

Importantly though, there are often on-going defects in the underlying services, unrelated to Kolla-ansible, which can cause issues/make upgrading not an immediate option. Removing support for kolla-ansible prior to these underlying services resolving defects is a risky upgrade strategy in a production environment. I appreciate your support and responses to issues/questions.

Mark Goddard (mgoddard) wrote :

I will update the status of the releases.

It's simply a matter of what is practical. We are a small core team, often from startups, universities, etc., and in some cases maintaining Kolla may not even be part of our $job. OpenStack has two releases a year. Our current focus is Wallaby, and there are 4 releases between there and Rocky. Keeping all of those in good shape with working CI would be a very large endeavour.

My advice would be to stay as close to the latest release as you are able to, and not fall behind. Without commercial support that's the only way to play.

alpha23 (alpha23) wrote :

I realize our conversation has drifted away from the original issue, but this may be as good a place as any to have it. Firstly, I want to compliment the kolla-ansible team on responding to these issues, and I do understand your statements above.

However, the bifurcated support schedules between kolla-ansible and the rest of the OS community (see also https://docs.openstack.org/project-team-guide/stable-branches.html#maintenance-phases and https://releases.openstack.org/) create issues: bug fixes are rolled into extended maintenance (EM) branches, but these changes are not, at the very least, regression tested with kolla-ansible.

One of the reasons for the change to the EM schedules was obviously to reduce the pain associated with upgrading. As a reference, my company's upgrade from Queens to Rocky via kolla-ansible caused several weeks of downtime. Beyond debugging at least one of the core project upgrades, many of the services just do not work correctly and have therefore been removed or are not used. For example, the Freezer and Karbor dashboards break Horizon and therefore had to be removed. Searchlight also fails; I reported the issue in May 2020 and there has been no response (https://bugs.launchpad.net/kolla-ansible/+bug/1881222). I believe these should have been picked up in regression testing and/or prechecks; they were not picked up in at least the prechecks.

One suggestion is to remove support for projects and/or functionality that are unstable or have not reached a specific maturity level. Maintaining a release schedule that is consistent with the OS community should be a priority for the kolla-ansible project. Again, I appreciate your support and I do like the project.

Switching back to the original issue, has the merged change made its way into the containers such that a kolla-ansible pull will pull corrected zun containers?

Mark Goddard (mgoddard) wrote :

Per the linked document, EM is provided "While there are community members maintaining it." All I can say is, if you want to come and maintain these older branches upstream, we will be happy to assist.

I'm afraid that due to the huge number of projects supported by Kolla Ansible, some of the non-core projects may rot over time due to lack of testing. Indeed, Karbor and Searchlight have been removed from OpenStack governance during the Wallaby cycle due to lack of maintainers.

The patch merged in stable/rocky on September 4th. You will need to use the stable/rocky branch rather than a tag to get it - no releases are cut after EM. https://review.opendev.org/c/openstack/kolla/+/737198
