nova_virtlogd container fails with permission denied

Bug #1819482 reported by Michele Baldessari on 2019-03-11
This bug affects 1 person
Affects: tripleo
Importance: High
Assigned to: Michele Baldessari

Bug Description

On a RHEL 8 host with RHEL 8 containers, we seem to be getting the following error from kolla in nova_virtlogd:

[stack@win1 ~]$ sudo podman logs nova_virtlogd
+sudo -E kolla_set_configs OSError: [Errno 30] Read-only file system

The reason is the following nova_virtlogd bind mount:
 ... - /etc/libvirt/qemu:/etc/libvirt/qemu:ro

It seems the kolla config contains some empty folders that are then copied onto a :ro filesystem, and we fail:
()[root@win1 /]$ find var/lib/kolla/config_files/src/
var/lib/kolla/config_files/src/
var/lib/kolla/config_files/src/etc
var/lib/kolla/config_files/src/etc/libvirt
var/lib/kolla/config_files/src/etc/libvirt/qemu
var/lib/kolla/config_files/src/etc/libvirt/qemu/networks
var/lib/kolla/config_files/src/etc/libvirt/qemu/networks/autostart
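
To check a config source tree for this condition before the container starts, something like the following could list the directories that hold no files at any depth. This is only a rough sketch: the helper name empty_dirs is hypothetical, and it does not reproduce what kolla_set_configs does internally. It only mirrors the find output above.

```python
import os


def empty_dirs(src_root):
    """Return directories under src_root (relative paths) that contain
    no regular files at any depth. Hypothetical helper for illustration."""
    has_file = {}
    result = []
    # Walk bottom-up so every child directory is classified before its parent.
    for dirpath, dirnames, filenames in os.walk(src_root, topdown=False):
        contains = bool(filenames) or any(
            has_file.get(os.path.join(dirpath, d), False) for d in dirnames
        )
        has_file[dirpath] = contains
        if not contains and dirpath != src_root:
            result.append(os.path.relpath(dirpath, src_root))
    return sorted(result)
```

Any path it reports under etc/libvirt/qemu would need to be created on the read-only bind mount, which is where the copy fails.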

It seems the empty folders are copied in because of https://github.com/openstack/puppet-tripleo/blob/master/manifests/profile/base/nova/libvirt.pp#L58-L63, where we remove default.xml. That removal likely changes the timestamps of those folders, so they get copied into /var/lib/config-data/puppet-generated. Kolla then tries to copy those directories into place and fails because of the read-only bind mount.
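
The timestamp part of this theory is easy to confirm in isolation: removing an entry from a directory updates that directory's mtime. The sketch below shows only that filesystem behaviour, using a temporary directory and a stand-in default.xml. The function name is made up for illustration, and whether the config sync really selects directories by timestamp is an assumption taken from the description above.

```python
import os
import tempfile


def removal_bumps_parent_mtime():
    """Return (mtime_before, mtime_after) of a directory around the
    removal of a file inside it. Illustration only."""
    with tempfile.TemporaryDirectory() as root:
        autostart = os.path.join(root, "autostart")
        os.mkdir(autostart)
        entry = os.path.join(autostart, "default.xml")
        open(entry, "w").close()

        # Backdate the directory so it looks untouched since image build time.
        old = 1_000_000_000
        os.utime(autostart, (old, old))
        before = os.stat(autostart).st_mtime

        # Removing the entry modifies the directory, which updates its mtime.
        os.remove(entry)
        after = os.stat(autostart).st_mtime
        return before, after
```

After the removal the directory is empty but looks newer than it did before, which fits the empty folders showing up in the generated config.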

Note that we cannot just switch the mount to read-write, because SELinux denies writing to /etc inside the container.

Changed in tripleo:
status: New → Triaged
importance: Undecided → High
assignee: nobody → Michele Baldessari (michele)
milestone: none → stein-3
Bogdan Dobrelya (bogdando) wrote :

I think any Puppet file resources that rely on before => Service['libvirt'] should be no-oped via a custom tag that tht adds and filters on. Instead, we should manage such files via host prep tasks in tht directly. That would stop containers from writing to /etc, and it would also "heal" the leaked abstraction, which is what happens when containers attempt anything related to managing disabled services or installing packages. We don't do those things in containers, and we shouldn't do them for their orphaned dependencies either.

Bogdan Dobrelya (bogdando) wrote :

More generally, we should make tht no-op anything Puppet-ish of the form before/after => Service/Package['foo'], and do that work via Ansible-driven host prep tasks in tht instead.

Dan Prince (dan-prince) wrote :

Rather than getting more complex with Puppet no-ops, we might consider exposing a more generic "excludes" option in container-puppet.py (docker-puppet.py). That way we could pass an array of files to container-puppet.py for the nova-libvirt container and have those skipped very cleanly.
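
A minimal sketch of what such an "excludes" filter might look like, assuming the tool already has a list of relative paths it is about to sync. The function name filter_excluded and the glob-style patterns are hypothetical. Neither is an existing container-puppet.py option.

```python
import fnmatch


def filter_excluded(paths, excludes):
    """Drop every path that matches one of the glob-style exclude patterns.
    Hypothetical helper; not part of container-puppet.py."""
    return [
        path
        for path in paths
        if not any(fnmatch.fnmatchcase(path, pattern) for pattern in excludes)
    ]


# Example: skip the qemu directory itself and everything beneath it.
candidates = [
    "etc/libvirt/libvirtd.conf",
    "etc/libvirt/qemu",
    "etc/libvirt/qemu/networks",
    "etc/libvirt/qemu/networks/autostart",
]
kept = filter_excluded(candidates, ["etc/libvirt/qemu", "etc/libvirt/qemu/*"])
```

In fnmatch patterns, * also matches path separators, so the one pattern etc/libvirt/qemu/* covers the whole subtree.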

Fix proposed to branch: master
Review: https://review.openstack.org/642549

Changed in tripleo:
status: Triaged → In Progress

Reviewed: https://review.openstack.org/642549
Committed: https://git.openstack.org/cgit/openstack/puppet-tripleo/commit/?id=7643761601a89823efee9bbbd92f2d05106fc129
Submitter: Zuul
Branch: master

commit 7643761601a89823efee9bbbd92f2d05106fc129
Author: Michele Baldessari <email address hidden>
Date: Mon Mar 11 20:14:09 2019 +0100

    Fix kolla permissions errors inside nova_virtlogd

    [stack@win1 ~]$ sudo podman logs nova_virtlogd
    +sudo -E kolla_set_configs OSError: [Errno 30] Read-only file system

    The reason is the following nova_virtlogd bind mount:
     ... - /etc/libvirt/qemu:/etc/libvirt/qemu:ro

    Seems kolla config has some empty folders that are then copied to a
    :ro fs and we fail:
    ()[root@win1 /]$ find var/lib/kolla/config_files/src/
    var/lib/kolla/config_files/src/
    var/lib/kolla/config_files/src/etc
    var/lib/kolla/config_files/src/etc/libvirt
    var/lib/kolla/config_files/src/etc/libvirt/qemu
    var/lib/kolla/config_files/src/etc/libvirt/qemu/networks
    var/lib/kolla/config_files/src/etc/libvirt/qemu/networks/autostart

    Since the above empty folders are due to the timestamp
    change caused by the default.xml file removal, let's do
    those on BM only.

    Hopefully compute folks can propose a more definitive fix that
    takes that default.xml removal logic into host-prep-task (if it
    is still needed)

    Tested on a RHEL8 OS/Containers combo and the error is gone:
    [root@overcloud-novacompute-0 ~]# podman logs nova_virtlogd 2>&1| tail -n5
    ++ chmod 755 /var/log/kolla/libvirt
    ++ chmod 644 /var/log/kolla/libvirt/libvirtd.log
    Running command: '/usr/sbin/virtlogd --config /etc/libvirt/virtlogd.conf'
    + echo 'Running command: '\''/usr/sbin/virtlogd --config /etc/libvirt/virtlogd.conf'\'''
    + exec /usr/sbin/virtlogd --config /etc/libvirt/virtlogd.conf

    Change-Id: I629e9e37aff9a1610df874b46c7a5b1eedd3e374
    Closes-Bug: #1819482

Changed in tripleo:
status: In Progress → Fix Released

This issue was fixed in the openstack/puppet-tripleo 10.4.0 release.
