zun-compute does not work in rocky due to missing configuration in kolla and kolla-ansible.

Bug #1797448 reported by Andreas Merk
This bug affects 5 people
Affects          Status          Importance   Assigned to          Milestone
kolla            Fix Released    Medium       Radosław Piliszek
  Rocky          Fix Committed   Medium       Radosław Piliszek
  Stein          Fix Released    Medium       Radosław Piliszek
  Train          Fix Released    Medium       Radosław Piliszek
kolla-ansible    Fix Released    High         Radosław Piliszek
  Rocky          Fix Committed   High         Radosław Piliszek
  Stein          Fix Released    High         Eduardo Gonzalez
  Train          Fix Released    High         Radosław Piliszek

Bug Description

Zun 2.1 checks the regular /var/lib/docker path and requires at least that the directory exists:
2018-10-10 13:21:50.486 8 INFO zun.container.docker.driver [-] There was a problem while executing df -B 1G /var/lib/docker
2018-10-10 13:21:50.487 8 ERROR zun.common.exception [-] Exception in string format operation, kwargs: {'code': 500, 'cmd': 'df'}: KeyError: 'error'
2018-10-10 13:21:50.487 8 ERROR zun.common.exception Traceback (most recent call last):
2018-10-10 13:21:50.487 8 ERROR zun.common.exception File "/var/lib/kolla/venv/local/lib/python2.7/site-packages/zun/common/exception.py", line 204, in __init__
2018-10-10 13:21:50.487 8 ERROR zun.common.exception self.message = str(self.message) % kwargs
2018-10-10 13:21:50.487 8 ERROR zun.common.exception KeyError: 'error'
2018-10-10 13:21:50.487 8 ERROR zun.common.exception
2018-10-10 13:21:50.488 8 ERROR oslo_service.periodic_task [-] Error during Manager.inventory_host: CommandError: The command: %(cmd)s failed on the system, due to %(error)s
2018-10-10 13:21:50.488 8 ERROR oslo_service.periodic_task Traceback (most recent call last):
2018-10-10 13:21:50.488 8 ERROR oslo_service.periodic_task File "/var/lib/kolla/venv/local/lib/python2.7/site-packages/oslo_service/periodic_task.py", line 220, in run_periodic_tasks
2018-10-10 13:21:50.488 8 ERROR oslo_service.periodic_task task(self, context)
2018-10-10 13:21:50.488 8 ERROR oslo_service.periodic_task File "/var/lib/kolla/venv/local/lib/python2.7/site-packages/zun/compute/manager.py", line 1060, in inventory_host
2018-10-10 13:21:50.488 8 ERROR oslo_service.periodic_task rt.update_available_resources(context)
2018-10-10 13:21:50.488 8 ERROR oslo_service.periodic_task File "/var/lib/kolla/venv/local/lib/python2.7/site-packages/zun/compute/compute_node_tracker.py", line 65, in update_available_resources
2018-10-10 13:21:50.488 8 ERROR oslo_service.periodic_task self.container_driver.get_available_resources(node)
2018-10-10 13:21:50.488 8 ERROR oslo_service.periodic_task File "/var/lib/kolla/venv/local/lib/python2.7/site-packages/zun/container/driver.py", line 251, in get_available_resources
2018-10-10 13:21:50.488 8 ERROR oslo_service.periodic_task disk_total = self.get_total_disk_for_container()
2018-10-10 13:21:50.488 8 ERROR oslo_service.periodic_task File "/var/lib/kolla/venv/local/lib/python2.7/site-packages/zun/container/docker/driver.py", line 1082, in get_total_disk_for_container
2018-10-10 13:21:50.488 8 ERROR oslo_service.periodic_task raise exception.CommandError(cmd='df')
2018-10-10 13:21:50.488 8 ERROR oslo_service.periodic_task CommandError: The command: %(cmd)s failed on the system, due to %(error)s
2018-10-10 13:21:50.488 8 ERROR oslo_service.periodic_task

Please add "mkdir -p /var/lib/docker" to the Dockerfile.j2 or give an option to change the "docker_data_root" to somewhere else.
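
The KeyError in the first traceback appears to come from formatting the message template with a kwargs dict that has no 'error' key. A minimal sketch of that failure mode follows, using the template and kwargs copied from the log; format_message is a hypothetical helper for illustration, not Zun's actual code.

```python
# Template and kwargs copied from the log above; format_message is a
# hypothetical helper for illustration, not part of Zun.
TEMPLATE = "The command: %(cmd)s failed on the system, due to %(error)s"


def format_message(template, **kwargs):
    """Format template with kwargs, falling back to the raw template."""
    try:
        return template % kwargs
    except KeyError:
        # A placeholder has no matching key, so the template is returned as-is.
        return template


# Missing 'error' leaves the placeholders unformatted, as seen in the log.
unformatted = format_message(TEMPLATE, code=500, cmd="df")
# Supplying both keys yields a readable message.
formatted = format_message(TEMPLATE, cmd="df", error="No such file or directory")
```

This explains why the logged CommandError still shows the raw %(cmd)s and %(error)s placeholders: the real cause (the failing df call) is hidden behind the formatting error.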

Secondly, pciutils are missing:

2018-10-11 20:10:29.969 8 DEBUG oslo_concurrency.processutils [-] Running cmd (subprocess): lspci -D -nnmm execute /var/lib/kolla/venv/local/lib/python2.7/site-packages/oslo_concurrency/processutils.py:372
2018-10-11 20:10:30.308 8 DEBUG oslo_concurrency.processutils [-] u'lspci -D -nnmm' failed. Not Retrying. execute /var/lib/kolla/venv/local/lib/python2.7/site-packages/oslo_concurrency/processutils.py:457
2018-10-11 20:10:30.310 8 ERROR oslo_service.periodic_task [-] Error during Manager.inventory_host: OSError: [Errno 2] No such file or directory
2018-10-11 20:10:30.310 8 ERROR oslo_service.periodic_task Traceback (most recent call last):
2018-10-11 20:10:30.310 8 ERROR oslo_service.periodic_task File "/var/lib/kolla/venv/local/lib/python2.7/site-packages/oslo_service/periodic_task.py", line 220, in run_periodic_tasks
2018-10-11 20:10:30.310 8 ERROR oslo_service.periodic_task task(self, context)
2018-10-11 20:10:30.310 8 ERROR oslo_service.periodic_task File "/var/lib/kolla/venv/local/lib/python2.7/site-packages/zun/compute/manager.py", line 1060, in inventory_host
2018-10-11 20:10:30.310 8 ERROR oslo_service.periodic_task rt.update_available_resources(context)
2018-10-11 20:10:30.310 8 ERROR oslo_service.periodic_task File "/var/lib/kolla/venv/local/lib/python2.7/site-packages/zun/compute/compute_node_tracker.py", line 66, in update_available_resources
2018-10-11 20:10:30.310 8 ERROR oslo_service.periodic_task self._setup_pci_tracker(context, node)
2018-10-11 20:10:30.310 8 ERROR oslo_service.periodic_task File "/var/lib/kolla/venv/local/lib/python2.7/site-packages/zun/compute/compute_node_tracker.py", line 46, in _setup_pci_tracker
2018-10-11 20:10:30.310 8 ERROR oslo_service.periodic_task dev_json = self.container_driver.get_pci_resources()
2018-10-11 20:10:30.310 8 ERROR oslo_service.periodic_task File "/var/lib/kolla/venv/local/lib/python2.7/site-packages/zun/container/driver.py", line 187, in get_pci_resources
2018-10-11 20:10:30.310 8 ERROR oslo_service.periodic_task return os_capability_linux.LinuxHost().get_pci_resources()
2018-10-11 20:10:30.310 8 ERROR oslo_service.periodic_task File "/var/lib/kolla/venv/local/lib/python2.7/site-packages/zun/container/os_capability/host_capability.py", line 80, in get_pci_resources
2018-10-11 20:10:30.310 8 ERROR oslo_service.periodic_task output, status = utils.execute('lspci', '-D', '-nnmm')
2018-10-11 20:10:30.310 8 ERROR oslo_service.periodic_task File "/var/lib/kolla/venv/local/lib/python2.7/site-packages/zun/common/utils.py", line 354, in execute
2018-10-11 20:10:30.310 8 ERROR oslo_service.periodic_task return custom_execute(*cmd, **kwargs)
2018-10-11 20:10:30.310 8 ERROR oslo_service.periodic_task File "/var/lib/kolla/venv/local/lib/python2.7/site-packages/zun/common/utils.py", line 325, in custom_execute
2018-10-11 20:10:30.310 8 ERROR oslo_service.periodic_task return processutils.execute(*cmd, **kwargs)
2018-10-11 20:10:30.310 8 ERROR oslo_service.periodic_task File "/var/lib/kolla/venv/local/lib/python2.7/site-packages/oslo_concurrency/processutils.py", line 391, in execute
2018-10-11 20:10:30.310 8 ERROR oslo_service.periodic_task env=env_variables)
2018-10-11 20:10:30.310 8 ERROR oslo_service.periodic_task File "/var/lib/kolla/venv/local/lib/python2.7/site-packages/eventlet/green/subprocess.py", line 58, in __init__
2018-10-11 20:10:30.310 8 ERROR oslo_service.periodic_task subprocess_orig.Popen.__init__(self, args, 0, *argss, **kwds)
2018-10-11 20:10:30.310 8 ERROR oslo_service.periodic_task File "/usr/lib/python2.7/subprocess.py", line 394, in __init__
2018-10-11 20:10:30.310 8 ERROR oslo_service.periodic_task errread, errwrite)
2018-10-11 20:10:30.310 8 ERROR oslo_service.periodic_task File "/usr/lib/python2.7/subprocess.py", line 1047, in _execute_child
2018-10-11 20:10:30.310 8 ERROR oslo_service.periodic_task raise child_exception
2018-10-11 20:10:30.310 8 ERROR oslo_service.periodic_task OSError: [Errno 2] No such file or directory
2018-10-11 20:10:30.310 8 ERROR oslo_service.periodic_task
2018-10-11 20:10:50.452 8 DEBUG oslo_service.periodic_task [req-cc4ba0c9-b0f6-43a4-906e-5cd0d13fdf75 - - - - -] Running periodic task ZunServicePeriodicTasks.update_zun_service run_periodic_tasks /var/lib/kolla/venv/local/lib/python2.7/site-packages/oslo_service/periodic_task.py:215

Please add the package "pciutils" to the list of required packages for zun-base.
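
Both failures come down to an external command or path that is absent inside the container. A minimal sketch of a check that could be run inside the image to spot missing binaries up front; missing_commands is a hypothetical helper, and the command list is taken from the two logs above.

```python
import shutil


def missing_commands(commands):
    """Return the commands from the list that are not found on PATH."""
    return [cmd for cmd in commands if shutil.which(cmd) is None]


# 'df' and 'lspci' are the two commands seen in the logs above.
print(missing_commands(["df", "lspci"]))
```

An empty list means both commands are available; on an image without pciutils the list would be expected to contain lspci.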

Andreas Merk (amerk)
description: updated
Revision history for this message
Andreas Merk (amerk) wrote :

I now have zun-compute running quite well, but it needs the following adjustments in kolla:
Packages in the container:
pciutils
ceph-common

Folder to be created in the container:
/var/lib/docker

Still open is how we deal with udev. Currently I also have ceph-common installed on the host to get the udev rules in /lib/udev/rules.d/50-rbd.rules. In theory this can also be done from the container, but then we need a stripped-down udev that handles only the block devices.
Without udev the rbd devices do not show up in the expected /dev/rbd/volume location. They just appear under /dev as rbd0, rbd1. Changing the location in the cinder client could be an option too.

Revision history for this message
Andreas Merk (amerk) wrote :

Info: I will add the kolla-ansible fixes later.

affects: kolla → kolla-ansible
affects: kolla-ansible → kolla
Revision history for this message
Andreas Merk (amerk) wrote :

Release with the issues is: rocky

summary: - zun requires further configuration settings
+ zun-compute does not work in rocky due to missing configuration in kolla
+ and kolla-ansible.
Revision history for this message
Andreas Merk (amerk) wrote :

Things missing in kolla-ansible (not all have a solution yet):
We require a configuration similar to that of nova-compute. Please add the following lines to "ansible/roles/zun/defaults/main.yml":
zun_services:
  zun-compute:
    volumes:
      - "/lib/modules:/lib/modules:ro"
      - "/dev:/dev"
      - "{% if enable_iscsid | bool %}iscsi_info:/etc/iscsi{% endif %}"
The last line is not tested yet, but it could be required if a storage backend other than Ceph is used.
Furthermore, we need the following files in the container under /etc/ceph:
ceph.conf
ceph.client.cinder.keyring
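
A minimal sketch of a check for the two Ceph files named above; missing_ceph_files is a hypothetical helper, and the directory is a parameter so the check can be pointed at any location.

```python
import os

# File names taken from the comment above; the directory is a parameter so
# the check can also be pointed at a test directory.
REQUIRED_CEPH_FILES = ("ceph.conf", "ceph.client.cinder.keyring")


def missing_ceph_files(ceph_dir="/etc/ceph"):
    """Return the required Ceph files that are absent from ceph_dir."""
    return [
        name
        for name in REQUIRED_CEPH_FILES
        if not os.path.isfile(os.path.join(ceph_dir, name))
    ]
```

Running missing_ceph_files() inside the zun_compute container should return an empty list once both files are in place.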

Additionally, we have to deal with Ceph features for which the Linux kernel lags behind libceph in kvm/qemu. I had to disable some Ceph RBD features by setting "rbd_default_features = 7" in /etc/ceph/ceph.conf in the cinder-volume container. But this now also has a negative impact on all newly created Ceph volumes.
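
The value 7 is a bitmask of RBD image features. A minimal sketch that decodes such a mask follows; the bit-to-name mapping is assumed here from the Ceph documentation and should be checked against the Ceph release in use, and decode_rbd_features is a hypothetical helper.

```python
# Bit values of RBD image features, assumed from the Ceph documentation;
# verify them against the Ceph release in use.
RBD_FEATURES = {
    1: "layering",
    2: "striping",
    4: "exclusive-lock",
    8: "object-map",
    16: "fast-diff",
    32: "deep-flatten",
    64: "journaling",
}


def decode_rbd_features(mask):
    """Return the feature names whose bits are set in mask."""
    return [name for bit, name in sorted(RBD_FEATURES.items()) if mask & bit]


print(decode_rbd_features(7))
```

Under that mapping, 7 keeps only layering, striping and exclusive-lock, which would explain why every volume created with this setting loses the newer features.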

Revision history for this message
Andreas Merk (amerk) wrote :

With these changes I got zun up and running under kolla-rocky. The only thing I'm a bit disappointed with is the zun-ui. It covers everything, but the presentation of information is quite poor compared to nova.

Revision history for this message
hongbin (hongbin034) wrote :

@Andreas,

Thanks for the detailed information. Re. the lspci issue. Zun uses 'lspci' for listing the VF (SRIOV feature). If you are not using SRIOV, simply disregard the error.

Re. the '/var/lib/docker' issue, Zun uses it to detect the total disk size available for the docker daemon. I am not sure how to deal with it in a containerized environment (i.e. kolla) but would be open for proposals for that.

Re. the Zun UI issue, I would be happy to improve that. If you can be specific about what can be improved, I will look into that.

Revision history for this message
Gaëtan Trellu (goldyfruit) wrote :

Just a quick addition, /var/lib/zun needs to be created on the host and mounted inside zun_compute container.

zun_services:
  zun-compute:
    volumes:
      - "/lib/modules:/lib/modules:ro"
      - "/dev:/dev"
      - "/var/lib/zun:/var/lib/zun:rw"

Permissions should be set on /var/lib/zun to allow the zun user to create objects from the zun_compute container.

Revision history for this message
Gaëtan Trellu (goldyfruit) wrote :

Made a typo above, replace:
       - "/var/lib/zun:/var/lib/zun:rw"
By:
       - "/var/lib/zun:/var/lib/zun:shared"

Revision history for this message
Mark Goddard (mgoddard) wrote :

zun-compute now contains pciutils, but not ceph-common.

Changed in kolla:
importance: Undecided → Medium
Changed in kolla-ansible:
importance: Undecided → Medium
Revision history for this message
Radosław Piliszek (yoctozepto) wrote :

there seem to be multiple issues reported in this bug report, need to test storage in zun for sure then; adding myself to the bug loop

Revision history for this message
Radosław Piliszek (yoctozepto) wrote :

Zun reports "ceph-common package is not installed" when using ceph volumes - bug triaged.

Changed in kolla:
assignee: nobody → Radosław Piliszek (yoctozepto)
Revision history for this message
Radosław Piliszek (yoctozepto) wrote :

Backported pciutils addition to Rocky for completeness.

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Related fix proposed to kolla (stable/rocky)

Related fix proposed to branch: stable/rocky
Review: https://review.opendev.org/682436

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Related fix proposed to kolla (master)

Related fix proposed to branch: master
Review: https://review.opendev.org/682437

Revision history for this message
Radosław Piliszek (yoctozepto) wrote :

after ceph-common in zun-compute:

did not load config file, using default settings.
2019-09-16 21:13:59.914 7fe5060ddb00 -1 Errors while parsing config file!
2019-09-16 21:13:59.914 7fe5060ddb00 -1 parse_file: cannot open /etc/ceph/ceph.conf: (2) No such file or directory
2019-09-16 21:13:59.914 7fe5060ddb00 -1 parse_file: cannot open /root/.ceph/ceph.conf: (2) No such file or directory
2019-09-16 21:13:59.914 7fe5060ddb00 -1 parse_file: cannot open ceph.conf: (2) No such file or directory
2019-09-16 21:13:59.915 7fe5060ddb00 -1 Errors while parsing config file!
2019-09-16 21:13:59.915 7fe5060ddb00 -1 parse_file: cannot open /etc/ceph/ceph.conf: (2) No such file or directory
2019-09-16 21:13:59.915 7fe5060ddb00 -1 parse_file: cannot open /root/.ceph/ceph.conf: (2) No such file or directory
2019-09-16 21:13:59.915 7fe5060ddb00 -1 parse_file: cannot open ceph.conf: (2) No such file or directory
2019-09-16 21:13:59.962 7fe5060ddb00 -1 auth: unable to find a keyring on /etc/ceph/ceph.client.cinder.keyring,/etc/ceph/ceph.keyring,/etc/ceph/keyring,/etc/ceph/keyring.bin,: (2) No such file or directory
modinfo: ERROR: Module alias rbd not found.
rbd: failed to load rbd kernel module (1)
rbd: sysfs write failed
rbd: map failed: (2) No such file or directory

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Related fix merged to kolla (stable/rocky)

Reviewed: https://review.opendev.org/682436
Committed: https://git.openstack.org/cgit/openstack/kolla/commit/?id=8b6a7845f331a39e6eaef899b0d2a3ccc7336379
Submitter: Zuul
Branch: stable/rocky

commit 8b6a7845f331a39e6eaef899b0d2a3ccc7336379
Author: Radosław Piliszek <email address hidden>
Date: Fri Aug 16 15:13:03 2019 +0200

    Add pciutils package to zun-compute

    It runs lspci and has this requirement in its bindep.

    CentOS images seem to have it installed but Ubuntu not.
    Ensure pciutils package is always installed.

    zun-compute seemed to work nonetheless but generated errors
    about being unable to take host inventory.
    (now we know it would fail for SRIOV)

    Change adapted for Rocky by ignoring cyborg (not present).

    Related-bug: #1797448
    Change-Id: Iebc1f2c05c1f57c2b6f98ca9772f1ae9d9a420fa
    Signed-off-by: Radosław Piliszek <email address hidden>
    (cherry picked from commit c323c5be7fd4ebcc71ca4e10a5c15b9ab1f29ba6)
    (cherry picked from commit 904f1fcbd7b1bf133547980eed7807b06a51ef3b)

tags: added: in-stable-rocky
Changed in kolla-ansible:
assignee: nobody → Radosław Piliszek (yoctozepto)
Changed in kolla:
status: New → In Progress
Changed in kolla-ansible:
status: New → In Progress
Changed in kolla:
milestone: none → 9.0.0
Changed in kolla-ansible:
milestone: none → 9.0.0
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Related fix proposed to kolla-ansible (master)

Related fix proposed to branch: master
Review: https://review.opendev.org/682900

Revision history for this message
Radosław Piliszek (yoctozepto) wrote :

(note copied from review)

There are two pending ceph issues; both arise from the dependency on the kernel's rbd module:

- the host's udev has no idea where to link the rbd device unless the host also has ceph-common installed (it needs a udev rule and a helper program with its accompanying libs; there is no clean solution except installing it on the host)
- Cinder creates rbd volumes using the latest features, while the kernel client most likely lags behind and needs a feature downgrade, which Cinder does not currently support afaik (unless we downgrade every volume, but this makes it less performant for the rest of the stack)

Revision history for this message
Radosław Piliszek (yoctozepto) wrote :

At least iSCSI works out-of-the-box.

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Related fix merged to kolla (master)

Reviewed: https://review.opendev.org/682437
Committed: https://git.openstack.org/cgit/openstack/kolla/commit/?id=ed29cb61b808187288479a050053ca8604b9a123
Submitter: Zuul
Branch: master

commit ed29cb61b808187288479a050053ca8604b9a123
Author: Radosław Piliszek <email address hidden>
Date: Mon Sep 16 19:35:47 2019 +0200

    Add volume-related packages to zun-compute image

    This lets Zun containers use Cinder volumes.

    Zun-compute formats attached block devices.
    ext* and xfs are the most common filesystems and have tools
    included in other relevant images.

    In case of Ceph ceph-common is required for rbd attachment.

    iSCSI is supported via packages installed in base.

    Change-Id: Ib094ae8fcc8468949b3cd162a1cedba3fdfd3a47
    Related-bug: #1797448
    Signed-off-by: Radosław Piliszek <email address hidden>

Revision history for this message
Mark Goddard (mgoddard) wrote :

Looks like this one is fixed now?

Revision history for this message
Radosław Piliszek (yoctozepto) wrote :

I will handle this.

Revision history for this message
Radosław Piliszek (yoctozepto) wrote :

kolla patches (missing pciutils): https://review.opendev.org/#/q/Iebc1f2c05c1f57c2b6f98ca9772f1ae9d9a420fa

kolla-ansible patches (missing /var/lib/docker):
(stein/master) https://review.opendev.org/617341
(rocky) pending

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to kolla-ansible (stable/rocky)

Fix proposed to branch: stable/rocky
Review: https://review.opendev.org/689654

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to kolla-ansible (stable/rocky)

Reviewed: https://review.opendev.org/689654
Committed: https://git.openstack.org/cgit/openstack/kolla-ansible/commit/?id=832791e6017167cee79631e026d07511a77d5dc0
Submitter: Zuul
Branch: stable/rocky

commit 832791e6017167cee79631e026d07511a77d5dc0
Author: Radosław Piliszek <email address hidden>
Date: Sun Oct 20 18:51:21 2019 +0200

    [rocky-only] Fix Zun Compute access to /var/lib/docker

    Stein and later were already patched by [1].

    [1] https://review.opendev.org/617341

    Change-Id: I6a95b2bdec268b8f54dd99eeae7c30bb57b01f0d
    Closes-bug: #1797448

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Related fix merged to kolla-ansible (master)

Reviewed: https://review.opendev.org/682900
Committed: https://git.openstack.org/cgit/openstack/kolla-ansible/commit/?id=2447c161b8887e130ff2586dee2413c2850572a6
Submitter: Zuul
Branch: master

commit 2447c161b8887e130ff2586dee2413c2850572a6
Author: Radosław Piliszek <email address hidden>
Date: Tue Sep 17 13:47:40 2019 +0200

    Zun: add Cinder (volume) iSCSI support

    This makes iSCSI with Zun and Cinder work out-of-the-box.
    (LVM included)

    Change-Id: I2c4aa666ff9f952e32c450347d31ddbf2e58209e
    Related-bug: #1797448
    Closes-bug: #1848933

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Related fix proposed to kolla-ansible (master)

Related fix proposed to branch: master
Review: https://review.opendev.org/690614

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Related fix merged to kolla-ansible (master)

Reviewed: https://review.opendev.org/690614
Committed: https://git.openstack.org/cgit/openstack/kolla-ansible/commit/?id=01135e0f4d330c3bb7829fb1f1ccfee00fa0a973
Submitter: Zuul
Branch: master

commit 01135e0f4d330c3bb7829fb1f1ccfee00fa0a973
Author: Radosław Piliszek <email address hidden>
Date: Wed Oct 23 17:27:34 2019 +0200

    Zun: fix Cinder (volume) iSCSI support

    Fixes /dev volume, missed in the previous commit [1].
    iscsid was installing the devices in /dev but zun-compute
    could not see them due to the emulated /dev volume.

    [1] I2c4aa666ff9f952e32c450347d31ddbf2e58209e

    Change-Id: Ic72c910b92765b9a6f59224728e9c4ea2482bc5a
    Related-bug: #1797448
    Related-bug: #1848933

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Related fix proposed to kolla-ansible (master)

Related fix proposed to branch: master
Review: https://review.opendev.org/703274

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/kolla-ansible 7.2.0

This issue was fixed in the openstack/kolla-ansible 7.2.0 release.
