cAdvisor cannot get container CPU usage and network traffic

Bug #2026321 reported by ZhangLong
Affects: kolla-ansible
Status: New
Importance: Undecided
Assigned to: Unassigned

Bug Description

Version: kolla-ansible v15.1.1
Grafana gets its data from prometheus_cadvisor, but the `CPU Usage per Container` panel shows no data.
So I ran 'docker logs prometheus_cadvisor' and got the following output:
+ sudo -E kolla_set_configs
INFO:__main__:Loading config file at /var/lib/kolla/config_files/config.json
INFO:__main__:Validating config file
INFO:__main__:Kolla config strategy set to: COPY_ALWAYS
INFO:__main__:Copying service configuration files
INFO:__main__:Writing out command to execute
INFO:__main__:Setting permission for /var/log/kolla/prometheus
++ cat /run_command
+ CMD='/opt/cadvisor --port=18080 --log_dir=/var/log/kolla/prometheus --docker_only --store_container_labels=false --disable_metrics=percpu,referenced_memory,cpu_topology,resctrl,udp,advtcp,sched,hugetlb,memory_numa,tcp,process'
+ ARGS=
+ sudo kolla_copy_cacerts
+ [[ ! -n '' ]]
+ . kolla_extend_start
++ [[ ! -d /var/log/kolla/prometheus ]]
+++ stat -c %a /var/log/kolla/prometheus
++ [[ 2755 != \7\5\5 ]]
++ chmod 755 /var/log/kolla/prometheus
+ echo 'Running command: '\''/opt/cadvisor --port=18080 --log_dir=/var/log/kolla/prometheus --docker_only --store_container_labels=false --disable_metrics=percpu,referenced_memory,cpu_topology,resctrl,udp,advtcp,sched,hugetlb,memory_numa,tcp,process'\'''
Running command: '/opt/cadvisor --port=18080 --log_dir=/var/log/kolla/prometheus --docker_only --store_container_labels=false --disable_metrics=percpu,referenced_memory,cpu_topology,resctrl,udp,advtcp,sched,hugetlb,memory_numa,tcp,process'
+ exec /opt/cadvisor --port=18080 --log_dir=/var/log/kolla/prometheus --docker_only --store_container_labels=false --disable_metrics=percpu,referenced_memory,cpu_topology,resctrl,udp,advtcp,sched,hugetlb,memory_numa,tcp,process
W0707 02:17:48.516824 7 manager.go:289] Could not configure a source for OOM detection, disabling OOM events: open /dev/kmsg: no such file or directory

I learned that, to fix this bug, cAdvisor needs to run as a privileged container or have /dev/kmsg mounted:
https://github.com/google/cadvisor/issues/2150

But how can this be resolved in Kolla-Ansible?

Thanks.

ZhangLong (ankele) wrote :

OS: Rocky Linux 9.1
kolla-ansible: stable/zed
Docker image: quay.io/openstack.kolla/prometheus-cadvisor:zed-rocky-9

ZhangLong (ankele) wrote :

I fixed this bug in kolla-ansible by adding `privileged: true`:
cat kolla-ansible/ansible/roles/prometheus/defaults/main.yml:
...
  prometheus-cadvisor:
    container_name: "prometheus_cadvisor"
    group: "prometheus-cadvisor"
    enabled: "{{ enable_prometheus_cadvisor | bool }}"
    image: "{{ prometheus_cadvisor_image_full }}"
    volumes: "{{ prometheus_cadvisor_default_volumes + prometheus_cadvisor_extra_volumes }}"
    dimensions: "{{ prometheus_cadvisor_dimensions }}"
    privileged: true
...

cat kolla-ansible/ansible/roles/prometheus/handlers/main.yml:
...
- name: Restart prometheus-cadvisor container
  vars:
    service_name: "prometheus-cadvisor"
    service: "{{ prometheus_services[service_name] }}"
  become: true
  kolla_docker:
    action: "recreate_or_restart_container"
    common_options: "{{ docker_common_options }}"
    name: "{{ service.container_name }}"
    image: "{{ service.image }}"
    volumes: "{{ service.volumes }}"
    dimensions: "{{ service.dimensions }}"
    privileged: True
  when:
    - kolla_action != "config"
...

After that, it works correctly.

ZhangLong (ankele) wrote :

And cat prometheus/tasks/check-containers.yml:
...
- name: Check prometheus containers
  become: true
  kolla_docker:
    action: "compare_container"
    common_options: "{{ docker_common_options }}"
    name: "{{ item.value.container_name }}"
    image: "{{ item.value.image }}"
    pid_mode: "{{ item.value.pid_mode | default('') }}"
    volumes: "{{ item.value.volumes }}"
    dimensions: "{{ item.value.dimensions }}"
    environment: "{{ item.value.environment | default(omit) }}"
    privileged: True
  when:
    - inventory_hostname in groups[item.value.group]
    - item.value.enabled | bool
  with_dict: "{{ prometheus_services }}"
  notify:
    - "Restart {{ item.key }} container"
...
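
Note that hard-coding `privileged: True` in this task applies the flag to every entry in `prometheus_services`, not only to cAdvisor. A possible refinement, sketched below and not tested against stable/zed, is to read the flag from each service definition, so that only services that declare `privileged: true` in defaults/main.yml (as prometheus-cadvisor does above) receive it. The `item.value.privileged | default(False)` lookup is an assumption modelled on the `default(...)` filters already used in this task.

```yaml
# Sketch only: per-service privileged flag instead of a hard-coded True.
- name: Check prometheus containers
  become: true
  kolla_docker:
    action: "compare_container"
    common_options: "{{ docker_common_options }}"
    name: "{{ item.value.container_name }}"
    image: "{{ item.value.image }}"
    pid_mode: "{{ item.value.pid_mode | default('') }}"
    volumes: "{{ item.value.volumes }}"
    dimensions: "{{ item.value.dimensions }}"
    environment: "{{ item.value.environment | default(omit) }}"
    # Falls back to False for services that do not define the key.
    privileged: "{{ item.value.privileged | default(False) }}"
  when:
    - inventory_hostname in groups[item.value.group]
    - item.value.enabled | bool
  with_dict: "{{ prometheus_services }}"
  notify:
    - "Restart {{ item.key }} container"
```

The same idea could be applied in the restart handler by using `"{{ service.privileged | default(False) }}"` in place of the literal `True`.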
