A lot of ERRORs in Ceilometer logs that it cannot obtain IP address

Bug #1576168 reported by Sergey Arkhipov
This bug affects 1 person
Affects: Mirantis OpenStack (status tracked in 10.0.x)
Series: 10.0.x
Status: Invalid
Importance: High
Assigned to: MOS Ceilometer

Bug Description

Detailed bug description:
I found a lot of ERROR entries in the Ceilometer logs saying that it cannot obtain the IP address of an instance:

2016-04-28 09:03:06.254 14501 ERROR ceilometer.hardware.discovery [req-fc590e4d-3e11-48d3-98f7-ee94af3c43ec admin - - - -] Couldn't obtain IP address of instance 5a4c7651-7d3e-4bab-b8ee-fdda9be15703
...
2016-04-28 09:42:55.889 14501 ERROR ceilometer.hardware.discovery [req-fc590e4d-3e11-48d3-98f7-ee94af3c43ec admin - - - -] Couldn't obtain IP address of instance bfd24135-e279-4b7f-be7a-1aadf593db04

Meanwhile, I see no problem with the mentioned instances; they are up and running.

(.venv) [root@fuel work]# nova show 5a4c7651-7d3e-4bab-b8ee-fdda9be15703
+--------------------------------------+----------------------------------------------------------+
| Property | Value |
+--------------------------------------+----------------------------------------------------------+
| OS-DCF:diskConfig | AUTO |
| OS-EXT-AZ:availability_zone | nova |
| OS-EXT-SRV-ATTR:host | node-54.domain.tld |
| OS-EXT-SRV-ATTR:hypervisor_hostname | node-54.domain.tld |
| OS-EXT-SRV-ATTR:instance_name | instance-000010bc |
| OS-EXT-STS:power_state | 1 |
| OS-EXT-STS:task_state | - |
| OS-EXT-STS:vm_state | active |
| OS-SRV-USG:launched_at | 2016-04-27T17:03:41.000000 |
| OS-SRV-USG:terminated_at | - |
| accessIPv4 | |
| accessIPv6 | |
| admin_internal_net network | 192.168.108.74 |
| config_drive | |
| created | 2016-04-27T16:58:28Z |
| flavor | gig (e8c39500-0a22-4c25-afe2-f8ed15f61e3d) |
| hostId | 0731685cfe27ffd7fbfb59b032d8496150e63e660e8c2bfc11496a48 |
| id | 5a4c7651-7d3e-4bab-b8ee-fdda9be15703 |
| image | Xenial (9355f643-f72b-4e6f-83cb-124b706ec87e) |
| key_name | - |
| metadata | {} |
| name | StressCPU-7 |
| os-extended-volumes:volumes_attached | [] |
| progress | 0 |
| security_groups | default |
| status | ACTIVE |
| tenant_id | d41dac66d080416ebbd597e0793e5aca |
| updated | 2016-04-27T17:03:41Z |
| user_id | 2a9a8a83ab63483299b69a8f81dbdf35 |
+--------------------------------------+----------------------------------------------------------+
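
The `nova show` table above carries the instance address in the row whose name ends in " network". As a minimal sketch, assuming the table text has been captured into a string, the addresses can be pulled out as follows. The function names `parse_nova_show` and `instance_addresses` are hypothetical helpers, not part of novaclient.

```python
def parse_nova_show(text):
    """Parse the two-column `nova show` table into a {property: value} dict."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip the +-----+ border lines and anything that is not a table row.
        if not line.startswith("|"):
            continue
        cells = [cell.strip() for cell in line.strip("|").split("|")]
        # Skip malformed rows and the "Property | Value" header row.
        if len(cells) != 2 or cells[0] == "Property":
            continue
        props[cells[0]] = cells[1]
    return props


def instance_addresses(props):
    """Return {network name: [addresses]} from rows named '<net> network'."""
    suffix = " network"
    return {
        key[: -len(suffix)]: [a.strip() for a in value.split(",") if a.strip()]
        for key, value in props.items()
        if key.endswith(suffix)
    }
```

For the instance shown above, this would report one address on `admin_internal_net`, which is consistent with Nova knowing the IP even though Ceilometer logs that it cannot obtain it.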

Steps to reproduce:
1. Boot ~20-30 VMs from an image (ephemeral volumes are used)
2. Keep them up and running for several hours
3. Check logs
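
Step 3 can be sketched in Python as a count of the ERROR entries per instance. This is only an illustration: the pattern is derived from the two log lines quoted in this report, and the function name `count_ip_errors` is hypothetical.

```python
import re
from collections import Counter

# Pattern based on the log lines quoted in this report; other Ceilometer
# versions may word the message differently.
ERROR_RE = re.compile(
    r"ERROR ceilometer\.hardware\.discovery .*"
    r"Couldn't obtain IP address of instance (?P<instance>[0-9a-f-]{36})"
)


def count_ip_errors(lines):
    """Return a Counter mapping instance UUID to the number of matching ERROR lines."""
    counts = Counter()
    for line in lines:
        match = ERROR_RE.search(line)
        if match:
            counts[match.group("instance")] += 1
    return counts
```

The function accepts any iterable of lines, so an open log file can be passed directly; `count_ip_errors(f).most_common()` then lists the most frequently affected instances first.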

Expected results:
1. No ERRORs from ceilometer-polling

Actual result:
1. The log entries mentioned above appear

Reproducibility:
100%

Workaround:
N/A

Impact:
Unknown. I do not know what this leads to.

Description of the environment:
* 10 baremetal nodes:
  - CPU: 12 x 2.10 GHz
  - Disks: 2 drives (SSD - 80 GB, HDD - 931.5 GB), 1006.0 GB total
  - Memory: 2 x 16.0 GB, 32.0 GB total
  - NUMA topology: 1 NUMA node
* Node roles:
  - 1 ElasticSearch / Kibana node
  - 1 InfluxDB / Grafana node
  - 3 controllers (1 is offline because of disk problems)
  - 5 computes
* Details:
  - OS: Mitaka on Ubuntu 14.04
  - Compute: KVM
  - Neutron with VLAN segmentation
  - Ceph RBD for volumes (Cinder)
  - Ceph RadosGW for objects (Swift API)
  - Ceph RBD for ephemeral volumes (Nova)
  - Ceph RBD for images (Glance)
* MOS 8.0, build 227

Additional information:
Diagnostic snapshot: http://mos-scale-share.mirantis.com/env14/fuel-snapshot-2016-04-28_07-46-04.tar.xz

description: updated
Roman Podoliaka (rpodolyaka) wrote :

MOS Ceilometer, please take a closer look and clarify the impact / importance.

tags: added: area-ceilometer
Changed in mos:
status: New → Confirmed
importance: Undecided → High
Igor Degtiarov (idegtiarov) wrote :

We have done some work on this issue. It does not look like a Ceilometer bug: all errors came from one compute node, and our attempts to reproduce it were unsuccessful.

It seems these errors could be related to the lab hardware. We are going to look at this issue in more detail when we have testing time on the scale lab.

For now I prefer to set the bug status to "not a bug". Please reopen it if this issue affects you again.
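
The observation that all errors came from one compute node could be checked by tallying error counts per host. The sketch below is illustrative only: it assumes the per-instance error counts and an instance-to-host mapping (for example, collected from the `OS-EXT-SRV-ATTR:host` field of `nova show`) are already available as dictionaries, and the function name `errors_by_host` is hypothetical.

```python
from collections import defaultdict


def errors_by_host(error_counts, instance_hosts):
    """Sum per-instance error counts into per-host totals.

    error_counts:   {instance UUID: number of ERROR lines}
    instance_hosts: {instance UUID: compute host name}
    Instances missing from instance_hosts are grouped under "unknown".
    """
    totals = defaultdict(int)
    for instance, count in error_counts.items():
        host = instance_hosts.get(instance, "unknown")
        totals[host] += count
    return dict(totals)
```

If a single host accounts for every error, that supports looking at that node's hardware or network setup rather than at Ceilometer itself.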

Changed in mos:
status: Confirmed → Invalid