nova-compute in down state (servicegroup_driver=mc)

Bug #1969199 reported by Aleksey
Affects: OpenStack Compute (nova)
Status: Expired
Importance: Undecided
Assigned to: Unassigned

Bug Description

Description
===========

The following nova services:
  nova-api-os-compute
  nova-conductor
  nova-consoleauth
  nova-metadata-api
  nova-novncproxy
  nova-scheduler

are running on the controller (API) nodes under Python 3.

nova-compute is running under Python 2 on the hypervisor hosts (qemu-kvm based).

IF servicegroup_driver=mc is set in nova.conf:
nova service-list shows the nova-compute services in the down state.

BUT
IF servicegroup_driver=db is set, it works perfectly fine:
nova service-list shows the nova-compute services in the up state.

ALSO
IF servicegroup_driver=mc is set and all nova services run under Python 2:
nova service-list shows the nova-compute services in the up state.
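For context, here is a minimal sketch of what the mc driver does, paraphrased from nova/servicegroup/drivers/mc.py (the file referenced in the logs below); this is not the verbatim upstream code, and the constructor and signatures are simplified. Each service periodically refreshes a TTL'd topic:host key in memcached, and is_up only checks whether that key is still readable:

# A minimal sketch of the memcached servicegroup driver's logic,
# paraphrased from nova/servicegroup/drivers/mc.py -- not the verbatim
# upstream code; constructor and signatures here are simplified.
from oslo_utils import timeutils


class MemcachedDriver(object):
    def __init__(self, mc_client, service_down_time=60):
        self.mc = mc_client                          # a memcached client
        self.service_down_time = service_down_time   # key TTL, in seconds

    def join(self, service_ref):
        # The "Memcached_Driver: join new ServiceGroup" DEBUG line in the
        # conductor log below comes from this step; the real driver then
        # keeps calling _report_state on a timer.
        self._report_state(service_ref)

    def _report_state(self, service_ref):
        # service_ref is a mapping with 'topic' and 'host', so the key is
        # e.g. "compute:cmp1". The value is just a timestamp with a TTL;
        # if the service stops reporting, the key simply expires.
        key = "%(topic)s:%(host)s" % service_ref
        self.mc.set(str(key), timeutils.utcnow(), time=self.service_down_time)

    def is_up(self, service_ref):
        # nova-api's side of `nova service-list`: "up" simply means the
        # key is still present and readable in memcached.
        key = "%(topic)s:%(host)s" % service_ref
        return self.mc.get(str(key)) is not None

Nothing in this path touches RabbitMQ, so no AMQP errors are expected either way. What it does require is that a Python 2 writer and a Python 3 reader agree on the memcached key and value serialization, which is the kind of place where a mixed py2/py3 deployment could plausibly diverge.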

Steps to reproduce
==================
1. Run the following services under Python 3:
     nova-api-os-compute
     nova-conductor
     nova-consoleauth
     nova-metadata-api
     nova-novncproxy
     nova-scheduler
   Run nova-compute under Python 2.

2. Set servicegroup_driver=mc in nova.conf (a minimal fragment is shown after these steps)
3. Execute nova service-list
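For step 2, a minimal nova.conf fragment looks like this (servicegroup_driver lives in [DEFAULT]; how the memcached endpoints themselves are configured varies by release, so that part is omitted here):

[DEFAULT]
servicegroup_driver = mc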

Expected result
===============
All nova services are reported in the up state

Actual result
=============

nova-compute in down state

Environment
===========

nova-manage --version
19.3.2

Logs & Configs
==============
nova-compute -> rabbitmq: there are no connection errors, and RabbitMQ is reachable from the hosts.

___

nova-conductor logs:
Apr 13 08:59:08 api2 nova-conductor[1738632]: 2022-04-13 08:59:07,629.629 1 DEBUG nova.servicegroup.drivers.mc [req-26f2e991-af2b-4cde-9cd5-cf8525cbb0ef None None] Memcached_Driver: join new ServiceGroup service = <Service: host=api2, binary=nova-conductor, manager_class_name=nova.conductor.manager.ConductorManager> join /var/lib/openstack/lib/python3.6/site-packages/nova/servicegroup/drivers/mc.py:51

___

nova-api:
e27454b47c0b360072cbbf07008] Seems service compute:cmp1 is down is_up /var/lib/openstack/lib/python3.6/site-packages/nova/servicegroup/drivers/mc.py:68
___

So I'm a little confused, because there are no RabbitMQ connection problems (which would be one possible reason for this behaviour).

I guess the problem lies somewhere in the path:
nova-api - memcached - nova-compute
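A direct way to check the middle hop is to read the liveness key out of memcached from a controller node. This is only a sketch: the key name compute:cmp1 is taken from the nova-api log above, and the endpoint 127.0.0.1:11211 is a placeholder.

# Sketch: read nova-compute's liveness key straight out of memcached,
# using the python-memcached client (the one nova has historically used).
import memcache

mc = memcache.Client(['127.0.0.1:11211'])  # placeholder endpoint

# Key format is "topic:host"; "compute:cmp1" is taken from the nova-api log.
value = mc.get('compute:cmp1')
if value is None:
    # Missing (expired / never written) or unreadable under this Python.
    print('compute:cmp1 not readable -> nova-api reports the service down')
else:
    print('compute:cmp1 last refreshed at %s -> service is up' % (value,))

Running this once under Python 2 and once under Python 3 against the same memcached would also show whether the two interpreters disagree on reading back the same key.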

I'd appreciate any help with debugging.

Revision history for this message
Sylvain Bauza (sylvain-bauza) wrote :

AFAIK we don't really test the memcache driver of the ServiceGroup API. Are you able to see nova-compute correctly reporting its status to memcache?
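One way to answer that directly (a sketch; the endpoint and polling interval are assumptions, not from the bug) is to watch whether the key keeps being refreshed while nova-compute runs:

# Sketch: confirm nova-compute is actively refreshing its liveness key.
import time
import memcache

mc = memcache.Client(['127.0.0.1:11211'])  # placeholder endpoint

last = None
for _ in range(6):  # watch for about a minute
    value = mc.get('compute:cmp1')
    print('compute:cmp1 = %r' % (value,))
    if value is not None and value != last:
        print('  refreshed since last poll -> compute is reporting')
    last = value
    time.sleep(10)  # roughly nova's default report_interval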

Changed in nova:
status: New → Incomplete
Revision history for this message
Aleksey (curveds) wrote (last edit):

I'm sorry for the delay.

Here is the nova-compute log filtered by req-*.
As you can see, nova-compute reported "join ServiceGroup" via the memcached driver:
___
/var/log/nova/nova-compute.log:2022-04-19 05:22:47,422.422 582584 INFO nova.compute.manager [req-1ad2c9b0-8280-4550-b6f3-ece1595d5ecf None None] Looking for unclaimed instances stuck in BUILDING status for nodes managed by this host
/var/log/nova/nova-compute.log:2022-04-19 05:22:48,128.128 582584 DEBUG nova.compute.resource_tracker [req-1ad2c9b0-8280-4550-b6f3-ece1595d5ecf None None] Auditing locally available compute resources for cmp1302 (node: cmp1302.madpool1b.madpool.os.selectel.org) update_available_resource /usr/lib/python2.7/dist-packages/nova/compute/resource_tracker.py:731
/var/log/nova/nova-compute.log:2022-04-19 05:22:48,162.162 582584 DEBUG nova.compute.resource_tracker [req-1ad2c9b0-8280-4550-b6f3-ece1595d5ecf None None] Hypervisor/Node resource view: name=cmp1302.madpool1b.madpool.os.selectel.org free_ram=3217MB free_disk=44GB free_vcpus=2 pci_devices=[{"dev_id": "pci_0000_00_02_0", "product_id": "00b8", "dev_type": "type-PCI", "numa_node": null, "vendor_id": "1013", "label": "label_1013_00b8", "address": "0000:00:02.0"}, {"dev_id": "pci_0000_00_01_0", "product_id": "7000", "dev_type": "type-PCI", "numa_node": null, "vendor_id": "8086", "label": "label_8086_7000", "address": "0000:00:01.0"}, {"dev_id": "pci_0000_00_01_3", "product_id": "7113", "dev_type": "type-PCI", "numa_node": null, "vendor_id": "8086", "label": "label_8086_7113", "address": "0000:00:01.3"}, {"dev_id": "pci_0000_00_06_0", "product_id": "1002", "dev_type": "type-PCI", "numa_node": null, "vendor_id": "1af4", "label": "label_1af4_1002", "address": "0000:00:06.0"}, {"dev_id": "pci_0000_00_01_2", "product_id": "7020", "dev_type": "type-PCI", "numa_node": null, "vendor_id": "8086", "label": "label_8086_7020", "address": "0000:00:01.2"}, {"dev_id": "pci_0000_00_01_1", "product_id": "7010", "dev_type": "type-PCI", "numa_node": null, "vendor_id": "8086", "label": "label_8086_7010", "address": "0000:00:01.1"}, {"dev_id": "pci_0000_00_07_0", "product_id": "1005", "dev_type": "type-PCI", "numa_node": null, "vendor_id": "1af4", "label": "label_1af4_1005", "address": "0000:00:07.0"}, {"dev_id": "pci_0000_00_00_0", "product_id": "1237", "dev_type": "type-PCI", "numa_node": null, "vendor_id": "8086", "label": "label_8086_1237", "address": "0000:00:00.0"}, {"dev_id": "pci_0000_00_04_0", "product_id": "1004", "dev_type": "type-PCI", "numa_node": null, "vendor_id": "1af4", "label": "label_1af4_1004", "address": "0000:00:04.0"}, {"dev_id": "pci_0000_00_03_0", "product_id": "1000", "dev_type": "type-PCI", "numa_node": null, "vendor_id": "1af4", "label": "label_1af4_1000", "address": "0000:00:03.0"}, {"dev_id": "pci_0000_00_05_0", "product_id": "1003", "dev_type": "type-PCI", "numa_node": null, "vendor_id": "1af4", "label": "label_1af4_1003", "address": "0000:00:05.0"}] _report_hypervisor_resource_view /usr/lib/python2.7/dist-packages/nova/compute/resource_tracker.py:875
/var/log/nova/nova-compute.log:2022-04-19 05:22:48,273.273 582584 DEBUG nova.compute.resource_tracker [req-1ad2c9b0-8280-4550-b6f3-ece1595d5ecf None None] Total usable vcpus: 2, tota...

Revision history for this message
Launchpad Janitor (janitor) wrote :

[Expired for OpenStack Compute (nova) because there has been no activity for 60 days.]

Changed in nova:
status: Incomplete → Expired