Containers: nova.scheduler.manager from the Placement API. This could be due to insufficient resources or a temporary occurrence as compute nodes start up.
Affects | Status | Importance | Assigned to | Milestone |
---|---|---|---|---|
StarlingX | Fix Released | High | zhipeng liu | |
Bug Description
Brief Description
-----------------
On a 2+2 system, using the described ISO image, very often you will hit the following error message while trying to launch a VM:
nova.scheduler.
Severity
--------
<Major: System/Feature is usable but degraded>. StarlingX is up and "system host-list" does not show any alarm or issue, but you are not able to launch a VM.
Steps to Reproduce
------------------
1.- Follow the steps to set up a 2+2 environment listed here: https:/
Except for building the image, you should use the ISO described later here.
2.- Use the network cards / HW configuration listed in this document:
https:/
Listed as 2+2 Config 4
3.- On the active controller, after finishing the setup, create a flavor and an image, then try to launch a VM:
export OS_CLOUD=
openstack image create --container-format bare --disk-format qcow2 --file cirros-
cirros
openstack flavor create --ram 512 --disk 1 --vcpus 1 my_tiny
openstack flavor list
openstack flavor set $UUID_my_tiny --property hw:mem_
use the mem_page_size parameter in your flavor
openstack server create --image cirros --flavor my_tiny --network public-net0 vm5
Expected Behavior
------------------
The VM creation should reach 100% completion with no errors, leaving the VM up and running.
Actual Behavior
----------------
After executing "openstack server create", the system starts creating the VM. However, "openstack server list" shows that the VM was created with an ERROR status.
The "Networks" field is also empty, even though the network to use was specified in the server create command.
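To spot the failed instances quickly, the status column can be filtered. This is only a minimal sketch: it assumes the client is asked for plain output with the standard formatting options `-f value -c Name -c Status`, and that server names contain no spaces. The function name `errored_servers` is hypothetical.

```shell
# Print the names of servers whose status is ERROR.
# Reads "NAME STATUS" pairs, one per line, from standard input.
errored_servers() {
    awk '$2 == "ERROR" { print $1 }'
}

# Usage on the active controller (requires valid OpenStack credentials):
#   openstack server list -f value -c Name -c Status | errored_servers
```

For a server in ERROR status, "openstack server show vm5" typically includes a fault field with the reason reported by the scheduler.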
Reproducibility
---------------
<Reproducible/
The issue occurs 3 to 4 times in 10 fresh ISO installs. At this point it is very random: it can occur 3 times in a row, or not appear until the 10th installation.
System Configuration
-------
2 + 2 with the Hardware configuration described in the following document:
https:/
Listed as 2+2 Config 4
Compute-0 : Intel® Server System R2208WFTZS
MGMT - Baseboard X722 10G Ethernet Card (eno1)
DATA - Intel X520 DA SFP+ 10G Ethernet Card
Compute-1: Intel® Wolf Pass 1U 8x2.5in HDD Skylake SP - R1208WFTYS-IDD
MGMT - Baseboard X722 10G Ethernet Card (eno2)
DATA - Mellanox Cx4
Controller-0: Intel® Wolf Pass 1U 8x2.5in HDD Skylake SP - R1208WFTYS-IDD
OAM - Baseboard X722 10G Ethernet Card (eno1)
MGMT - Baseboard X722 10G Ethernet Card (eno2)
Controller-1: Intel® Server System R2208WFTZS
OAM - Baseboard X722 10G Ethernet Card (eno1)
MGMT - Baseboard X722 10G Ethernet Card (eno2)
Branch/Pull Time/Commit
-------
SW_VERSION="19.01"
BUILD_TARGET=
BUILD_TYPE=
BUILD_ID="n/a"
JOB="n/a"
BUILD_BY="slin14"
BUILD_NUMBER="n/a"
BUILD_HOST=""
BUILD_DATE=
BUILD_DIR="/"
WRS_SRC_
WRS_GIT_
CGCS_SRC_
CGCS_GIT_
Last Pass
---------
The same ISO behaved well for PVM creation on the same hardware, but with a different network configuration.
Please refer to Configurations 1 and 2 in the document listed here; they were working with no issues.
https:/
Timestamp/Logs
--------------
1.-
[wrsroot@
+----+-
| id | hostname | personality | administrative | operational | availability |
+----+-
| 1 | controller-0 | controller | unlocked | enabled | available |
| 2 | controller-1 | controller | unlocked | enabled | available |
| 3 | compute-0 | worker | unlocked | enabled | available |
| 4 | compute-1 | worker | unlocked | enabled | available |
+----+-
2.-
controller-0:~# export OS_CLOUD=
controller-0:~# openstack server create --image cirros --flavor my_tiny --network public-net0 vm5
+------
| Field | Value |
+------
| OS-DCF:diskConfig | MANUAL |
| OS-EXT-
| OS-EXT-
| OS-EXT-
| OS-EXT-
| OS-EXT-
| OS-EXT-
| OS-EXT-STS:vm_state | building |
| OS-SRV-
| OS-SRV-
| accessIPv4 | |
| accessIPv6 | |
| addresses | |
| adminPass | v45fxLKG9iAW |
| config_drive | |
| created | 2019-05-
| flavor | my_tiny (85b97244-
| hostId | |
| id | 263c81b4-
| image | cirros (662fbd36-
| key_name | None |
| name | vm5 |
| progress | 0 |
| project_id | 80fdf56ba59f4ae
| properties | |
| security_groups | name='default' |
| status | BUILD |
| updated | 2019-05-
| user_id | 6c1c3b373236464
| volumes_attached | |
+------
3.-
controller-0:~# openstack server list
+------
| ID | Name | Status | Networks | Image | Flavor |
+------
| 263c81b4-
| 043292da-
| 2611a52c-
+------
4.-
controller-0:~# kubectl get pod -n openstack | grep nova
nova-api-
nova-api-
nova-api-
nova-api-
nova-api-
nova-bootstrap-
nova-cell-
nova-compute-
nova-compute-
nova-conductor-
nova-conductor-
nova-consoleaut
nova-consoleaut
nova-db-init-9hqtd 0/3 Completed 0 5h12m
nova-db-sync-wt25b 0/1 Completed 0 5h12m
nova-ks-
nova-ks-
nova-ks-user-mh5mw 0/1 Completed 2 5h12m
nova-novncproxy
nova-novncproxy
nova-placement-
nova-placement-
nova-rabbit-
nova-scheduler-
nova-scheduler-
nova-storage-
5.-
controller-0:~# kubectl logs nova-scheduler-
+ exec nova-scheduler --config-file /etc/nova/nova.conf
Deprecated: Option "idle_timeout" from group "database" is deprecated. Use option "connection_
Deprecated: Option "idle_timeout" from group "api_database" is deprecated. Use option "connection_
2019-05-03 16:03:55,279.279 1 WARNING nova.scheduler.
2019-05-03 16:03:55,298.298 17 INFO nova.service [-] Starting scheduler node (version 19.1.0)
2019-05-03 16:03:55,300.300 18 INFO nova.service [-] Starting scheduler node (version 19.1.0)
2019-05-03 16:03:55,301.301 19 INFO nova.service [-] Starting scheduler node (version 19.1.0)
<snip>.....
2019-05-03 16:06:09,809.809 71 INFO nova.scheduler.
2019-05-03 16:06:09,809.809 89 INFO nova.scheduler.
2019-05-03 16:06:09,809.809 71 INFO nova.scheduler.
2019-05-03 17:02:59,874.874 26 INFO nova.scheduler.
<log end>...
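Most of the scheduler output above is INFO noise; the entry of interest is the WARNING. A small sketch for keeping only warnings and errors follows. It assumes the line layout shown in the log above (date, time, PID, level, logger). `scheduler_problems` is a hypothetical helper name, and `<nova-scheduler-pod>` is a placeholder because the full pod name is elided in this report.

```shell
# Keep only WARNING and ERROR lines from nova log output on standard input.
# Assumes the fourth whitespace-separated field is the log level.
scheduler_problems() {
    awk '$4 == "WARNING" || $4 == "ERROR"'
}

# Usage (substitute the full scheduler pod name from "kubectl get pod"):
#   kubectl logs -n openstack <nova-scheduler-pod> | scheduler_problems
```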
Test Activity
-------------
Feature Testing OVS - DPDK
Changed in starlingx: | |
assignee: | Bruce Jones (brucej) → Cindy Xie (xxie1) |
Changed in starlingx: | |
importance: | Undecided → High |
tags: | added: stx.2.0 |
tags: | added: stx.retestneeded |
Changed in starlingx: | |
status: | Incomplete → Fix Released |
status: | Fix Released → Fix Committed |
status: | Fix Committed → In Progress |
controller-0:~# kubectl describe node compute-0
Name: compute-0
Roles: <none>
Labels: beta.kubernetes.io/arch=amd64
beta.kubernetes.io/os=linux
kubernetes.io/hostname=compute-0
openstack-compute-node=enabled
openvswitch=enabled
sriov=enabled
Annotations: kubeadm.alpha.kubernetes.io/cri-socket: /var/run/dockershim.sock
node.alpha.kubernetes.io/ttl: 0
projectcalico.org/IPv4Address: 192.168.206.50/24
volumes.kubernetes.io/controller-managed-attach-detach: true
CreationTimestamp: Fri, 03 May 2019 15:31:54 +0000
Taints: <none>
Unschedulable: false
Conditions:
Type Status LastHeartbeatTime LastTransitionTime Reason Message
---- ------ ----------------- ------------------ ------ -------
OutOfDisk False Fri, 03 May 2019 21:42:50 +0000 Fri, 03 May 2019 15:31:54 +0000 KubeletHasSufficientDisk kubelet has sufficient disk space available
MemoryPressure False Fri, 03 May 2019 21:42:50 +0000 Fri, 03 May 2019 15:31:54 +0000 KubeletHasSufficientMemory kubelet has sufficient memory available
DiskPressure False Fri, 03 May 2019 21:42:50 +0000 Fri, 03 May 2019 15:31:54 +0000 KubeletHasNoDiskPressure kubelet has no disk pressure
PIDPressure False Fri, 03 May 2019 21:42:50 +0000 Fri, 03 May 2019 15:31:54 +0000 KubeletHasSufficientPID kubelet has sufficient PID available
Ready True Fri, 03 May 2019 21:42:50 +0000 Fri, 03 May 2019 15:32:06 +0000 KubeletReady kubelet is posting ready status
Addresses:
InternalIP: 10.10.57.59
Hostname: compute-0
Capacity:
cpu: 112
ephemeral-storage: 20027216Ki
memory: 196618812Ki
pods: 110
Allocatable:
cpu: 112
ephemeral-storage: 18457082236
memory: 196516412Ki
pods: 110
System Info:
Machine ID: 318e5af8729f4377ae2289ba08b7b594
System UUID: 00D7627F-0786-E811-906E-00163566263E
Boot ID: c57e6e3c-b277-4bd1-b256-9d537a51b674
Kernel Version: 3.10.0-957.1.3.el7.1.tis.x86_64
OS Image: CentOS Linux 7 (Core)
Operating System: linux
Architecture: amd64
Container Runtime Version: docker://18.3.1
Kubelet Version: v1.12.3
Kube-Proxy Version: v1.12.3
PodCIDR: 172.16.3.0/24
Non-terminated Pods: (10 in total)
Namespace Name CPU Requests CPU Limits Memory Requests Memory Limits
--------- ---- ------------ ---------- --------------- -------------
kube-system calico-node-rkdfn 250m (0%) 0 (0%) 0 (0%) 0 (0%)
kube-system kube-...