bm node instance provisioning delay with 45 nodes

Bug #1226170 reported by Sandeep Raman
This bug affects 1 person
Affects: OpenStack Compute (nova)
Status: Won't Fix
Importance: Medium
Assigned to: Unassigned

Bug Description

Nova - 1:2013.1

While doing a scale test, within a batch of 42 cartridges, a few of them get provisioned in ~7 minutes while a few take 35 minutes.
In total there are 10 instances whose provisioning time spans between 30 and 36 minutes.

nova-compute log snippet from the node that is taking 36 minutes [the highest time in the batch]:

In the first minute, the claim is successful.
The next half hour or so is wait time, with the repeated message "During sync_power_state the instance has a pending task. Skip"; the PXE deploy itself then runs and completes in well under a minute (12:43:58 to 12:44:41 in the log below).

Per the blueprint https://blueprints.launchpad.net/nova/+spec/improve-baremetal-pxe-deploy:

the current approach is that nova-baremetal-deploy-helper mounts the baremetal node's disks via iSCSI, fdisks the partitions, and dd's the updated glance image over iSCSI;
the desired approach is that the ramdisk fdisks the local disks, pulls the specified image from glance, writes it to the local disk, and reboots into it.
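
For illustration, a minimal sketch contrasting the two flows; this is not the actual deploy-helper code, and the paths, device names, and URL below are placeholders:

    import subprocess

    def deploy_current(cached_image, iscsi_dev):
        # Current: the compute host caches the image locally, then dd's it to
        # the node's disk over the iSCSI session -- the whole image crosses
        # the network a second time, funneled through a single deploy host.
        subprocess.check_call(["dd", "if=" + cached_image, "of=" + iscsi_dev,
                               "bs=1M", "oflag=direct"])

    def deploy_desired(image_url, local_dev):
        # Desired: the deploy ramdisk on the node itself streams the image
        # straight from glance onto its local disk, then reboots into it.
        subprocess.check_call("curl -s %s | dd of=%s bs=1M oflag=direct"
                              % (image_url, local_dev), shell=True)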

Is the current approach causing this performance bottleneck? Is there any parameter that can be tuned to improve performance?

Line 388: 2013-09-06 12:09:12.843 AUDIT nova.compute.manager [req-894f8127-4b15-4046-9919-fbd0123f3555 93aabe9ff2064688bdd070f16e6de768 40c74bd952f8470991117d54f0c03f0f] [instance: 15047e3b-b4c8-4b45-906e-5c2e4650a9e2] Starting instance...
Line 801: 2013-09-06 12:10:15.230 AUDIT nova.compute.claims [req-894f8127-4b15-4046-9919-fbd0123f3555 93aabe9ff2064688bdd070f16e6de768 40c74bd952f8470991117d54f0c03f0f] [instance: 15047e3b-b4c8-4b45-906e-5c2e4650a9e2] Attempting claim: memory 2048 MB, disk 30 GB, VCPUs 1
Line 802: 2013-09-06 12:10:15.232 AUDIT nova.compute.claims [req-894f8127-4b15-4046-9919-fbd0123f3555 93aabe9ff2064688bdd070f16e6de768 40c74bd952f8470991117d54f0c03f0f] [instance: 15047e3b-b4c8-4b45-906e-5c2e4650a9e2] Total Memory: 2048 MB, used: 512 MB
Line 803: 2013-09-06 12:10:15.233 AUDIT nova.compute.claims [req-894f8127-4b15-4046-9919-fbd0123f3555 93aabe9ff2064688bdd070f16e6de768 40c74bd952f8470991117d54f0c03f0f] [instance: 15047e3b-b4c8-4b45-906e-5c2e4650a9e2] Memory limit: 3072 MB, free: 2560 MB
Line 804: 2013-09-06 12:10:15.235 AUDIT nova.compute.claims [req-894f8127-4b15-4046-9919-fbd0123f3555 93aabe9ff2064688bdd070f16e6de768 40c74bd952f8470991117d54f0c03f0f] [instance: 15047e3b-b4c8-4b45-906e-5c2e4650a9e2] Total Disk: 80 GB, used: 0 GB
Line 805: 2013-09-06 12:10:15.236 AUDIT nova.compute.claims [req-894f8127-4b15-4046-9919-fbd0123f3555 93aabe9ff2064688bdd070f16e6de768 40c74bd952f8470991117d54f0c03f0f] [instance: 15047e3b-b4c8-4b45-906e-5c2e4650a9e2] Disk limit not specified, defaulting to unlimited
Line 806: 2013-09-06 12:10:15.238 AUDIT nova.compute.claims [req-894f8127-4b15-4046-9919-fbd0123f3555 93aabe9ff2064688bdd070f16e6de768 40c74bd952f8470991117d54f0c03f0f] [instance: 15047e3b-b4c8-4b45-906e-5c2e4650a9e2] Total CPU: 1 VCPUs, used: 0 VCPUs
Line 807: 2013-09-06 12:10:15.240 AUDIT nova.compute.claims [req-894f8127-4b15-4046-9919-fbd0123f3555 93aabe9ff2064688bdd070f16e6de768 40c74bd952f8470991117d54f0c03f0f] [instance: 15047e3b-b4c8-4b45-906e-5c2e4650a9e2] CPU limit not specified, defaulting to unlimited
Line 808: 2013-09-06 12:10:15.241 AUDIT nova.compute.claims [req-894f8127-4b15-4046-9919-fbd0123f3555 93aabe9ff2064688bdd070f16e6de768 40c74bd952f8470991117d54f0c03f0f] [instance: 15047e3b-b4c8-4b45-906e-5c2e4650a9e2] Claim successful
Line 1047: 2013-09-06 12:12:14.514 28732 INFO nova.compute.manager [-] [instance: 15047e3b-b4c8-4b45-906e-5c2e4650a9e2] During sync_power_state the instance has a pending task. Skip.
Line 1354: 2013-09-06 12:22:23.266 28732 INFO nova.compute.manager [-] [instance: 15047e3b-b4c8-4b45-906e-5c2e4650a9e2] During sync_power_state the instance has a pending task. Skip.
Line 1405: 2013-09-06 12:32:30.858 28732 INFO nova.compute.manager [-] [instance: 15047e3b-b4c8-4b45-906e-5c2e4650a9e2] During sync_power_state the instance has a pending task. Skip.
Line 1486: 2013-09-06 12:42:38.456 28732 INFO nova.compute.manager [-] [instance: 15047e3b-b4c8-4b45-906e-5c2e4650a9e2] During sync_power_state the instance has a pending task. Skip.
Line 1490: 2013-09-06 12:43:58.078 28732 INFO nova.virt.baremetal.pxe [-] PXE deploy started for instance 15047e3b-b4c8-4b45-906e-5c2e4650a9e2
Line 1493: 2013-09-06 12:44:41.641 28732 INFO nova.virt.baremetal.pxe [-] PXE deploy completed for instance 15047e3b-b4c8-4b45-906e-5c2e4650a9e2

Tags: baremetal
Revision history for this message
Ghe Rivero (ghe.rivero) wrote :

From an IRC conversation: the image sizes are about 3 GB, on a 1 Gb network with only one PXE server. In the best-case scenario, the image deployment time is ~18 minutes (just the network distribution of the images) plus the overhead of the file injection per image. I guess the file injection is done sequentially, so that's why the instances halt in a sync_power state.
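
A back-of-the-envelope check of that ~18 minute figure, assuming 42 images of ~3 GB each pushed sequentially over a single 1 Gb/s link:

    image_bytes = 3e9
    num_images = 42
    link_bytes_per_sec = 125e6  # 1 Gb/s ~= 125 MB/s, ignoring protocol overhead

    seconds = image_bytes * num_images / link_bytes_per_sec
    print("~%.0f minutes" % (seconds / 60))  # ~17 minutes, in line with the estimate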

So no bug, but for sure something to improve. (Remove of file injection on the way in ironic, improve image distribution via bittorrent, multicast...)

Revision history for this message
Sandeep Raman (sandeep-raman) wrote :

Looks like it is the dd command that runs sequentially. A sample output is below: I saw these dd's happen one after the other for the bm nodes. Is there a way to make this a parallel thread activity as a workaround? (A rough sketch of what that might look like follows the ps output.)

ps aux |grep dd
root 2 0.0 0.0 0 0 ? S Sep17 0:00 [kthreadd]
root 14341 0.0 0.0 0 0 ? S< Sep17 0:00 [ib_addr]
root 24342 0.0 0.0 36092 1576 ? S 16:21 0:00 sudo nova-rootwrap /etc/nova/rootwrap.conf dd if=/var/lib/nova/instances/instance-0000006f/disk of=/dev/disk/by-path/ip-192.168.3.197:3260-iscsi-iqn-993b455c-4ef8-4469-b655-6f9b59427d2a-lun-1-part1 bs=1M oflag=direct
root 24343 0.3 0.0 35408 8188 ? S 16:21 0:00 /usr/bin/python /usr/bin/nova-rootwrap /etc/nova/rootwrap.conf dd if=/var/lib/nova/instances/instance-0000006f/disk of=/dev/disk/by-path/ip-192.168.3.197:3260-iscsi-iqn-993b455c-4ef8-4469-b655-6f9b59427d2a-lun-1-part1 bs=1M oflag=direct
root 24344 7.1 0.0 11188 1720 ? D 16:21 0:01 /bin/dd if=/var/lib/nova/instances/instance-0000006f/disk of=/dev/disk/by-path/ip-192.168.3.197:3260-iscsi-iqn-993b455c-4ef8-4469-b655-6f9b59427d2a-lun-1-part1 bs=1M oflag=direct
root 24384 0.0 0.0 8108 924 pts/3 S+ 16:22 0:00 grep --color=auto dd
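
For illustration, a rough sketch of the parallel workaround being asked about; this is not nova code, and the list of (cached image, target device) pairs is an assumed input:

    import subprocess
    from concurrent.futures import ThreadPoolExecutor

    def dd_copy(src, dst):
        # Same command shape as in the ps output above.
        subprocess.check_call(["dd", "if=" + src, "of=" + dst,
                               "bs=1M", "oflag=direct"])

    def deploy_all(pairs, max_workers=4):
        # dd is I/O bound, so threads suffice to overlap the copies; cap the
        # worker count so the deploy host's NIC isn't oversubscribed.
        with ThreadPoolExecutor(max_workers=max_workers) as pool:
            futures = [pool.submit(dd_copy, s, d) for s, d in pairs]
            for f in futures:
                f.result()  # re-raise any dd failure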

Changed in nova:
status: New → Triaged
importance: Undecided → Medium
Revision history for this message
Robert Collins (lifeless) wrote :

So this is kind of broad; it falls under 'make nova bm faster'. It'll probably get closed when we have better performance in general.

However, a first recommendation is to turn off file injection, which will get things moving substantially faster. The dd's should already run in parallel.

Sean Dague (sdague)
Changed in nova:
status: Triaged → Won't Fix