Do not use fixed timeouts for image building

Bug #1453039 reported by Marek Zawadzki
This bug affects 1 person
Affects             Status     Importance  Assigned to               Milestone
Fuel for OpenStack  Triaged    Low         Fuel Sustaining
Mitaka              Won't Fix  Low         Fuel Python (Deprecated)
Newton              Triaged    Low         Fuel Sustaining

Bug Description

Deployment finishes with error due to timeout when creating Ubuntu images with debootstrap for nodes.
Timeout was too short in my case (3600s = 60min) - perhaps it could be configurable?

Deployment failed with the following error: http://172.16.48.59/show/369/

1) Create env (iso fuel-6.1-361):
- Ubuntu HA Neutron with VLAN segmentation
- 1x Controller, Storage - Ceph OSD
- 2x Compute, Storage - Ceph OSD
(probably any Ubuntu configuration will be affected)

2) Start deployment.

Expected result:
- deployed cluster

Actual result:
- deployment error

Suspected reason:
- debootstrap is not able to finish creating an image on the Fuel master before the timeout (3600s). In my case this is due to the slow performance of my test system, but it may also happen on more powerful systems, e.g. due to network overload.

Manual FIX:
- after the deployment error, wait for debootstrap to finish creating the image, then redeploy (the image will be found and used by Fuel without needing to recreate it)
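
The configurable timeout asked for in the description could look roughly like the sketch below. It is only an illustration, not Fuel code: the variable name IMAGE_BUILD_TIMEOUT and the helpers resolve_build_timeout and run_build are hypothetical, and the only value taken from this report is the 3600 s default.

```python
import os
import subprocess

DEFAULT_BUILD_TIMEOUT = 3600  # seconds; the fixed value mentioned in this report


def resolve_build_timeout(env=None, default=DEFAULT_BUILD_TIMEOUT):
    """Return the build timeout in seconds, or None to disable it.

    IMAGE_BUILD_TIMEOUT is a hypothetical variable name, not a real Fuel setting.
    """
    env = os.environ if env is None else env
    raw = env.get("IMAGE_BUILD_TIMEOUT")
    if raw is None or raw.strip() == "":
        return default
    value = int(raw)  # raises ValueError on non-numeric input
    if value < 0:
        raise ValueError("IMAGE_BUILD_TIMEOUT must be >= 0")
    return value or None  # 0 means "no timeout"


def run_build(cmd, env=None):
    """Run the image build command, honouring the configured timeout."""
    timeout = resolve_build_timeout(env)
    # subprocess.run raises subprocess.TimeoutExpired if the limit is exceeded.
    return subprocess.run(cmd, check=True, timeout=timeout)
```

On a slow test system the limit could then be raised (or disabled with 0) without changing the code.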

Marek Zawadzki (mzawadzki-f) wrote :
Vladimir Kuklin (vkuklin) wrote :

Marek, 3600 seconds should be more than enough to create a base Debian/Ubuntu chroot. Please evaluate your environment/network performance.

Changed in fuel:
milestone: none → 6.1
assignee: nobody → Fuel provisioning team (fuel-provisioning)
Changed in fuel:
importance: Undecided → High
Dmitry Pyzhov (dpyzhov) wrote :

Nastya, it is an issue with a slow machine and/or networking. I think that it is Medium priority and we have to move it to 7.0.

Changed in fuel:
importance: High → Medium
milestone: 6.1 → 7.0
Marek Zawadzki (mzawadzki-f) wrote :

I agree it's low/medium priority.

One more comment: if the timer is based on real time (current_time - start_time), then the timeout will also occur on high-performance systems if they get suspended and resumed during deployment (which may be a rare case though).
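
One way to avoid charging a suspend/resume gap against the timeout is to accumulate elapsed time in small polling steps and cap each step. The sketch below is an assumption about how that could be done, not how Fuel measures time; ActiveTimeBudget, max_step and the injectable clock are hypothetical names introduced here.

```python
import time


class ActiveTimeBudget:
    """Count only time observed in small steps, so a long pause is not charged."""

    def __init__(self, budget, max_step=30.0, clock=time.monotonic):
        self.budget = budget        # allowed active seconds
        self.max_step = max_step    # largest gap charged per poll
        self._clock = clock
        self._last = clock()
        self.used = 0.0

    def tick(self):
        """Call periodically; returns True while budget remains."""
        now = self._clock()
        delta = now - self._last
        self._last = now
        # A gap much larger than the polling interval suggests a suspend/resume;
        # charge at most max_step for it.
        self.used += min(max(delta, 0.0), self.max_step)
        return self.used < self.budget
```

The caller would poll tick() every few seconds while the build runs and abort once it returns False. The trade-off is that any stall longer than max_step, not only a suspend, is undercounted.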

Dmitry Pyzhov (dpyzhov)
tags: added: feature
summary: - Deployment error due to timeout: [600] Error running provisioning:
- Failed to execute hook .
+ Do not use fixed timeouts for image building
Changed in fuel:
importance: Medium → Wishlist
status: New → Confirmed
tags: added: feature-image-based
Changed in fuel:
milestone: 7.0 → next
Vladimir Kozhukalov (kozhukalov) wrote :

Please make sure your hard drive is fast enough to deal with the large number of fsync system calls issued by apt-get and debootstrap while building the OS image. For example, if you use qemu + libvirt for your lab, try cache='unsafe', which makes the host system ignore all fsync system calls from the guest OS.

    <disk type='file' device='disk'>
      <driver name='qemu' type='qcow2' cache='unsafe' io='native'/>
      <source file='/var/lib/libvirt/images/fuel-master.qcow2'/>
      <target dev='vda' bus='virtio'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x09' function='0x0'/>
    </disk>

Dmitry Pyzhov (dpyzhov)
Changed in fuel:
milestone: next → 7.0
Alexander Gordeev (a-gordeev) wrote :

Re-assigning to fuel-python with the 'ibp' tag.

Changed in fuel:
assignee: Fuel provisioning team (fuel-provisioning) → Fuel Python Team (fuel-python)
tags: added: ibp
Ihor Kalnytskyi (ikalnytskyi) wrote :

Yeah, I think one hour is more than enough. It usually takes 10-15 minutes to generate a provisioning image on my master node (VM, VirtualBox). In Fuel 7.0 it should be even quicker, since we will mount the master node's filesystem into mcollective (where the image is generated).

Dmitry Pyzhov (dpyzhov)
Changed in fuel:
milestone: 7.0 → 8.0
tags: removed: feature
Changed in fuel:
status: Confirmed → Triaged
importance: Wishlist → Low
Dmitry Pyzhov (dpyzhov) wrote :

Does not affect users now. Marking as technical debt.

tags: added: tech-debt
Dmitry Pyzhov (dpyzhov)
tags: added: area-python
Dmitry Pyzhov (dpyzhov)
Changed in fuel:
milestone: 8.0 → 9.0