Deploy OpenStack Error

Bug #1460653 reported by Pavel Kholkin
This bug affects 2 people
Affects: Fuel for OpenStack
Status: Invalid
Importance: High
Assigned to: Pavel Kholkin

Bug Description

{"build_id": "2015-05-28_20-55-26", "build_number": "478", "release_versions": {"2014.2.2-6.1": {"VERSION": {"build_id": "2015-05-28_20-55-26", "build_number": "478", "api": "1.0", "fuel-library_sha": "09822a44c2298437882062a43c8ab0bcef952524", "nailgun_sha": "4344fe68b5c93d2e7f7254076aab62aa3a612e42", "feature_groups": ["mirantis"], "openstack_version": "2014.2.2-6.1", "production": "docker", "python-fuelclient_sha": "e19f1b65792f84c4a18b5a9473f85ef3ba172fce", "astute_sha": "5d570ae5e03909182db8e284fbe6e4468c0a4e3e", "fuel-ostf_sha": "6b4ddbfd3126b77f79759721e86d395bf106b177", "release": "6.1", "fuelmain_sha": "6b5712a7197672d588801a1816f56f321cbceebd"}}}, "auth_required": true, "api": "1.0", "fuel-library_sha": "09822a44c2298437882062a43c8ab0bcef952524", "nailgun_sha": "4344fe68b5c93d2e7f7254076aab62aa3a612e42", "feature_groups": ["mirantis"], "openstack_version": "2014.2.2-6.1", "production": "docker", "python-fuelclient_sha": "e19f1b65792f84c4a18b5a9473f85ef3ba172fce", "astute_sha": "5d570ae5e03909182db8e284fbe6e4468c0a4e3e", "fuel-ostf_sha": "6b4ddbfd3126b77f79759721e86d395bf106b177", "release": "6.1", "fuelmain_sha": "6b5712a7197672d588801a1816f56f321cbceebd"}

Tried unsuccessfully to deploy three environments:
1) 1 controller-cinder, 1 compute, neutron vlan, provisioning type default - image
2) 1 controller-cinder, 1 compute, nova-net vlan, provisioning type default - image
3) 1 controller, provisioning type default - classic

Error in fuel-dashboard (for 1-2 cases):

Failed to execute hook 'shell' Failed to run command cd / && fa_build_image --log-file /var/log/fuel-agent-env-1.log --data_driver nailgun_build_image --input_data '{"image_data": {"/boot": {"container": "gzip", "uri": "http://10.177.10.2:8080/targetimages/env_1_ubuntu_1404_amd64-boot.img.gz", "format": "ext2"}, "/": {"container": "gzip", "uri": "http://10.177.10.2:8080/targetimages/env_1_ubuntu_1404_amd64.img.gz", "format": "ext4"}}, "output": "/var/www/nailgun/targetimages", "repos": [{"name": "ubuntu", "section": "main universe multiverse", "uri": "http://archive.ubuntu.com/ubuntu/", "priority": null, "suite": "trusty", "type": "deb"}, {"name": "ubuntu-updates", "section": "main universe multiverse", "uri": "http://archive.ubuntu.com/ubuntu/", "priority": null, "suite": "trusty-updates", "type": "deb"}, {"name": "ubuntu-security", "section": "main universe multiverse", "uri": "http://archive.ubuntu.com/ubuntu/", "priority": null, "suite": "trusty-security", "type": "deb"}, {"name": "mos", "section": "main restricted", "uri": "http://10.177.10.2:8080/2014.2.2-6.1/ubuntu/x86_64", "priority": 1050, "suite": "mos6.1", "type": "deb"}, {"name": "mos-updates", "section": "main restricted", "uri": "http://mirror.fuel-infra.org/mos/ubuntu/", "priority": 1050, "suite": "mos6.1-updates", "type": "deb"}, {"name": "mos-security", "section": "main restricted", "uri": "http://mirror.fuel-infra.org/mos/ubuntu/", "priority": 1050, "suite": "mos6.1-security", "type": "deb"}, {"name": "mos-holdback", "section": "main restricted", "uri": "http://mirror.fuel-infra.org/mos/ubuntu/", "priority": 1100, "suite": "mos6.1-holdback", "type": "deb"}, {"name": "Auxiliary", "section": "main restricted", "uri": "http://10.177.10.2:8080/2014.2.2-6.1/ubuntu/auxiliary", "priority": 1150, "suite": "auxiliary", "type": "deb"}], "codename": "trusty"}'

Error in fuel-dashboard (for 3 case):

Too many nodes failed to provision

Diagnostic snapshot is attached.

Revision history for this message
Pavel Kholkin (pkholkin) wrote :
Changed in fuel:
milestone: none → 6.1
assignee: nobody → Fuel provisioning team (fuel-provisioning)
importance: Undecided → High
Changed in fuel:
status: New → Confirmed
Revision history for this message
Alexander Gordeev (a-gordeev) wrote :

All provisioning methods failed due to a package checksum mismatch.

classic http://paste.openstack.org/show/256221/

image http://paste.openstack.org/show/256210/

Has this ISO passed BVT for Ubuntu?

From my point of view, it looks like just a broken ISO.

Revision history for this message
Pavel Kholkin (pkholkin) wrote :

It was the latest ISO at that moment (478, 6.1) and it was fully green at mos-iso-status.

Revision history for this message
Aleksandra Fedorova (bookwar) wrote :

This ISO passed BVT for Ubuntu with no problems:

  http://jenkins-product.srt.mirantis.net:8080/job/6.1.ubuntu.bvt_2/481/console

When we re-trigger BVT for an ISO, the possible changes are: a change in the fuel-qa code, or a change in upstream Ubuntu packages, which are downloaded from an external repo.

Dmitry Pyzhov (dpyzhov)
Changed in fuel:
assignee: Fuel provisioning team (fuel-provisioning) → Aleksandr Gordeev (a-gordeev)
Revision history for this message
Dmitry Pyzhov (dpyzhov) wrote :

It is strange that a well-tested ISO ended up in an installation with broken packages. Aleksander will continue the investigation, but it really looks like a local issue with this environment.

Revision history for this message
Dmitry Borodaenko (angdraug) wrote :

What's the modification timestamp on Packages.gz on the master node? Was it changed after the master node was deployed?
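A quick sketch of that check (the repo index path is an assumption based on the default 6.1 layout on the master node; adjust it if yours differs):

```shell
# Print the modification time of the local repo index so it can be
# compared with the master node's deployment date.
# The default path below is an assumption about the 6.1 layout.
show_index_mtime() {
  if [ -e "$1" ]; then
    stat -c "mtime: %y  file: %n" "$1"
  else
    echo "index not found: $1"
  fi
}
show_index_mtime /var/www/nailgun/ubuntu/x86_64/dists/mos6.1/main/binary-amd64/Packages.gz
```

If the index mtime is later than the master node's deployment date, something rewrote the repo after deployment.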

Revision history for this message
Dmitry Borodaenko (angdraug) wrote :

P.S. All debs in the paste have +mos in the version number, indicating that they are packages from fwm, not upstream Ubuntu.

Revision history for this message
Dmitry Pyzhov (dpyzhov) wrote :

Could you check that the md5 sum of your ISO is OK? Please reproduce on ISO 497 or newer and give us a live environment.
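One way to run that check (the ISO filename here is a placeholder for whichever build was downloaded; a .md5 file is assumed to have been fetched alongside it):

```shell
# Verify a downloaded ISO against its .md5 checksum file.
# "fuel-6.1-497.iso" is a placeholder for the actual build.
verify_iso() {
  dir=$(dirname "$1"); name=$(basename "$1")
  if [ -f "$1.md5" ]; then
    ( cd "$dir" && md5sum -c "$name.md5" )  # prints "<name>: OK" on success
  else
    echo "checksum file $1.md5 not found"
  fi
}
verify_iso ./fuel-6.1-497.iso
```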

Changed in fuel:
status: Confirmed → Incomplete
Revision history for this message
Alexander Gordeev (a-gordeev) wrote :

fuel-agent couldn't build an image due to massive package checksum mismatches.

For an unknown reason, all packages in the Ubuntu repo hosted on the master node were zeroed out. See the attachment (all broken packages are zero-length).

dmesg output from the master node indicates FS recoveries. I'm not sure whether they are related to the zeroed packages. All recoveries happened on May 29 18:40:29-33 http://paste.openstack.org/show/259336/

I couldn't find the reason why the repo was broken or what happened to the master node. Could it have been an accidental power outage?

If re-deploying the master node helps, and the steps to reproduce remain unknown, the bug will be closed as Invalid.
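A quick way to spot such zero-length packages (the mirror path is an assumption based on the default 6.1 layout; any hit means the local repo is corrupted):

```shell
# List zero-length .deb files under the local Ubuntu mirror.
# The default path is an assumption about the Fuel 6.1 layout.
find_zeroed_debs() {
  find "$1" -name '*.deb' -size 0 2>/dev/null
}
find_zeroed_debs /var/www/nailgun/ubuntu/x86_64
```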

Changed in fuel:
assignee: Aleksandr Gordeev (a-gordeev) → Pavel Kholkin (pkholkin)
Revision history for this message
Pavel Kholkin (pkholkin) wrote :

Thanks!
I will deploy the same ISO again and try to reproduce.

Revision history for this message
Pavel Kholkin (pkholkin) wrote :

After re-deploying the nodes, OpenStack was installed successfully. The issue is being moved to Invalid.

Changed in fuel:
status: Incomplete → Invalid
Revision history for this message
Big Switch Networks (fuel-bugs-internal) wrote :

Hi All,

We used fuel-6.1-521-2015-06-08_06-13-27.iso; however, we observed the same issue with a 3-controller-node and 2-compute-node deployment.

Snapshot is attached.

Revision history for this message
Alexander Gordeev (a-gordeev) wrote :

Big Switch Networks, thanks, I've taken a look into your snapshot.

The Ubuntu images were built without any errors, but the build took 1.5h. The deployment failed because the timeout expired: Fuel expects the image to be built within 1h.

This may be caused by an unstable or low-bandwidth connection to the official Ubuntu repos hosted at http://archive.ubuntu.com/ubuntu/

Unfortunately, this has nothing to do with Fuel itself.

To work around it, you can simply re-deploy the same env again.
The image build continues in the background even after the task has timed out.
Please be aware that Ubuntu images are built once per env.

Revision history for this message
Big Switch Networks (fuel-bugs-internal) wrote :

Thanks for the response. I have verified external reachability and network speeds; both are good. Moreover, the connection speed when fetching a file from http://archive.ubuntu.com/ubuntu/ was more than 500KB/s. In addition, I waited long enough to see whether the background task would complete; it seems nothing was happening. I can confirm this from the network interface stats, which showed very little activity.
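For reference, the effective download rate can be measured directly with curl (the Release file URL is just a convenient small, always-present test object; any file on the mirror would do):

```shell
# Measure the effective download rate (bytes/s) from a mirror.
# Prints 0 if the URL cannot be reached.
measure_speed() {
  curl -so /dev/null -w '%{speed_download}\n' "$1"
}
measure_speed http://archive.ubuntu.com/ubuntu/dists/trusty/Release
```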

As always, Thanks for your support..

Revision history for this message
Big Switch Networks (fuel-bugs-internal) wrote :

Latest snapshot file attached.

Revision history for this message
Alexander Gordeev (a-gordeev) wrote :

Big Switch Networks,

env-1 failed due to network related issues:
2015-06-18 22:14:16.486 586 DEBUG fuel_agent.utils.utils [-] Got non-critical error when accessing to http://mirror.fuel-infra.org/mos/ubuntu/dists/mos6.1-updates/Release on 21 attempt: HTTPConnectionPool(host='mirror.fuel-infra.org', port=80): Max retries exceeded with url: /mos/ubuntu/dists/mos6.1-updates/Release (Caused by <class 'socket.gaierror'>: [Errno -3] Temporary failure in name resolution)

DNS-related issues. Please make sure that mirror.fuel-infra.org can be resolved from the master node.
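A minimal resolution check that can be run on the master node (prints a diagnostic either way instead of failing):

```shell
# Check that a hostname resolves from this machine.
check_dns() {
  if getent hosts "$1" >/dev/null; then
    echo "$1 resolves"
  else
    echo "$1 does NOT resolve -- check /etc/resolv.conf"
  fi
}
check_dns mirror.fuel-infra.org
```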

The env-2 logs stop in the middle of the process, but the build was still running when the snapshot was taken. I didn't see any errors.

It looks like your network bandwidth is sufficient. Another thing that can slow down image building is I/O. debootstrap and the apt-* tools make a lot of unnecessary fsync() calls; combined with the filesystem mount options, these can take a lot of time.
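A rough way to gauge whether sync-heavy I/O is the bottleneck on the master node: write a small file with a synchronous flush per block (GNU dd's oflag=dsync mimics the fsync-per-operation pattern). A rate far below the disk's normal write speed points at fsync cost:

```shell
# Rough fsync-bound write throughput check. oflag=dsync forces a
# synchronous flush per block, similar to fsync-heavy tools.
fsync_bench() {
  f=$(mktemp)
  dd if=/dev/zero of="$f" bs=1M count=16 oflag=dsync 2>&1 | tail -n 1
  rm -f "$f"
}
fsync_bench
```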

According to the previous snapshot, building the images took at least 1.5h.
So I highly recommend that you:
1) start the deployment
2) wait until it fails by timeout (1h)
3) wait at least an additional 30 minutes
4) check that the following files have appeared on the master node:
/var/www/nailgun/targetimages/env_{ENV_ID}_ubuntu_1404_amd64.img.gz
/var/www/nailgun/targetimages/env_{ENV_ID}_ubuntu_1404_amd64-boot.img.gz
5) re-deploy the same env with {ENV_ID}. It should not fail.
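Step 4 can be scripted like this (the directory and filename pattern are taken from the paths listed above; pass your environment's id):

```shell
# Check whether both target images for a given env have appeared and
# are non-empty. Directory and names follow the paths listed above.
check_images() {
  env_id=$1; dir=${2:-/var/www/nailgun/targetimages}
  for f in "$dir/env_${env_id}_ubuntu_1404_amd64.img.gz" \
           "$dir/env_${env_id}_ubuntu_1404_amd64-boot.img.gz"; do
    if [ -s "$f" ]; then echo "present: $f"; else echo "missing: $f"; fi
  done
}
check_images 1
```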

Revision history for this message
Big Switch Networks (fuel-bugs-internal) wrote :

Still seeing the same issue with a new deployment, having followed the same procedure as above.

From the master node, DNS resolution and Internet connectivity are working:

[root@fuel ~]# nslookup mirror.fuel-infra.org
Server: 10.9.26.2
Address: 10.9.26.2#53

Non-authoritative answer:
mirror.fuel-infra.org canonical name = seed.fuel-infra.org.
Name: seed.fuel-infra.org
Address: 208.78.244.194
Name: seed.fuel-infra.org
Address: 5.43.231.47

[root@fuel ~]# cd /var/www/nailgun/targetimages/
[root@fuel targetimages]# ls
centos_65_x86_64-boot.img.gz centos_65_x86_64.yaml env_2_ubuntu_1404_amd64.img.gz env_3_ubuntu_1404_amd64-boot.img.gz env_3_ubuntu_1404_amd64.yaml
centos_65_x86_64.img.gz env_2_ubuntu_1404_amd64-boot.img.gz env_2_ubuntu_1404_amd64.yaml env_3_ubuntu_1404_amd64.img.gz
[root@fuel targetimages]# ls -ltr

Attached is the new snapshot file

Revision history for this message
Big Switch Networks (fuel-bugs-internal) wrote :

Found out that Dell R220 systems do not reboot with Fuel 6.1; rebooting always gets stuck at one of the following:

kvm: exiting hardware virtualization
or
sd 0:0:0:0: [sda] Synchronizing SCSI cache

R320 works fine.
