Deployment failed with nova.openstack.common.threadgroup [-] (ProgrammingError) (1146, "Table 'nova.services' doesn't exist")

Bug #1320923 reported by Andrey Sledzinskiy
Affects             Status       Importance  Assigned to      Milestone
Fuel for OpenStack  In Progress  High        Bogdan Dobrelya
4.1.x               Triaged      High        Bogdan Dobrelya
5.0.x               Triaged      High        Bogdan Dobrelya

Bug Description

Reproduced on {"build_id": "2014-05-19_01-10-31", "mirantis": "yes", "build_number": "212", "ostf_sha": "353f918197ec53a00127fd28b9151f248a2a2d30", "nailgun_sha": "ab7f7dfddadfe0e08a39693c6d33aa0250f20142", "production": "docker", "api": "1.0", "fuelmain_sha": "9de65bfdb7e8bc7c0ec6d47dfabf4a65f8a9335b", "astute_sha": "a3432e6e31ffd6f1c56386b2eb54afeacb74750b", "release": "5.0", "fuellib_sha": "b4671dcaa93d45ac219991ed3f89b512342c4777"}

Steps:
1. Create the following cluster: Ubuntu, HA, Flat Nova-Network, Cinder LVM, with 3 controllers and 2 compute nodes
2. Run deployment

Actual result: deployment failed with errors in the nova log

nova.openstack.common.threadgroup [-] (ProgrammingError) (1146, "Table 'nova.services' doesn't exist") 'SELECT services.created_at AS services_created_at, services.updated_at AS services_updated_at, services.deleted_at AS services_deleted_at, services.deleted AS services_deleted, services.id AS services_id, services.host AS services_host, services.`binary` AS services_binary, services.topic AS services_topic, services.report_count AS services_report_count, services.disabled AS services_disabled, services.disabled_reason AS services_disabled_reason \nFROM services \nWHERE services.deleted = %s AND services.host = %s AND services.`binary` = %s \n LIMIT %s' (0, 'node-1', 'nova-cert', 1)

Errors in puppet:

(/Stage[main]/Osnailyfacter::Cluster_ha/Nova_floating_range[10.108.1.128-10.108.1.254]) Could not evaluate: Authentication failed with response code 504

Logs are attached

Andrey Sledzinskiy (asledzinskiy) wrote :

Sergii Golovatiuk (sgolovatiuk) wrote :

Unfortunately, I was not able to reproduce the issue on build 218. It is also not clear why the database was corrupted or not created properly.

Bogdan Dobrelya (bogdando) wrote :

As far as I can see from the logs, the SELECT from nova.services was issued by nova-conductor before the table was created by db_sync:
2014-05-19T14:31:49.235327 node-1 ./node-1.test.domain.local/mysqld.log:2014-05-19T14:31:49.235327+00:00 err: 140519 14:31:49 [Note] WSREP: Synchronized with group, ready for connections

2014-05-19T14:33:15.178683 node-1 ./node-1.test.domain.local/nova-conductor.log:2014-05-19T14:33:15.178683+00:00 debug: 2014-05-19 14:33:08.333 30613 ERROR nova.openstack.common.threadgroup [-] (ProgrammingError) (1146, "Table 'nova.services' doesn't exist") 'SELECT services.created_at AS services_created_at, services.updated_at AS services_updated_at, services.deleted_at AS services_deleted_at, services.deleted AS services_deleted, services.id AS services_id, services.host AS services_host, services.`binary` AS services_binary, services.topic AS services_topic, services.report_count AS services_report_count, services.disabled AS services_disabled, services.disabled_reason AS services_disabled_reason \nFROM services \nWHERE services.deleted = %s AND services.host = %s AND services.`binary` = %s \n LIMIT %s' (0, 'node-1', 'nova-conductor', 1)

2014-05-19T14:36:27.544071 node-1 ./node-1.test.domain.local/puppet-apply.log:2014-05-19T14:36:27.544071+00:00 notice: (/Stage[main]/Nova::Api/Exec[nova-db-sync]) Triggered 'refresh' from 2 events

And it looks like that is the issue.
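The ordering argument above can be checked mechanically. The following is a minimal Python sketch, assuming only the three syslog timestamps quoted in this comment; the messages are abbreviated, and the names `EVENTS`, `timeline` and `gap_seconds` are hypothetical helpers introduced for illustration, not part of any Fuel or Nova tooling.

```python
from datetime import datetime

# Syslog timestamps copied from the three log lines quoted above.
# The message text is abbreviated for readability.
EVENTS = [
    ("2014-05-19T14:31:49.235327", "mysqld",
     "WSREP: Synchronized with group, ready for connections"),
    ("2014-05-19T14:33:15.178683", "nova-conductor",
     "ProgrammingError 1146: Table 'nova.services' doesn't exist"),
    ("2014-05-19T14:36:27.544071", "puppet",
     "Exec[nova-db-sync] triggered 'refresh'"),
]


def parse(ts):
    """Parse an ISO-like timestamp with microseconds."""
    return datetime.strptime(ts, "%Y-%m-%dT%H:%M:%S.%f")


def timeline(events):
    """Return the events sorted by timestamp, oldest first."""
    return sorted(events, key=lambda event: parse(event[0]))


def gap_seconds(events, earlier_source, later_source):
    """Seconds elapsed between the events of two named sources."""
    times = {source: parse(ts) for ts, source, _ in events}
    return (times[later_source] - times[earlier_source]).total_seconds()


if __name__ == "__main__":
    for ts, source, message in timeline(EVENTS):
        print(ts, source, message)
    # A positive gap means the failing query was logged before db sync ran.
    print(gap_seconds(EVENTS, "nova-conductor", "puppet") > 0)
```

A positive gap between the nova-conductor error and the puppet `nova-db-sync` refresh is consistent with the query having been issued before the schema existed.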

Changed in fuel:
status: New → Triaged
milestone: 5.0 → 5.1

OpenStack Infra (hudson-openstack) wrote : Fix proposed to fuel-library (master)

Fix proposed to branch: master
Review: https://review.openstack.org/98392

Changed in fuel:
assignee: Fuel Library Team (fuel-library) → Bogdan Dobrelya (bogdando)
status: Triaged → In Progress

Dmitry Borodaenko (angdraug) wrote :

According to duplicate bug https://bugs.launchpad.net/fuel/+bug/1326384 reported today, this is a more generic problem that also happens with other OpenStack services (at least cinder).

Dmitry Borodaenko (angdraug) wrote :

The other duplicate https://bugs.launchpad.net/fuel/+bug/1326384 was also about cinder.

Dmitry Borodaenko (angdraug) wrote :

Link in comment #5 should have been https://bugs.launchpad.net/fuel/+bug/1334005

Dmitry Borodaenko (angdraug) wrote :

Aleksandr, please provide the stacktrace from cinder-volume failure. According to line 36 in cinder/manifests/volume.pp (Exec['cinder-manage db_sync'] -> Service['cinder-volume']), this should not be a problem for cinder.

Dmitry Borodaenko (angdraug) wrote :

Bogdan, at 14:33:15 the nova-conductor service was started by dpkg, not by puppet. At that point, the database connection was already configured in nova.conf, but "nova-manage db sync" had not yet been run.
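The situation described here (a service querying a schema that has not been created yet) is the kind of race a readiness guard avoids. Below is a minimal Python sketch of such a guard, not the fix that was merged for this bug. It uses an in-memory SQLite database as a stand-in so it runs anywhere; against MySQL the equivalent check would query `information_schema.tables` instead of `sqlite_master`. The helper names `table_exists` and `wait_until` are hypothetical.

```python
import sqlite3
import time


def table_exists(conn, table):
    """Return True if `table` exists (SQLite stand-in for the real DB)."""
    row = conn.execute(
        "SELECT 1 FROM sqlite_master WHERE type = 'table' AND name = ?",
        (table,),
    ).fetchone()
    return row is not None


def wait_until(check, timeout=60.0, interval=1.0,
               sleep=time.sleep, clock=time.monotonic):
    """Poll check() until it returns True or `timeout` seconds elapse.

    Returns True if the condition was met, False on timeout.
    """
    deadline = clock() + timeout
    while True:
        if check():
            return True
        if clock() >= deadline:
            return False
        sleep(interval)


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    # Before the schema is created, the guard reports "not ready".
    print(wait_until(lambda: table_exists(conn, "services"),
                     timeout=0, interval=0))
    # Stand-in for "db sync": create the table the service expects.
    conn.execute("CREATE TABLE services (id INTEGER PRIMARY KEY)")
    print(wait_until(lambda: table_exists(conn, "services"),
                     timeout=0, interval=0))
```

A guard like this only hides the symptom on the service side. Making sure the package manager does not start the service before the schema sync has run addresses the ordering itself.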

Bogdan Dobrelya (bogdando) wrote :

Got it. Then it looks like all of these issues are race conditions between Galera cluster deployment and DB syncs from OpenStack services. I will mark this issue as a duplicate then.

OpenStack Infra (hudson-openstack) wrote : Change abandoned on fuel-library (master)

Change abandoned by Bogdan Dobrelya on branch: master
Review: https://review.openstack.org/98392
Reason: superseded by https://review.openstack.org/#/c/86351/
