Required to define database/connection when running services for nova_api cell

Bug #1757472 reported by Belmiro Moreira on 2018-03-21
This bug affects 2 people
Affects                    Importance   Assigned to
OpenStack Compute (nova)   Medium       Unassigned
Pike                       Medium       Unassigned
Queens                     Medium       Unassigned

Bug Description

Services in nova_api cell fail to run if database/connection is not defined.
These services should only use api_database/connection.

In devstack, database/connection is set to the cell0 DB endpoint.
This shouldn't be required because the cell0 mapping is already stored in the nova_api DB.

Changed in nova:
assignee: nobody → Surya Seetharaman (tssurya)
Matt Riedemann (mriedem) wrote :

Do you have some more information about the actual failure or a traceback? I think I've been surprised by this as well before and can't remember if there is a good reason for having [database]/connection set to cell0 when running nova-api.

tags: added: cells

For example, with nova-api:
If database/connection is not specified, it will try to connect to localhost:
2018-03-22 11:14:36.419 19615 WARNING oslo_db.sqlalchemy.engines [req-79c428ff-1c14-4225-9530-18e948f0798f - - - - -] SQL connection failed. -1 attempts left.: DBConnectionError: (_mysql_exceptions.OperationalError) (2002, 'Can\'t connect to local MySQL server through socket \'/var/lib/mysql/mysql.sock\' (2 "No such file or directory")') (Background on this error at: http://sqlalche.me/e/e3q8)
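
One way to see which of the two options a given nova.conf defines is to parse the file before starting the service. The following is a minimal sketch using only Python's standard configparser module. The section and option names come from this report. The sample file contents, the hostname, and the helper name missing_connections are made up for illustration. Nova itself reads its configuration through oslo.config, so this only approximates how the service sees the file.

```python
import configparser

# Sections whose "connection" option is discussed in this report.
SECTIONS = ("api_database", "database")


def missing_connections(conf_text):
    """Return the sections that lack a non-empty "connection" option."""
    # interpolation=None so that '%' in a password is not treated specially.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(conf_text)
    missing = []
    for section in SECTIONS:
        # fallback covers both a missing section and a missing option.
        value = parser.get(section, "connection", fallback="")
        if not value.strip():
            missing.append(section)
    return missing


# Illustrative nova.conf for an API-level service: only the API DB is set.
SAMPLE = """
[DEFAULT]
debug = true

[api_database]
connection = mysql+pymysql://nova:secret@db.example.org/nova_api
"""

print(missing_connections(SAMPLE))
```

For the sample above, the helper reports that only the database section has no connection defined, which is the configuration this bug is about.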

Matt Riedemann (mriedem) wrote :

OK I'll push a devstack patch to not set [database]/connection in nova.conf and see what blows up.

Matt Riedemann (mriedem) wrote :

This seems to be where we blow up when [database]/connection isn't set for nova-api:

http://logs.openstack.org/46/555346/2/check/tempest-full/69cf0dc/controller/logs/screen-n-api.txt.gz#_Mar_24_00_53_19_452955

Mar 24 00:53:19.452955 ubuntu-xenial-inap-mtl01-0003167397 <email address hidden>[5098]: CRITICAL nova [None req-a70dbcb7-492d-4857-87ae-b17e6fab3dce None None] Unhandled error: ArgumentError: Could not parse rfc1738 URL from string 'None'
Mar 24 00:53:19.453193 ubuntu-xenial-inap-mtl01-0003167397 <email address hidden>[5098]: ERROR nova Traceback (most recent call last):
Mar 24 00:53:19.453423 ubuntu-xenial-inap-mtl01-0003167397 <email address hidden>[5098]: ERROR nova File "/usr/local/bin/nova-api-wsgi", line 54, in <module>
Mar 24 00:53:19.453619 ubuntu-xenial-inap-mtl01-0003167397 <email address hidden>[5098]: ERROR nova application = init_application()
Mar 24 00:53:19.453902 ubuntu-xenial-inap-mtl01-0003167397 <email address hidden>[5098]: ERROR nova File "/opt/stack/nova/nova/api/openstack/compute/wsgi.py", line 20, in init_application
Mar 24 00:53:19.454107 ubuntu-xenial-inap-mtl01-0003167397 <email address hidden>[5098]: ERROR nova return wsgi_app.init_application(NAME)
Mar 24 00:53:19.454297 ubuntu-xenial-inap-mtl01-0003167397 <email address hidden>[5098]: ERROR nova File "/opt/stack/nova/nova/api/openstack/wsgi_app.py", line 82, in init_application
Mar 24 00:53:19.454497 ubuntu-xenial-inap-mtl01-0003167397 <email address hidden>[5098]: ERROR nova _setup_service(CONF.host, name)
Mar 24 00:53:19.454687 ubuntu-xenial-inap-mtl01-0003167397 <email address hidden>[5098]: ERROR nova File "/opt/stack/nova/nova/api/openstack/wsgi_app.py", line 49, in _setup_service
Mar 24 00:53:19.454880 ubuntu-xenial-inap-mtl01-0003167397 <email address hidden>[5098]: ERROR nova ctxt, host, binary)
Mar 24 00:53:19.455077 ubuntu-xenial-inap-mtl01-0003167397 <email address hidden>[5098]: ERROR nova File "/usr/local/lib/python2.7/dist-packages/oslo_versionedobjects/base.py", line 184, in wrapper
Mar 24 00:53:19.455286 ubuntu-xenial-inap-mtl01-0003167397 <email address hidden>[5098]: ERROR nova result = fn(cls, context, *args, **kwargs)
Mar 24 00:53:19.455486 ubuntu-xenial-inap-mtl01-0003167397 <email address hidden>[5098]: ERROR nova File "/opt/stack/nova/nova/objects/service.py", line 314, in get_by_host_and_binary
Mar 24 00:53:19.455711 ubuntu-xenial-inap-mtl01-0003167397 <email address hidden>[5098]: ERROR nova host, binary)
Mar 24 00:53:19.456004 ubuntu-xenial-inap-mtl01-0003167397 <email address hidden>[5098]: ERROR nova File "/opt/stack/nova/nova/db/api.py", line 127, in service_get_by_host_and_binary
Mar 24 00:53:19.456297 ubuntu-xenial-inap-mtl01-0003167397 <email address hidden>[5098]: ERROR nova return IMPL.service_get_by_host_and_binary(context, host, binary)
Mar 24 00:53:19.456589 ubuntu-xenial-inap-mtl01-0003167397 <email address hidden>[5098]: ERROR nova File "/opt/stack/nova/nova/db/sqlalchemy/api.py", line 239, in wrapped
Mar 24 00:53:19.456837 ubuntu-xenial-inap-mtl01-0003167397 <email address hidden>[5098]: ERROR nova with ctxt_mgr.reader.using(context):
Mar 24 00:53:19.457037 ubuntu-xen...


Matt Riedemann (mriedem) wrote :

The problem is this code is not cell-aware:

https://github.com/openstack/nova/blob/2ecb99939ec15057904d1b86c4478def29e193db/nova/api/openstack/wsgi_app.py#L48

It should look up the cell0 mapping and then use that context for finding the service record entry and creating it in the cell0 DB.
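
The cell-aware lookup can be sketched without nova itself. In the toy model below, plain dictionaries stand in for the cell mappings kept in the API database and for the services table of each cell database. CELL0_UUID is assumed to be the all-zeros UUID conventionally used for cell0. Every other name (CellDatabase, ensure_api_service, the host and binary strings, the database URL) is hypothetical and is not nova code.

```python
# Toy model: nothing here imports nova; all names are illustrative stand-ins.
CELL0_UUID = "00000000-0000-0000-0000-000000000000"  # assumed cell0 UUID


class CellDatabase:
    """Stand-in for one cell database holding service records."""

    def __init__(self):
        self.services = {}  # (host, binary) -> record dict


def ensure_api_service(cell_mappings, cell_dbs, host, binary):
    """Find or create the service record in cell0, located via its mapping.

    cell_mappings: cell UUID -> database URL (what the API DB stores).
    cell_dbs: database URL -> CellDatabase (stands in for opening a
        connection to that URL).
    """
    # Step 1: look up the cell0 mapping instead of reading
    # [database]/connection from the local configuration.
    cell0_url = cell_mappings[CELL0_UUID]
    # Step 2: target that cell's database.
    cell0 = cell_dbs[cell0_url]
    # Step 3: find the service record there, creating it if absent.
    key = (host, binary)
    if key not in cell0.services:
        cell0.services[key] = {"host": host, "binary": binary}
    return cell0.services[key]


mappings = {CELL0_UUID: "mysql+pymysql://db.example.org/nova_cell0"}
dbs = {"mysql+pymysql://db.example.org/nova_cell0": CellDatabase()}
record = ensure_api_service(mappings, dbs, "controller", "nova-osapi_compute")
print(record)
```

The point of the sketch is only the order of operations: the cell0 database is reached through the mapping stored in the API database, so the service never needs a locally configured [database]/connection.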

Changed in nova:
status: New → Triaged
importance: Undecided → Medium

Fix proposed to branch: master
Review: https://review.openstack.org/556670

Changed in nova:
assignee: Surya Seetharaman (tssurya) → Matt Riedemann (mriedem)
status: Triaged → In Progress
Matt Riedemann (mriedem) wrote :

This is a failure I get in nova-conductor without having [database]/connection set:

http://logs.openstack.org/46/555346/3/check/tempest-full/d4a85cf/controller/logs/screen-n-super-cond.txt.gz?level=TRACE#_Mar_27_01_15_28_067478

Mar 27 01:15:28.067478 ubuntu-xenial-ovh-bhs1-0003203353 nova-conductor[12477]: ERROR oslo_service.service [None req-1307238b-460a-475f-8617-3ef1a8cc9454 None None] Error starting thread.: ArgumentError: Could not parse rfc1738 URL from string 'None'
Mar 27 01:15:28.067623 ubuntu-xenial-ovh-bhs1-0003203353 nova-conductor[12477]: ERROR oslo_service.service Traceback (most recent call last):
Mar 27 01:15:28.067750 ubuntu-xenial-ovh-bhs1-0003203353 nova-conductor[12477]: ERROR oslo_service.service File "/usr/local/lib/python2.7/dist-packages/oslo_service/service.py", line 729, in run_service
Mar 27 01:15:28.067877 ubuntu-xenial-ovh-bhs1-0003203353 nova-conductor[12477]: ERROR oslo_service.service service.start()
Mar 27 01:15:28.068003 ubuntu-xenial-ovh-bhs1-0003203353 nova-conductor[12477]: ERROR oslo_service.service File "/opt/stack/nova/nova/service.py", line 166, in start
Mar 27 01:15:28.068136 ubuntu-xenial-ovh-bhs1-0003203353 nova-conductor[12477]: ERROR oslo_service.service ctxt, self.host, self.binary)
Mar 27 01:15:28.068261 ubuntu-xenial-ovh-bhs1-0003203353 nova-conductor[12477]: ERROR oslo_service.service File "/usr/local/lib/python2.7/dist-packages/oslo_versionedobjects/base.py", line 184, in wrapper
Mar 27 01:15:28.068405 ubuntu-xenial-ovh-bhs1-0003203353 nova-conductor[12477]: ERROR oslo_service.service result = fn(cls, context, *args, **kwargs)
Mar 27 01:15:28.068536 ubuntu-xenial-ovh-bhs1-0003203353 nova-conductor[12477]: ERROR oslo_service.service File "/opt/stack/nova/nova/objects/service.py", line 314, in get_by_host_and_binary
Mar 27 01:15:28.068662 ubuntu-xenial-ovh-bhs1-0003203353 nova-conductor[12477]: ERROR oslo_service.service host, binary)
Mar 27 01:15:28.068787 ubuntu-xenial-ovh-bhs1-0003203353 nova-conductor[12477]: ERROR oslo_service.service File "/opt/stack/nova/nova/db/api.py", line 127, in service_get_by_host_and_binary
Mar 27 01:15:28.068925 ubuntu-xenial-ovh-bhs1-0003203353 nova-conductor[12477]: ERROR oslo_service.service return IMPL.service_get_by_host_and_binary(context, host, binary)
Mar 27 01:15:28.069055 ubuntu-xenial-ovh-bhs1-0003203353 nova-conductor[12477]: ERROR oslo_service.service File "/opt/stack/nova/nova/db/sqlalchemy/api.py", line 239, in wrapped
Mar 27 01:15:28.069175 ubuntu-xenial-ovh-bhs1-0003203353 nova-conductor[12477]: ERROR oslo_service.service with ctxt_mgr.reader.using(context):
Mar 27 01:15:28.069298 ubuntu-xenial-ovh-bhs1-0003203353 nova-conductor[12477]: ERROR oslo_service.service File "/usr/lib/python2.7/contextlib.py", line 17, in __enter__
Mar 27 01:15:28.069412 ubuntu-xenial-ovh-bhs1-0003203353 nova-conductor[12477]: ERROR oslo_service.service return self.gen.next()
Mar 27 01:15:28.069532 ubuntu-xenial-ovh-bhs1-0003203353 nova-conductor[12477]: ERROR oslo_service.service File "/usr/local/lib/python2.7/dist-packages/oslo_db/sqlalchemy/enginefacade.py", line 1041, in _transaction_scope
Mar 27 01:15:28.069651 ubuntu-...


Matt Riedemann (mriedem) wrote :

This is where nova-conductor fails to start:

https://github.com/openstack/nova/blob/ed55dcad83d5db2fa7e43fc3d5465df1550b554c/nova/service.py#L147

This gets tricky for conductor because if it's a superconductor, then we don't need [database]/connection, but if it's a cell conductor, then we do. So fixing conductor startup would likely require checking whether CONF.database.connection is set and, if not, assuming cell0.

Surya Seetharaman (tssurya) wrote :

However, doing this means that if an operator forgets to set CONF.database.connection in nova_cell1.conf for the cell, the conductor will silently point at cell0 without emitting a warning, and the operator will go around in circles without figuring out what is wrong. But I guess that since cell0 is the default cell, it is okay to do so? I am not sure.
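
The trade-off discussed in the last two comments can be illustrated with a small sketch: fall back to cell0 when no connection is configured, but log a warning so that a forgotten option does not go unnoticed. This is not the proposed patch. The function name pick_cell_connection and its plain-string arguments are assumptions made for illustration only.

```python
import logging

LOG = logging.getLogger(__name__)


def pick_cell_connection(database_connection, cell0_connection):
    """Return the DB URL a conductor would use, falling back to cell0.

    Both arguments are plain strings (or None) in this sketch; in a real
    service they would come from the configuration and the cell0 mapping.
    """
    if database_connection:
        return database_connection
    # Make the fallback visible so a missing option is easy to diagnose.
    LOG.warning("[database]/connection is not set; assuming cell0")
    return cell0_connection


CELL0_URL = "mysql+pymysql://db.example.org/nova_cell0"  # illustrative URL
print(pick_cell_connection(None, CELL0_URL))
print(pick_cell_connection("mysql+pymysql://db.example.org/nova_cell1", CELL0_URL))
```

With the warning in place, a cell conductor started from an incomplete nova_cell1.conf would still come up against cell0, but the log would say why.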

Change abandoned by Matt Riedemann (<email address hidden>) on branch: master
Review: https://review.openstack.org/556670

Matt Riedemann (mriedem) on 2018-09-29
Changed in nova:
status: In Progress → Triaged
assignee: Matt Riedemann (mriedem) → nobody