Required to define database/connection when running services for nova_api cell

Bug #1757472 reported by Belmiro Moreira
This bug affects 2 people
Affects                   Status   Importance  Assigned to  Milestone
OpenStack Compute (nova)  Triaged  Medium      Unassigned
Pike                      Triaged  Medium      Unassigned
Queens                    Triaged  Medium      Unassigned

Bug Description

Services in nova_api cell fail to run if database/connection is not defined.
These services should only use api_database/connection.

In devstack, database/connection is set to the cell0 DB endpoint.
This shouldn't be required, because cell0 is already mapped in the nova_api DB.
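
To make the report concrete, here is a minimal sketch of the two configurations being compared. It parses them with Python's standard configparser, not oslo.config. Only the option names api_database/connection and database/connection come from the report. The connection URLs, host names, and the helper `configured_connections` are made-up placeholders.

```python
import configparser

# Hypothetical nova.conf for an API-level service: only the API database
# is configured. Host names and credentials are placeholders.
NOVA_CONF_API_ONLY = """
[api_database]
connection = mysql+pymysql://nova:secret@api-db.example.org/nova_api
"""

# What the report says devstack does: [database]/connection additionally
# points at the cell0 database.
NOVA_CONF_DEVSTACK = NOVA_CONF_API_ONLY + """
[database]
connection = mysql+pymysql://nova:secret@api-db.example.org/nova_cell0
"""


def configured_connections(conf_text):
    """Return {section: connection or None} for the two DB sections."""
    parser = configparser.ConfigParser()
    parser.read_string(conf_text)
    return {
        section: parser.get(section, "connection", fallback=None)
        for section in ("api_database", "database")
    }


api_only = configured_connections(NOVA_CONF_API_ONLY)
devstack = configured_connections(NOVA_CONF_DEVSTACK)
print(api_only)
print(devstack)
```

In the first configuration the `database` entry comes back as `None`, which is the situation the report says API-level services should tolerate.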

Tags: cells
Changed in nova:
assignee: nobody → Surya Seetharaman (tssurya)
Matt Riedemann (mriedem) wrote :

Do you have some more information about the actual failure or a traceback? I think I've been surprised by this as well before and can't remember if there is a good reason for having [database]/connection set to cell0 when running nova-api.

tags: added: cells
Belmiro Moreira (moreira-belmiro-email-lists) wrote :

For example, with nova-api:
if [database]/connection is not specified, it tries to connect to localhost:
2018-03-22 11:14:36.419 19615 WARNING oslo_db.sqlalchemy.engines [req-79c428ff-1c14-4225-9530-18e948f0798f - - - - -] SQL connection failed. -1 attempts left.: DBConnectionError: (_mysql_exceptions.OperationalError) (2002, 'Can\'t connect to local MySQL server through socket \'/var/lib/mysql/mysql.sock\' (2 "No such file or directory")') (Background on this error at: http://sqlalche.me/e/e3q8)

Matt Riedemann (mriedem) wrote :

OK I'll push a devstack patch to not set [database]/connection in nova.conf and see what blows up.

Matt Riedemann (mriedem) wrote :

This seems to be where we blow up when [database]/connection isn't set for nova-api:

http://logs.openstack.org/46/555346/2/check/tempest-full/69cf0dc/controller/logs/screen-n-api.txt.gz#_Mar_24_00_53_19_452955

Mar 24 00:53:19.452955 ubuntu-xenial-inap-mtl01-0003167397 <email address hidden>[5098]: CRITICAL nova [None req-a70dbcb7-492d-4857-87ae-b17e6fab3dce None None] Unhandled error: ArgumentError: Could not parse rfc1738 URL from string 'None'
Mar 24 00:53:19.453193 ubuntu-xenial-inap-mtl01-0003167397 <email address hidden>[5098]: ERROR nova Traceback (most recent call last):
Mar 24 00:53:19.453423 ubuntu-xenial-inap-mtl01-0003167397 <email address hidden>[5098]: ERROR nova File "/usr/local/bin/nova-api-wsgi", line 54, in <module>
Mar 24 00:53:19.453619 ubuntu-xenial-inap-mtl01-0003167397 <email address hidden>[5098]: ERROR nova application = init_application()
Mar 24 00:53:19.453902 ubuntu-xenial-inap-mtl01-0003167397 <email address hidden>[5098]: ERROR nova File "/opt/stack/nova/nova/api/openstack/compute/wsgi.py", line 20, in init_application
Mar 24 00:53:19.454107 ubuntu-xenial-inap-mtl01-0003167397 <email address hidden>[5098]: ERROR nova return wsgi_app.init_application(NAME)
Mar 24 00:53:19.454297 ubuntu-xenial-inap-mtl01-0003167397 <email address hidden>[5098]: ERROR nova File "/opt/stack/nova/nova/api/openstack/wsgi_app.py", line 82, in init_application
Mar 24 00:53:19.454497 ubuntu-xenial-inap-mtl01-0003167397 <email address hidden>[5098]: ERROR nova _setup_service(CONF.host, name)
Mar 24 00:53:19.454687 ubuntu-xenial-inap-mtl01-0003167397 <email address hidden>[5098]: ERROR nova File "/opt/stack/nova/nova/api/openstack/wsgi_app.py", line 49, in _setup_service
Mar 24 00:53:19.454880 ubuntu-xenial-inap-mtl01-0003167397 <email address hidden>[5098]: ERROR nova ctxt, host, binary)
Mar 24 00:53:19.455077 ubuntu-xenial-inap-mtl01-0003167397 <email address hidden>[5098]: ERROR nova File "/usr/local/lib/python2.7/dist-packages/oslo_versionedobjects/base.py", line 184, in wrapper
Mar 24 00:53:19.455286 ubuntu-xenial-inap-mtl01-0003167397 <email address hidden>[5098]: ERROR nova result = fn(cls, context, *args, **kwargs)
Mar 24 00:53:19.455486 ubuntu-xenial-inap-mtl01-0003167397 <email address hidden>[5098]: ERROR nova File "/opt/stack/nova/nova/objects/service.py", line 314, in get_by_host_and_binary
Mar 24 00:53:19.455711 ubuntu-xenial-inap-mtl01-0003167397 <email address hidden>[5098]: ERROR nova host, binary)
Mar 24 00:53:19.456004 ubuntu-xenial-inap-mtl01-0003167397 <email address hidden>[5098]: ERROR nova File "/opt/stack/nova/nova/db/api.py", line 127, in service_get_by_host_and_binary
Mar 24 00:53:19.456297 ubuntu-xenial-inap-mtl01-0003167397 <email address hidden>[5098]: ERROR nova return IMPL.service_get_by_host_and_binary(context, host, binary)
Mar 24 00:53:19.456589 ubuntu-xenial-inap-mtl01-0003167397 <email address hidden>[5098]: ERROR nova File "/opt/stack/nova/nova/db/sqlalchemy/api.py", line 239, in wrapped
Mar 24 00:53:19.456837 ubuntu-xenial-inap-mtl01-0003167397 <email address hidden>[5098]: ERROR nova with ctxt_mgr.reader.using(context):
Mar 24 00:53:19.457037 ubuntu-xen...


Matt Riedemann (mriedem) wrote :

The problem is this code is not cell-aware:

https://github.com/openstack/nova/blob/2ecb99939ec15057904d1b86c4478def29e193db/nova/api/openstack/wsgi_app.py#L48

It should look up the cell0 mapping and then use that context to find the service record entry and create it in the cell0 DB.
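
A minimal, self-contained sketch of the cell-aware approach described above. It does not use nova's real objects: `CellMapping`, `ApiDatabase`, `FakeCellDB`, and `setup_service` are stand-ins invented for illustration. The all-zero cell0 UUID, the connection string, and the host and binary names are assumptions. The point is only the control flow: resolve the cell0 mapping from the API database, then find or create the service record in that cell's database rather than relying on [database]/connection.

```python
from dataclasses import dataclass, field

# Assumed identifier for cell0 (illustrative; not taken from the report).
CELL0_UUID = "00000000-0000-0000-0000-000000000000"


@dataclass
class CellMapping:
    """Stand-in for a cell mapping row stored in the API database."""
    uuid: str
    name: str
    database_connection: str


@dataclass
class FakeCellDB:
    """Stand-in for one cell database: just a table of service records."""
    services: dict = field(default_factory=dict)


@dataclass
class ApiDatabase:
    """Stand-in for the nova_api DB, reached via [api_database]/connection."""
    cell_mappings: dict = field(default_factory=dict)

    def get_cell_mapping(self, uuid):
        return self.cell_mappings[uuid]


def setup_service(api_db, cell_dbs, host, binary):
    """Find or create the service record in cell0, located via the API DB.

    No global [database]/connection is consulted: the connection string
    comes from the cell0 mapping.
    """
    cell0 = api_db.get_cell_mapping(CELL0_UUID)
    cell0_db = cell_dbs[cell0.database_connection]
    key = (host, binary)
    if key not in cell0_db.services:
        cell0_db.services[key] = {"host": host, "binary": binary}
    return cell0_db.services[key]


# Usage with placeholder values.
cell0_conn = "mysql+pymysql://nova@db.example.org/nova_cell0"
api_db = ApiDatabase({CELL0_UUID: CellMapping(CELL0_UUID, "cell0", cell0_conn)})
cell_dbs = {cell0_conn: FakeCellDB()}

record = setup_service(api_db, cell_dbs, "controller", "nova-osapi_compute")
print(record)
```

Calling `setup_service` a second time with the same host and binary returns the existing record instead of creating a duplicate.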

Changed in nova:
status: New → Triaged
importance: Undecided → Medium
OpenStack Infra (hudson-openstack) wrote : Fix proposed to nova (master)

Fix proposed to branch: master
Review: https://review.openstack.org/556670

Changed in nova:
assignee: Surya Seetharaman (tssurya) → Matt Riedemann (mriedem)
status: Triaged → In Progress
Matt Riedemann (mriedem) wrote :

This is a failure I get in nova-conductor without having [database]/connection set:

http://logs.openstack.org/46/555346/3/check/tempest-full/d4a85cf/controller/logs/screen-n-super-cond.txt.gz?level=TRACE#_Mar_27_01_15_28_067478

Mar 27 01:15:28.067478 ubuntu-xenial-ovh-bhs1-0003203353 nova-conductor[12477]: ERROR oslo_service.service [None req-1307238b-460a-475f-8617-3ef1a8cc9454 None None] Error starting thread.: ArgumentError: Could not parse rfc1738 URL from string 'None'
Mar 27 01:15:28.067623 ubuntu-xenial-ovh-bhs1-0003203353 nova-conductor[12477]: ERROR oslo_service.service Traceback (most recent call last):
Mar 27 01:15:28.067750 ubuntu-xenial-ovh-bhs1-0003203353 nova-conductor[12477]: ERROR oslo_service.service File "/usr/local/lib/python2.7/dist-packages/oslo_service/service.py", line 729, in run_service
Mar 27 01:15:28.067877 ubuntu-xenial-ovh-bhs1-0003203353 nova-conductor[12477]: ERROR oslo_service.service service.start()
Mar 27 01:15:28.068003 ubuntu-xenial-ovh-bhs1-0003203353 nova-conductor[12477]: ERROR oslo_service.service File "/opt/stack/nova/nova/service.py", line 166, in start
Mar 27 01:15:28.068136 ubuntu-xenial-ovh-bhs1-0003203353 nova-conductor[12477]: ERROR oslo_service.service ctxt, self.host, self.binary)
Mar 27 01:15:28.068261 ubuntu-xenial-ovh-bhs1-0003203353 nova-conductor[12477]: ERROR oslo_service.service File "/usr/local/lib/python2.7/dist-packages/oslo_versionedobjects/base.py", line 184, in wrapper
Mar 27 01:15:28.068405 ubuntu-xenial-ovh-bhs1-0003203353 nova-conductor[12477]: ERROR oslo_service.service result = fn(cls, context, *args, **kwargs)
Mar 27 01:15:28.068536 ubuntu-xenial-ovh-bhs1-0003203353 nova-conductor[12477]: ERROR oslo_service.service File "/opt/stack/nova/nova/objects/service.py", line 314, in get_by_host_and_binary
Mar 27 01:15:28.068662 ubuntu-xenial-ovh-bhs1-0003203353 nova-conductor[12477]: ERROR oslo_service.service host, binary)
Mar 27 01:15:28.068787 ubuntu-xenial-ovh-bhs1-0003203353 nova-conductor[12477]: ERROR oslo_service.service File "/opt/stack/nova/nova/db/api.py", line 127, in service_get_by_host_and_binary
Mar 27 01:15:28.068925 ubuntu-xenial-ovh-bhs1-0003203353 nova-conductor[12477]: ERROR oslo_service.service return IMPL.service_get_by_host_and_binary(context, host, binary)
Mar 27 01:15:28.069055 ubuntu-xenial-ovh-bhs1-0003203353 nova-conductor[12477]: ERROR oslo_service.service File "/opt/stack/nova/nova/db/sqlalchemy/api.py", line 239, in wrapped
Mar 27 01:15:28.069175 ubuntu-xenial-ovh-bhs1-0003203353 nova-conductor[12477]: ERROR oslo_service.service with ctxt_mgr.reader.using(context):
Mar 27 01:15:28.069298 ubuntu-xenial-ovh-bhs1-0003203353 nova-conductor[12477]: ERROR oslo_service.service File "/usr/lib/python2.7/contextlib.py", line 17, in __enter__
Mar 27 01:15:28.069412 ubuntu-xenial-ovh-bhs1-0003203353 nova-conductor[12477]: ERROR oslo_service.service return self.gen.next()
Mar 27 01:15:28.069532 ubuntu-xenial-ovh-bhs1-0003203353 nova-conductor[12477]: ERROR oslo_service.service File "/usr/local/lib/python2.7/dist-packages/oslo_db/sqlalchemy/enginefacade.py", line 1041, in _transaction_scope
Mar 27 01:15:28.069651 ubuntu-...


Matt Riedemann (mriedem) wrote :

This is where nova-conductor fails to start:

https://github.com/openstack/nova/blob/ed55dcad83d5db2fa7e43fc3d5465df1550b554c/nova/service.py#L147

This gets tricky for conductor: a superconductor doesn't need [database]/connection, but a cell conductor does. So fixing conductor startup would likely require checking whether CONF.database.connection is set and, if not, assuming cell0.

Surya Seetharaman (tssurya) wrote :

However, doing this means that if an operator forgets to set CONF.database.connection in nova_cell1.conf for the cell, the conductor will silently point at cell0 without emitting a warning, and he/she will go in circles trying to figure out what is wrong. But I guess, considering cell0 is the default cell, it is okay to do so? I am not sure.
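
One way to picture the trade-off in this exchange is the sketch below. A hypothetical `resolve_cell_connection` helper falls back to cell0 when no [database]/connection is configured, but logs a warning so that a cell conductor with a forgotten setting at least shows up in the logs. The helper name, its arguments, the warning text, and the connection strings are invented for illustration. Nothing here is nova's actual startup code.

```python
import logging

LOG = logging.getLogger("conductor_sketch")


def resolve_cell_connection(configured_connection, cell0_connection):
    """Pick the DB connection a conductor would use at startup.

    configured_connection: value of [database]/connection, or None if unset.
    cell0_connection: connection string taken from the cell0 mapping.
    Returns (connection, used_fallback).
    """
    if configured_connection:
        return configured_connection, False
    # Unset: assume a superconductor and fall back to cell0, but say so
    # loudly in case this is really a misconfigured cell conductor.
    LOG.warning(
        "[database]/connection is not set; falling back to the cell0 "
        "database. Set it explicitly if this is a cell conductor."
    )
    return cell0_connection, True


logging.basicConfig(level=logging.WARNING)
cell0 = "mysql+pymysql://nova@db.example.org/nova_cell0"
cell1 = "mysql+pymysql://nova@db.example.org/nova_cell1"

print(resolve_cell_connection(cell1, cell0))
print(resolve_cell_connection(None, cell0))
```

Returning the `used_fallback` flag alongside the connection lets a caller decide whether a warning is enough or whether startup should refuse to continue.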

OpenStack Infra (hudson-openstack) wrote : Change abandoned on nova (master)

Change abandoned by Matt Riedemann (<email address hidden>) on branch: master
Review: https://review.openstack.org/556670

Matt Riedemann (mriedem)
Changed in nova:
status: In Progress → Triaged
assignee: Matt Riedemann (mriedem) → nobody