connection to RabbitMQ should be created directly after start of services

Bug #922067 reported by Christian Berendt
This bug affects 1 person
Affects: OpenStack Compute (nova)
Status: Opinion
Importance: Wishlist
Assigned to: Unassigned

Bug Description

I tested starting nova-api with wrong credentials for RabbitMQ. The service started without problems. The first errors about failed connections to RabbitMQ appeared only after I tried to start a new instance. Other operations, like listing instances, seem to work without RabbitMQ.

I think all necessary connections to external services should be checked during the startup of a service, not only when the external service is first used.

For example, after scheduling the creation of a new instance I can see the error in the logs. I then have to fix nova.conf and restart the nova-api service. But it looks like the scheduled instance is stuck after the restart of nova-api (now with working RabbitMQ credentials): the instance stays in state "scheduling" forever (maybe that's a second bug, not sure about that).

Revision history for this message
Brian Waldon (bcwaldon) wrote :

This could be very frustrating. However, I'm not certain we can check *all* of our connections up front. Additionally, who's to say that if something like glance is down, that we won't start nova-api? Or even if the scheduler is down? Maybe the only connection we should be checking here is the rabbitmq server and nothing else.

Changed in nova:
status: New → Confirmed
importance: Undecided → Wishlist
Revision history for this message
Christian Berendt (berendt) wrote :

I think the service should start even if one component is not working. But it would be nice to get an error message during startup highlighting that an external service was not available.

Also, I think the problem is not so much that an external service is down. The problem is that if a service has a wrong configuration for an external service, I first notice it only when I start the first action that uses that external service, and have to fix it then.

I think this has no high impact on production environments. It's just inconvenient when setting up a testing or development environment not to have a status after service startup that tells me whether everything should be working.

At least the connections to the queueing service and to the database service should be checked. I think both are essential for a working nova service (the database only if the specific nova service uses it). All other services can be done without (big) problems.
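A minimal sketch of what is being asked for here, assuming nothing about nova's actual code: run a set of connectivity checks at startup, log a warning for each one that fails, and start the service anyway. The names `run_startup_checks`, `check_queue` and `check_database` are hypothetical; the two check functions are stand-ins for real connection attempts made with the credentials from nova.conf.

```python
import logging
from typing import Callable, Dict, List

LOG = logging.getLogger("startup_checks")


def run_startup_checks(checks: Dict[str, Callable[[], None]]) -> List[str]:
    """Run every check and return the names of the ones that failed.

    A failing check is logged as a warning; it never aborts startup.
    """
    failed = []
    for name, check in checks.items():
        try:
            check()
        except Exception as exc:
            LOG.warning("external service %r not usable at startup: %s", name, exc)
            failed.append(name)
    return failed


# Hypothetical stand-ins for real connection attempts.
def check_queue() -> None:
    # Simulates a broker that rejects the configured credentials.
    raise ConnectionError("login was refused")


def check_database() -> None:
    # Simulates a database that accepts the connection.
    return None


if __name__ == "__main__":
    logging.basicConfig(level=logging.WARNING)
    failed = run_startup_checks({"queue": check_queue, "database": check_database})
    print("service started; failed checks:", failed)
```

Because the checks only warn, a wrong RabbitMQ password shows up in the log right after startup while the service itself still comes up.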

Revision history for this message
Vish Ishaya (vishvananda) wrote : Re: [Bug 922067] connection to RabbitMQ should be created directly after start of services

The only worker that doesn't create queues on boot is nova-api. The other workers will error out if they start up and can't connect to rabbit. So we could do this by just making nova-api connect to rabbit. It might be nice to implement this by giving all of the services a ping -> pong. Then nova-api can just ping all of the expected components and print a warning for any that it can't find.
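The ping -> pong idea could look roughly like the sketch below. It is only an illustration: `FakeBus` is a hypothetical in-memory stand-in for the message broker, not nova's RPC layer, and `pong_handler` and `ping_components` are made-up names.

```python
import logging
from typing import Callable, Dict, Iterable, List

LOG = logging.getLogger("ping_sketch")


class FakeBus:
    """In-memory stand-in for the broker (hypothetical, for illustration only)."""

    def __init__(self) -> None:
        self._handlers: Dict[str, Callable[[str], str]] = {}

    def register(self, topic: str, handler: Callable[[str], str]) -> None:
        # A service that is up registers a handler on its topic.
        self._handlers[topic] = handler

    def call(self, topic: str, message: str) -> str:
        # A topic without a handler behaves like a service that never replies.
        handler = self._handlers.get(topic)
        if handler is None:
            raise TimeoutError(f"no reply from {topic!r}")
        return handler(message)


def pong_handler(message: str) -> str:
    """Handler every service would expose: answer 'ping' with 'pong'."""
    return "pong" if message == "ping" else "unknown"


def ping_components(bus: FakeBus, expected: Iterable[str]) -> List[str]:
    """Ping each expected component; warn about and return those that are missing."""
    missing = []
    for topic in expected:
        try:
            ok = bus.call(topic, "ping") == "pong"
        except TimeoutError:
            ok = False
        if not ok:
            LOG.warning("component %r did not answer ping", topic)
            missing.append(topic)
    return missing


if __name__ == "__main__":
    logging.basicConfig(level=logging.WARNING)
    bus = FakeBus()
    bus.register("scheduler", pong_handler)
    print("missing components:", ping_components(bus, ["scheduler", "compute"]))
```

In a real deployment the call would also fail if nova-api itself could not reach the broker, which is the case this bug is about.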

On Jan 27, 2012, at 9:40 AM, Christian Berendt wrote:

> I think the service should start even if one component is not working.
> But it would be nice to get an error message during startup
> highlighting that an external service was not available.
>
> Also, I think the problem is not so much that an external service is
> down. The problem is that if a service has a wrong configuration for an
> external service, I first notice it only when I start the first action
> that uses that external service, and have to fix it then.
>
> I think this has no high impact on production environments. It's just
> inconvenient when setting up a testing or development environment not
> to have a status after service startup that tells me whether everything
> should be working.
>
> At least the connections to the queueing service and to the database
> service should be checked. I think both are essential for a working nova
> service (the database only if the specific nova service uses it).
> All other services can be done without (big) problems.

Revision history for this message
Russell Bryant (russellb) wrote :

I think the current behavior isn't that bad. Every other service connects to the message broker on startup because that's when it needs it. nova-api will connect only when needed, since not all requests require it. Nova is going to be completely busted if the broker is not up, so it's pretty hard to miss and should be monitored for anyway.
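To make the difference concrete, here is a small sketch of connect-on-first-use versus connect-at-startup. `LazyConnection` and its `factory` argument are hypothetical and only illustrate the pattern; this is not how nova's RPC code is structured.

```python
from typing import Callable, Optional


class LazyConnection:
    """Opens the underlying connection on first use.

    Calling connect() explicitly at startup turns this into an eager check.
    """

    def __init__(self, factory: Callable[[], object]) -> None:
        self._factory = factory  # hypothetical callable that opens a broker connection
        self._conn: Optional[object] = None

    def connect(self) -> object:
        # Bad credentials would surface here, whenever this is first called.
        if self._conn is None:
            self._conn = self._factory()
        return self._conn

    def publish(self, message: str) -> str:
        conn = self.connect()
        return f"sent {message!r} via {conn}"


if __name__ == "__main__":
    opened = []

    def factory() -> str:
        opened.append("open")
        return "fake-connection"

    lazy = LazyConnection(factory)
    print("connections opened after startup:", len(opened))
    lazy.publish("run_instance")
    print("connections opened after first publish:", len(opened))
```

With the lazy variant a misconfiguration is only seen at the first publish; an explicit `connect()` during startup would surface it earlier, at the cost of contacting the broker even if no request ever needs it.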

It's been over a year since this was filed, so I'm going to move it into the older queue of requests (Opinion / Wishlist) in case someone would like to pick it up later.

Changed in nova:
status: Confirmed → Opinion