Allow metadata agent to make calls to more than one nova_metadata_ip

Bug #1620279 reported by Slawek Kaplonski
This bug affects 1 person
Affects: neutron
Status: Won't Fix
Importance: Undecided
Assigned to: Unassigned

Bug Description

Currently the metadata agent config has an option to set the IP address of the nova metadata service (nova_metadata_ip).
There can be more than one nova-api service in a cluster. In that case, if the configured nova metadata IP returns e.g. error 500, that error is passed back to the instance, even though all the other nova-api services may be working fine and a call to another one would return proper metadata.

So the proposal is to change the nova_metadata_ip string option to a list of IP addresses and to change the metadata agent so that it tries one of the configured Nova services. If the response from that Nova service is not 200, the agent tries the next Nova service. If all Nova services fail, it returns the lowest error code it got from Nova (for example, if nova-api-1 returned 500 and nova-api-2 returned 404, the agent returns 404 to the VM).
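
A minimal sketch of the proposed fallback (illustrative only: the nova_metadata_ips option name, the proxy_request helper and the use of the requests library are assumptions, not actual agent code):

    # Rough sketch of the proposed behaviour; not real neutron code.
    import requests

    def proxy_request(path, headers, nova_metadata_ips, port=8775):
        worst_resp = None
        for ip in nova_metadata_ips:
            resp = requests.get('http://%s:%d%s' % (ip, port, path),
                                headers=headers)
            if resp.status_code == 200:
                return resp  # first healthy nova-api wins
            # keep the lowest error code seen so far (e.g. 404 over 500)
            if worst_resp is None or resp.status_code < worst_resp.status_code:
                worst_resp = resp
        return worst_resp  # every configured nova-api failed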

Tags: rfe
Revision history for this message
Miguel Angel Ajo (mangelajo) wrote :

This makes sense; if we also round-robin a bit, we enhance scalability as well.

Changed in neutron:
assignee: nobody → Slawek Kaplonski (slaweq)
Revision history for this message
Jakub Libosvar (libosvar) wrote :

@Miguel: I think that kind of scalability should be handled at the nova-api level by introducing load balancers or multiple handlers, not on the client side. :)

I'll just add here some thoughts:
 - typically, when there are multiple nova-apis, they are hidden behind a VIP
 - an LB using round-robin will try different nova-apis
 - guests retry calling the metadata API if they get an unsuccessful response multiple times
 - IIUC the aim of this RFE is to hide errors from the client, which in this case is the guest instance

Based on the above, would introducing a retry mechanism in the metadata agent have the same effect, while not requiring any configuration changes in our precious installers?

Revision history for this message
Miguel Angel Ajo (mangelajo) wrote :

A retry + load balancer in front of the nova-api would achieve exactly the same thing, but we would still need to add one or more options (retry count? incremental back-offs? etc.), which of course could be defaulted :)
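
For comparison, a minimal retry-with-back-off sketch against a single VIP (the retry_count and retry_backoff option names are hypothetical, just to show what would need to be configurable and could be defaulted):

    import time
    import requests

    def call_metadata(url, headers, retry_count=3, retry_backoff=1.0):
        resp = None
        for attempt in range(retry_count):
            resp = requests.get(url, headers=headers)
            if resp.status_code < 500:
                return resp  # success, or a client error worth passing on
            time.sleep(retry_backoff * (attempt + 1))  # incremental back-off
        return resp  # still failing after all retries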

John Schwarz (jschwarz)
Changed in neutron:
importance: Undecided → Wishlist
status: New → Confirmed
John Schwarz (jschwarz)
Changed in neutron:
status: Confirmed → New
importance: Wishlist → Undecided
Revision history for this message
Gary Kotton (garyk) wrote :

I do not think that this is a bug. The nova-api IP configured there can be the IP address of a VIP in front of the nova-apis.
We do not need to reinvent the wheel here.

Changed in neutron:
status: New → Won't Fix
Revision history for this message
Slawek Kaplonski (slaweq) wrote :

@Gary: I know that e.g. haproxy can do round robin between several nova-api services, but imagine a case where one nova-api returns error 500 for some reason (not always, but only for specific requests). In that case the end user will see 500 errors, for example in the cloud-init logs while booting a VM. Such errors could be hidden from the end user in the instance logs.
For private clouds and internal users it is maybe not a big problem, but for public cloud providers it is, because it is not good if the first thing a customer sees is error 500 (even if this error has no real impact on their service).

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to neutron (master)

Fix proposed to branch: master
Review: https://review.openstack.org/370727

Changed in neutron:
status: Won't Fix → In Progress
Revision history for this message
Brian Haley (brian-haley) wrote :

I would agree with Gary.

If a deployment is already large enough to have multiple nova-apis behind a VIP for all the public OpenStack API endpoints, they are going to just add one for port 8775 as well. And if one is generating 500 errors, the proxy will take it out of rotation. That's what we did in our public cloud.
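
For illustration only, a haproxy listener for the metadata port with health checking could look roughly like this (all names, addresses and the check path are made up, not taken from any real deployment):

    # illustrative haproxy backend for nova metadata on port 8775
    listen nova_metadata
        bind 192.0.2.10:8775
        balance roundrobin
        option httpchk GET /
        server nova-api-1 192.0.2.11:8775 check
        server nova-api-2 192.0.2.12:8775 check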

Changed in neutron:
status: In Progress → Won't Fix
assignee: Slawek Kaplonski (slaweq) → nobody
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Change abandoned on neutron (master)

Change abandoned by Slawek Kaplonski (<email address hidden>) on branch: master
Review: https://review.openstack.org/370727
Reason: as there are -2 votes, I think it is not worth continuing it
