[RFE] Extend neutron-metadata-agent to support proxying multiple external services

Bug #1905115 reported by yangjianfeng on 2020-11-21
This bug affects 1 person
Affects: neutron
Importance: Wishlist
Assigned to: Unassigned

Bug Description

Currently, the neutron-metadata-agent only supports proxying the nova-metadata-api so that internal server instances can get metadata information. This RFE proposes an extension to neutron-metadata-agent so that, through the metadata channel, internal VMs can access more external services.

The use cases:

1. Unified monitoring platform for multiple tenants:
Suppose that a private cloud provider has multiple departments and each department has a tenant on the cloud platform. The cloud admin wishes to monitor all tenants' VMs. Currently neutron needs a router to achieve this, and the implementation is difficult and inflexible. By extending the neutron metadata channel, the monitoring agent inside a tenant's internal VM can easily push monitoring data to the external unified monitoring platform.

2. Shared license pool:
Running commercial software in VMs is a common demand for users. To resolve the authorization issue, commercial software providers will provide a shared license pool to the cloud provider. In a multi-tenant environment, because of CIDR overlap, the VMs must go through a router with an external gateway to access the shared license pool. This limits some use scenarios, such as a purely internal network environment that doesn't need the router service plugin. So, extending the neutron metadata channel is an alternative solution.

My initial idea about how to implement it:
Support two types of proxies: L7 (layer 7) and L4 (layer 4). Through the neutron configuration file, the user tells neutron which proxies to provide and of which type.
For L7 proxy:
The implementation can follow the current solution that proxies the nova metadata service. Through the neutron configuration file, the user can configure which headers to insert into requests for a specific proxy.
For L4 proxy:
For TCP proxying, we can use HAProxy to implement it. For UDP proxying, I have no idea how to implement it yet.
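
To make the configuration idea above more concrete, here is a minimal sketch of how per-channel proxy settings might be described and loaded. Everything in it is an assumption for illustration: the `channel:<name>` section naming, the option names (`type`, `listen_port`, `upstream`, `add_headers`), the header names and the example.org hosts are hypothetical and not part of neutron, and the standard-library `configparser` stands in for neutron's real configuration handling.

```python
import configparser
from dataclasses import dataclass, field

# Hypothetical configuration text; section and option names are invented here.
SAMPLE = """
[channel:license]
type = l7
listen_port = 8081
upstream = http://license.example.org:8080
add_headers = X-Tenant-ID, X-Network-ID

[channel:monitoring]
type = l4
listen_port = 8082
upstream = monitor.example.org:9000
"""


@dataclass
class Channel:
    name: str
    type: str          # "l4" or "l7"
    listen_port: int   # port the proxy would listen on
    upstream: str      # external service behind the proxy
    add_headers: list = field(default_factory=list)  # only meaningful for L7


def load_channels(text):
    """Parse every [channel:<name>] section into a Channel object."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    channels = []
    for section in parser.sections():
        if not section.startswith("channel:"):
            continue
        opts = parser[section]
        ctype = opts.get("type", "l7").lower()
        if ctype not in ("l4", "l7"):
            raise ValueError(f"{section}: unsupported proxy type {ctype!r}")
        headers = [h.strip() for h in opts.get("add_headers", "").split(",")
                   if h.strip()]
        channels.append(Channel(
            name=section.split(":", 1)[1],
            type=ctype,
            listen_port=int(opts["listen_port"]),
            upstream=opts["upstream"],
            add_headers=headers,
        ))
    return channels
```

Calling `load_channels(SAMPLE)` yields one L7 channel with two headers to insert and one L4 channel with none; how UDP would be handled is left open, as in the proposal.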

In addition, we need to add a REST API to query which proxies the current neutron environment provides for users, and extend the network API so that the user can configure which proxies a specified network supports.

tags: added: rfe
Changed in neutron:
importance: Undecided → Wishlist
Slawek Kaplonski (slaweq) wrote :

Hi,

Is my understanding of this proposal correct that You want to propose the possibility to serve various services on various endpoints behind the 169.254.169.254 address, e.g.:

- http://169.254.169.254/windows-license/ - to provide some Windows licenses (I have no idea if exactly such a use case makes any sense, it's just an idea)

And for this monitoring, do You want to allow instances e.g. to do POST requests to http://169.254.169.254/ so that an instance would be able to send some metrics data to some external system?

yangjianfeng (yangjianfeng) wrote :

Hi, Slawek Kaplonski
Yep, this is roughly in line with my idea. Just a little bit different: I expect each proxy channel to occupy its own port. So, the request URLs from the VMs would look like this:
For the shared license pool:
- http://169.254.169.254:8081/windows-license/

For monitoring:
- http://169.254.169.254:8082/data/
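
The one-port-per-channel scheme above can be sketched as a small lookup from channel name to port. The channel names and the `channel_url` helper are hypothetical; only the ports and paths come from the example URLs in this comment.

```python
METADATA_IP = "169.254.169.254"

# Hypothetical channel-to-port table matching the example URLs above.
CHANNEL_PORTS = {"license": 8081, "monitoring": 8082}


def channel_url(name, path="/"):
    """Build the URL a VM would use to reach one proxy channel."""
    port = CHANNEL_PORTS[name]  # KeyError for a channel that is not defined
    return f"http://{METADATA_IP}:{port}/{path.lstrip('/')}"
```

For instance, `channel_url("license", "windows-license/")` gives the license URL listed above.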

Slawek Kaplonski (slaweq) wrote :

Thx for the additional info. Let's discuss it at the next Neutron drivers meeting, which will be on Friday 04.12.2020: http://eavesdrop.openstack.org/#Neutron_drivers_Meeting - so it would be great if You could join in case there are any additional questions.
But the RFE should be discussed even if You are not able to attend this meeting.

tags: added: rfe-triaged
removed: rfe
Slawek Kaplonski (slaweq) wrote :

We discussed this at the last drivers meeting: http://eavesdrop.openstack.org/meetings/neutron_drivers/2020/neutron_drivers.2020-12-04-14.01.log.html#l-20

In the end, our understanding is that what You are describing can be done e.g. by using one VM which is connected to Your isolated network, has external connectivity, and provides the proxy services You want. Would that work for You? If not, please share with us some more details on why it wouldn't be a solution for You and why You really need to solve that at the service level instead of addressing it at the application architecture level.
We can then get back to the discussion about this RFE.

tags: added: rfe-postponed
removed: rfe-triaged
Brian Haley (brian-haley) wrote :

And just to add info from the Octavia perspective, Michael Johnson mentioned in the neutron channel:

"Yeah, you can do that license proxy thing with Octavia today. Put the VIP on the isolated tenant network, then add members to the LB that point to the license server(s), etc."

This would be one step above just having a VM on that isolated network.

Michael Johnson (johnsom) wrote :

Yeah, everything you want to accomplish with this RFE can be done today by adding an Octavia loadbalancer VIP to your tenant network(s) and adding members to the load balancer that point to your licensing and monitoring endpoints. It also already supports both TCP and UDP protocols.

yangjianfeng (yangjianfeng) wrote :

Hi Slawek, Brian and Michael

Thanks for your suggestions. But IMHO, these solutions are ponderous and heavyweight, whether creating an extra VM to connect to the external network or creating a load balancer to do that. There are at least two reasons why I wouldn't use these solutions in our cloud product:
1. If there are one thousand isolated networks in our cloud platform, then one thousand extra VMs or load balancers need to be created. Without doubt, the cost is huge. In our marketing, "low cost" is the top priority. In contrast, the cost of the solution proposed in this RFE is almost negligible.

2. The complexity of these solutions may drive our users away from our cloud platform. Suppose a user creates an isolated network, creates a VM with a business system installed (not only Windows) on this network, and connects to the VM with a VNC or SPICE client. If the user wants to activate the system, he must refer to our manual and go through a lot of steps, such as: Step 1, create a VM or LB; Step 2, connect it to the external network; Step 3, and so on. I can already imagine the user cursing our system.

Additionally, IMO, we shouldn't pay attention only to the technology aspect, but also to the user aspect. If we provide more convenient tools to users, more and more users will like our product. They will say "wow" about our system.

BTW, I read the meeting log. I want to say that the list of external services should be configured in the neutron-metadata-agent configuration file, and the neutron-server configuration file can specify which external service channels are enabled. Furthermore, the network should be extended with a parameter like "external_service_channels"; updating that parameter controls which "channels" the network opens.

Brian Haley (brian-haley) wrote :

I would not agree with "cost almost negligible".

- Complexity added to metadata agent
  - bugs will be introduced
- Load on message bus
  - will this affect how quickly we can provision?
- Support of new technology
  - haproxy today, but haproxy2 is already out
- New API calls for users
  - Not a zero cost to learn how to program this new feature

The point I'm trying to make is we have ways to do what you want today, so you need to give a compelling reason that this is better.

And to respond to one of your points.

- "I want to say that the list of external services should be configured in the neutron-metadata-agent configuration file, and the neutron-server configuration file can specify which external service channels are enabled."

I don't think any new feature should only be configurable via the config file; that isn't scalable at all, since for every change the service has to be restarted, which causes control (and possibly data plane) outages.

yangjianfeng (yangjianfeng) wrote :

Hi, Brian
Maybe I didn't describe it clearly. The "cost" I mentioned is different from yours. You mean that the feature proposed in this RFE will add technology "cost", but I mean that it will decrease the commercial "cost". To elaborate, it will use much less CPU and memory than the existing schemes and reduce the number of steps the end user has to take. I agree with you that almost every new feature adds technology "cost", but if it decreases the commercial "cost" for our end users, I think it is worth it. IMHO, the technology "cost" is one-off, and we can reduce it as far as possible with better user docs and more careful code review. But the reduction of commercial "cost" can be replicated across multiple environments. This will make our product more attractive.

Moreover, about the implementation scheme of this RFE (this is my immature idea; if this RFE is approved I will submit a spec to describe it in detail):
I think the "external services" should be planned before the environment is deployed. They may change in the future, but I think that won't be frequent. So I think configuring these "external services" in the configuration file is enough.
But the "external_service_channels" parameter should be added to the network. By updating it, the end user can control which channels are open on their network. This can reduce potential security vulnerabilities.
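
As a rough sketch of the layering described above (channels defined in the agent configuration, enabled in the neutron-server configuration, then opened per network through "external_service_channels"), the check might look like the following. The function name and its arguments are hypothetical; no such API exists in neutron.

```python
def channels_to_open(agent_channels, server_enabled, network_channels):
    """Return the channels to open on one network.

    agent_channels:   names defined in the agent configuration (assumed)
    server_enabled:   names enabled in the neutron-server configuration (assumed)
    network_channels: value of the network's "external_service_channels"
    """
    # A channel is available only if it is both defined and enabled.
    available = set(agent_channels) & set(server_enabled)
    unknown = sorted(set(network_channels) - available)
    if unknown:
        raise ValueError("channels not available: " + ", ".join(unknown))
    # dict.fromkeys drops duplicates while keeping the requested order.
    return list(dict.fromkeys(network_channels))
```

A network that requests nothing gets an empty list, so no channel is open unless the user explicitly asks for it, which is the security property mentioned above.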

In addition, another possible user scenario:
Mounting a shared storage system over NFS (like manila). With the new feature, many of the network limitations can be overcome. (This is just my brainstorm; it's not verified.)

Slawek Kaplonski (slaweq) wrote :

Hi,

TBH I agree with Brian here. I don't think we should add things like that to the neutron-metadata-agent. Maybe You should think about a separate project which would provide a service plugin (for the new API) and a new agent to do what You need here. I think that would be better than adding various things unrelated to metadata to the metadata agent.

yangjianfeng (yangjianfeng) wrote :

Hi, Slawek:
Yes, I agree with you, this feature is more appropriate to implement in a new agent. We can add a new agent with a name like "neutron-external-channel-agent". This agent could replace "neutron-metadata-agent" in the future. But I don't know whether the neutron team is willing to maintain a new agent. If so, I will do my best to implement it.

yangjianfeng (yangjianfeng) wrote :

Hi
Recently, I thought of this RFE when a colleague consulted me about a network issue with ironic deployment. Ironic's workflow has an inspection step, which is mainly carried out by two services: ironic-inspector and ironic-python-agent (IPA). The IPA needs to post data to ironic-inspector. But ironic-inspector listens on the controller node's management network, while the IPA works on ironic's inspection network (a neutron provider network of type vlan). So, in short, my colleague requires that an OpenStack tenant network can access the OpenStack management network. I immediately thought of the neutron-metadata-agent, and then of this RFE. I mean that with this RFE we can simplify the network topology when deploying OpenStack ironic.
By the way, the deployment of trove and octavia has similar network issues. For trove, the database instance needs to post log information to swift.

[1] https://docs.openstack.org/ironic-python-agent/latest/admin/how_it_works.html
[2] https://docs.openstack.org/ironic-inspector/latest/user/workflow.html

Slawek Kaplonski (slaweq) wrote :

Thx yangjianfeng for Your comment, but to me this use case sounds more like "some-new-agent" rather than a new feature for the metadata agent.
I'm not sure we should extend the metadata agent's functionality in such a way that it does much more than providing metadata to the instances.
And such a new agent can be done as a separate project, so it will not "collide" with Neutron :)
