[RFE] add API for neutron debug tool "probe"

Bug #1830014 reported by LIU Yulong
Affects: neutron
Status: Won't Fix
Importance: Wishlist
Assigned to: LIU Yulong

Bug Description

Recently, due to this bug:
https://bugs.launchpad.net/neutron/+bug/1821912
we noticed that sometimes the guest OS is not fully up when the test case tries to log in to it. A simple idea is to ping the guest first and only then try to log in. So we hope to find a way for tempest to verify the neutron port link state. The DB resource state is very likely not reliable for this, so we need an independent mechanism to check the VM network status. Because tempest is a "blackbox" test that can run on any host, we cannot use resources that already exist under the current mechanism, such as the qdhcp-namespace or qrouter-namespace, to do such a check.

Hence this RFE. We have the neutron-debug tool, which includes a "probe" resource on the agent side:
https://docs.openstack.org/neutron/latest/cli/neutron-debug.html
We could add an API to neutron and let the proper agent create such a "probe" for us.
On the agent side, it would be a general agent extension that can be enabled in the ovs-agent, L3-agent or DHCP-agent.
Once such a "probe" resource exists on the agent side, you can run any command in it.
This will be useful for neutron CI to check the VM link state.

So a basic workflow would be:
1. neutron tempest creates a router and connects it to one subnet (network-1)
2. neutron tempest creates one VM (VM-1)
3. neutron tempest creates one floating IP and binds it to the VM-1 port
4. create a "probe" for network-1 via the neutron API
5. ping the VM port from the "probe" namespace until it is reachable
6. ssh to the VM via the floating IP
7. continue with the next step of the test
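
Step 5 of this workflow could be sketched roughly as below. This is only an illustration, not code from neutron or neutron-debug: the function names `ping_once` and `wait_until_reachable`, the default timeout and interval, and the namespace name used in the usage note are all assumptions. It also assumes the code runs as root on the host where the probe namespace lives, with iputils `ping` and `ip netns exec` available.

```python
import subprocess
import time


def ping_once(namespace, ip_address):
    """Return True if a single ping sent from inside the namespace succeeds."""
    cmd = ["ip", "netns", "exec", namespace,
           "ping", "-c", "1", "-W", "1", ip_address]
    result = subprocess.run(cmd, stdout=subprocess.DEVNULL,
                            stderr=subprocess.DEVNULL)
    return result.returncode == 0


def wait_until_reachable(namespace, ip_address, timeout=60.0, interval=2.0,
                         ping=ping_once, sleep=time.sleep,
                         clock=time.monotonic):
    """Poll the address until it answers a ping or the timeout expires.

    `ping`, `sleep` and `clock` are injectable so the loop can be exercised
    without a real namespace (hypothetical helper, for illustration only).
    """
    deadline = clock() + timeout
    while True:
        if ping(namespace, ip_address):
            return True
        if clock() >= deadline:
            return False
        sleep(interval)
```

A test could then call something like `wait_until_reachable("qprobe-example", "10.0.0.5")` before attempting ssh; both the namespace name and the address are placeholders. A successful ping only shows that the port passes traffic, not that sshd is ready.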

One more thing: the "neutron-debug" tool is now marked as deprecated:
https://bugs.launchpad.net/neutron/+bug/1583700
But we can keep the "probe" mechanism.

Tags: rfe
Slawek Kaplonski (slaweq) wrote :

1. Such a probe may also be failure-prone. In some operating systems the network is configured (so ping will work) but a lot of time is still needed before ssh is configured properly. In such a case this probe will not help You at all.

2. What do You mean by "general agent extension"? How do You want to use it in various agents? Where do You want to plug in the probe, e.g. with the ovs-agent?

3. Can it be useful for anyone other than neutron CI? If not, maybe we should think about a different solution for this specific problem (failing tempest tests) instead of reviving an old, deprecated tool which in fact hasn't been maintained for a long time - at least I don't remember any patches or tests for this tool :/

LIU Yulong (dragon889) wrote :

Slawek,

We cannot say that because a guest is pingable it can surely be logged in to. There are tons of things that need to be confirmed, like the ssh-key injection, the sshd running state, security-group rules and so on. However, this is a general mechanism for checking the network traffic state, which is not tied to any specific scenario. Neutron upstream CI can use such a mechanism during tests, but it does not ensure that all the resources the test relies on are properly set up.

This can be a virtual machine state detection mechanism, and it can be used in other scenarios, not just CI. If users or operators want to confirm a VM's traffic state, such an API will significantly improve efficiency.

As for the "general agent extension", I am just not quite sure how to implement it yet, but maybe it is a general base class.

Thanks.

Miguel Lavalle (minsel)
tags: added: rfe
LIU Yulong (dragon889)
Changed in neutron:
status: New → Opinion
Miguel Lavalle (minsel) wrote :

Where would that probe namespace live?
How would it be connected to the correct network?

LIU Yulong (dragon889) wrote :

The probe namespace can live on any host that has an L2 agent; a port from this network is then needed to create the required device. The L2-agent host can be chosen in two ways, or both:
1. let the admin user choose one;
2. add a random scheduler for this.
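
As a rough illustration of these two options, host selection might look like the sketch below. Nothing here is existing neutron code: `choose_probe_host` is a hypothetical helper, and the agent records are assumed to be plain dicts with `host` and `alive` keys.

```python
import random


def choose_probe_host(agents, requested_host=None, rng=random):
    """Pick the host that should carry the probe namespace.

    `agents` is assumed to be a list of dicts with "host" and "alive" keys.
    If the admin passes `requested_host`, it is validated and used (option 1);
    otherwise a live host is picked at random (option 2).
    """
    candidates = [a["host"] for a in agents if a.get("alive")]
    if not candidates:
        raise LookupError("no live L2 agent available for the probe")
    if requested_host is not None:
        if requested_host not in candidates:
            raise ValueError(f"host {requested_host!r} has no live L2 agent")
        return requested_host
    return rng.choice(candidates)
```

Passing a seeded `random.Random` instance as `rng` keeps the random choice reproducible in tests.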

OpenStack Infra (hudson-openstack) wrote : Related fix proposed to neutron-specs (master)

Related fix proposed to branch: master
Review: https://review.opendev.org/662541

LIU Yulong (dragon889)
Changed in neutron:
assignee: nobody → LIU Yulong (dragon889)
status: Opinion → New
Slawek Kaplonski (slaweq) wrote :

Liu, are You still planning to work on this?

Slawek Kaplonski (slaweq) wrote :

As discussed during the PTG, let's review the spec and continue the discussion here until it is triaged and ready for the drivers meeting.

Slawek Kaplonski (slaweq) wrote :

Let's finally discuss it in the next drivers meeting.

tags: added: rfe-triaged
removed: rfe
Slawek Kaplonski (slaweq) wrote :

We discussed this RFE again in the drivers meeting today: https://meetings.opendev.org/meetings/neutron_drivers/2021/neutron_drivers.2021-09-10-14.05.log.html#l-18
As the use case originally given in the spec (debugging CI) isn't really an issue anymore, and as we see that implementing the proposal would be pretty complex, we decided to decline this RFE.

Changed in neutron:
status: New → Won't Fix
status: Won't Fix → Confirmed
importance: Undecided → Wishlist
tags: added: rfe
removed: rfe-triaged
Changed in neutron:
status: Confirmed → Won't Fix
OpenStack Infra (hudson-openstack) wrote : Change abandoned on neutron-specs (master)

Change abandoned by "liuyulong <email address hidden>" on branch: master
Review: https://review.opendev.org/c/openstack/neutron-specs/+/662541
Reason: Restore if someday we want this.
