Old chassis must be manually removed from OVN southbound DB

Bug #1535716 reported by Na Zhu
This bug affects 1 person
Affects         Status     Importance  Assigned to  Milestone
networking-ovn  Confirmed  Wishlist    Unassigned

Bug Description

I followed the instructions at http://docs.openstack.org/developer/networking-ovn/testing.html to install OVN with devstack.
First I installed a single node (9.70.33.149, hostname ovn1) successfully, and I can see this node with the command "ovn-sbctl show".

Then I added another compute node (9.70.33.148, hostname ovn2), which I could also see with "ovn-sbctl show". However, nova-compute would not run on this node, so I removed it and added another compute node with the same IP, 9.70.33.148.

As a result, my setup now has two chassis with IP 9.70.33.148 in the OVN DB, but only one ovn2 in the nova DB.
Why isn't the old compute node deleted by a timeout mechanism once it disappears?
steve@ovn1:~/devstack$ ovn-sbctl show
Chassis "a19a83dd-dd5a-4d48-a633-5d41b51e77ee"
    Encap geneve
        ip: "9.70.33.149"
Chassis "22af05a5-a534-4774-8e73-a1b49559a03e"
    Encap geneve
        ip: "9.70.33.148"
Chassis "5c1a106e-0ac2-4e4f-8386-5f6c0eab0921"
    Encap geneve
        ip: "9.70.33.148"
steve@ovn1:~/devstack$ nova hypervisor-list
+----+---------------------+-------+---------+
| ID | Hypervisor hostname | State | Status  |
+----+---------------------+-------+---------+
| 3  | ovn1                | up    | enabled |
| 4  | ovn2                | up    | enabled |
+----+---------------------+-------+---------+
steve@ovn1:~/devstack$ ovs-vsctl show
082bec78-5e96-4587-97bf-af6a2a3e8ee7
    Bridge br-int
        fail_mode: secure
        Port "tap6f9f462a-56"
            Interface "tap6f9f462a-56"
                type: internal
        Port "ovn-22af05-0"
            Interface "ovn-22af05-0"
                type: geneve
                options: {key=flow, remote_ip="9.70.33.148"}
        Port "ovn-5c1a10-0"
            Interface "ovn-5c1a10-0"
                type: geneve
                options: {key=flow, remote_ip="9.70.33.148"}
        Port br-int
            Interface br-int
                type: internal

Tags: ovn-upstream
Kyle Mestery (mestery)
Changed in networking-ovn:
importance: Undecided → Low
Russell Bryant (russellb) wrote :

It's up to the administrator to delete an old chassis from the db right now. You can do it with ovn-sbctl.

    # ovn-sbctl chassis-del CHASSIS

where CHASSIS is either the chassis name or UUID.
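
Before deleting anything, it can help to confirm which chassis share an encap IP. The following is only a minimal Python sketch, assuming the "ovn-sbctl show" text layout seen in this report; parse_chassis and duplicate_ips are hypothetical helper names, not part of OVN or networking-ovn. The output alone does not say which of the duplicates is the stale one, so the sketch only lists candidates and leaves the choice to the administrator.

```python
from collections import defaultdict
from typing import Dict, List

# Sample text copied from the `ovn-sbctl show` output in this report.
SAMPLE = '''Chassis "a19a83dd-dd5a-4d48-a633-5d41b51e77ee"
    Encap geneve
        ip: "9.70.33.149"
Chassis "22af05a5-a534-4774-8e73-a1b49559a03e"
    Encap geneve
        ip: "9.70.33.148"
Chassis "5c1a106e-0ac2-4e4f-8386-5f6c0eab0921"
    Encap geneve
        ip: "9.70.33.148"
'''


def parse_chassis(show_output: str) -> Dict[str, List[str]]:
    """Map each chassis name to the encap IPs listed under it."""
    chassis: Dict[str, List[str]] = {}
    current = None
    for line in show_output.splitlines():
        stripped = line.strip()
        if stripped.startswith("Chassis "):
            # Line looks like: Chassis "<name>"
            current = stripped.split(" ", 1)[1].strip('"')
            chassis[current] = []
        elif stripped.startswith("ip:") and current is not None:
            # Line looks like: ip: "<address>"; split once so IPv6 survives.
            chassis[current].append(stripped.split(":", 1)[1].strip().strip('"'))
    return chassis


def duplicate_ips(chassis: Dict[str, List[str]]) -> Dict[str, List[str]]:
    """Return only the IPs that are claimed by more than one chassis."""
    by_ip: Dict[str, List[str]] = defaultdict(list)
    for name, ips in chassis.items():
        for ip in ips:
            by_ip[ip].append(name)
    return {ip: names for ip, names in by_ip.items() if len(names) > 1}


for ip, names in duplicate_ips(parse_chassis(SAMPLE)).items():
    print(ip, "is claimed by:", ", ".join(names))
```

In practice the text would come from running "ovn-sbctl show" on the node that hosts the southbound DB rather than from a hard-coded sample.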

Changed in networking-ovn:
status: New → Invalid
Na Zhu (nazhu) wrote :

I do not agree with setting this bug to Invalid.

The chassis cannot be deleted with the command "ovn-sbctl chassis-del CHASSIS".

steve@ovn1:~/devstack$ sudo ovn-sbctl show
Chassis "a19a83dd-dd5a-4d48-a633-5d41b51e77ee"
    Encap geneve
        ip: "9.70.33.149"
Chassis "22af05a5-a534-4774-8e73-a1b49559a03e"
    Encap geneve
        ip: "9.70.33.148"
Chassis "5c1a106e-0ac2-4e4f-8386-5f6c0eab0921"
    Encap geneve
        ip: "9.70.33.148"
steve@ovn1:~/devstack$
The chassis 5c1a106e-0ac2-4e4f-8386-5f6c0eab0921 is invalid, but it can not be deleted.

steve@ovn1:~/devstack$ sudo ovn-sbctl chassis-del 5c1a106e-0ac2-4e4f-8386-5f6c0eab0921
steve@ovn1:~/devstack$ sudo ovn-sbctl show
Chassis "5c1a106e-0ac2-4e4f-8386-5f6c0eab0921"
    Encap geneve
        ip: "9.70.33.148"
    Port_Binding "214d89bf-58da-43fb-a83b-5900e19f7c3c"
Chassis "a19a83dd-dd5a-4d48-a633-5d41b51e77ee"
    Encap geneve
        ip: "9.70.33.149"
    Encap geneve
        ip: "9.70.33.148"

Changed in networking-ovn:
status: Invalid → New
Han Zhou (zhouhan) wrote :

@Na Zhu, how did you remove the compute node? By "remove", do you mean shut down or destroy?

Your latest comment looks odd: deleting a chassis resulted in a Port_Binding "214d89bf-58da-43fb-a83b-5900e19f7c3c" appearing in the chassis? And chassis "22af05a5-a534-4774-8e73-a1b49559a03e" is gone?

Changed in networking-ovn:
status: New → Incomplete
Na Zhu (nazhu) wrote :

@Han Zhou, I just shut down the compute node.

I think I made a mistake: I deleted the new chassis, not the old one.
I agree that a chassis can be deleted with the command "ovn-sbctl chassis-del". My question is why the old chassis cannot be deleted automatically.

Han Zhou (zhouhan) wrote :

@Na Zhu, currently OVN is not designed/implemented to have this kind of timeout mechanism.

It seems to be a valid request. At the least, it would be useful to be able to tell from OVN's point of view whether a Chassis in the sb ovsdb is connected or not. This would require a service in OVN that can detect the connection state and update some status in the sb ovsdb. One possible way may be to monitor the ovsdb connection status. However, this is not something to be supported in the networking-ovn plugin. We will discuss it in the OVN community.
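
To make the timeout idea concrete, here is a rough Python sketch of the bookkeeping such a service might do. Everything in it is an assumption for illustration: ChassisLivenessTracker, heartbeat and stale are hypothetical names, and OVN does not provide this mechanism as far as this thread states. It records when each chassis was last seen and reports those that have been silent longer than a timeout; it does not talk to ovsdb at all.

```python
import time
from typing import Dict, List, Optional


class ChassisLivenessTracker:
    """Hypothetical tracker: remembers when each chassis was last seen."""

    def __init__(self, timeout_s: float) -> None:
        self.timeout_s = timeout_s
        self._last_seen: Dict[str, float] = {}

    def heartbeat(self, chassis: str, now: Optional[float] = None) -> None:
        # Record a sign of life, e.g. an active ovsdb connection (assumed).
        self._last_seen[chassis] = time.monotonic() if now is None else now

    def stale(self, now: Optional[float] = None) -> List[str]:
        # A chassis is stale if it has been silent for longer than timeout_s.
        current = time.monotonic() if now is None else now
        return sorted(
            name
            for name, seen in self._last_seen.items()
            if current - seen > self.timeout_s
        )


# Usage with explicit timestamps so the behaviour is easy to follow.
tracker = ChassisLivenessTracker(timeout_s=30.0)
tracker.heartbeat("22af05a5-a534-4774-8e73-a1b49559a03e", now=0.0)
tracker.heartbeat("5c1a106e-0ac2-4e4f-8386-5f6c0eab0921", now=50.0)
print(tracker.stale(now=60.0))
```

Whether a stale chassis should then be flagged or removed automatically is a policy decision that this sketch deliberately leaves open.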

Changed in networking-ovn:
importance: Low → Wishlist
tags: added: ovn-upstream
summary: - The new chassis can not replace old chassis
+ Old chassis must be manually removed from OVN southbound DB
Changed in networking-ovn:
status: Incomplete → Confirmed
xiao chuanyun (wanghuiheze) wrote :

Has the question finally been decided?
