Remove-ContainerNetwork hangs after reboot

Bug #1805124 reported by Dariusz Sosnowski
This bug affects 1 person
| Affects | Status | Importance | Assigned to | Milestone |
|---|---|---|---|---|
| Juniper Openstack (status tracked in Trunk): Trunk | Fix Committed | Undecided | Dariusz Sosnowski | |
| OpenContrail | Fix Committed | Undecided | Dariusz Sosnowski | |

Bug Description

If a container is running when the compute node is rebooted, HNS is left in a broken state and container networks cannot be removed.

The bug was found during vRouter development.

Steps to reproduce:

1. Create a network named `network1` in the Contrail controller.
2. Deploy a Contrail Windows compute node.
3. Disable automatic startup of Contrail Windows services.
4. Create a Docker network named `network1`, backed by the Contrail network:

        docker network create -d Contrail --ipam-driver windows --opt tenant=admin --opt network=network1 network1

5. Start a container in this network:

        docker run -id --network network1 microsoft/windowsservercore

6. Stop the compute node:

        Stop-Computer

7. Power on the compute node.
8. Start Docker and clean up containers.
9. Stop Docker.
10. Attempt to remove the container networks:

        Get-ContainerNetwork | Remove-ContainerNetwork

What should happen:

1. Remove-ContainerNetwork should finish and Get-ContainerNetwork should report that there are no networks in HNS.

What happens:

1. Remove-ContainerNetwork hangs and cannot be interrupted.
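
Since the hung cmdlet cannot be interrupted from the console, an automated reproduction is easier to handle when the cleanup is launched as a child process with a timeout. The following is a minimal Python sketch, not part of the original report. The helper name `finishes_within` and the `CLEANUP_CMD` invocation through `powershell` are assumptions, and killing the child is best effort only: a process blocked inside a kernel call may not terminate.

```python
import subprocess

# Assumption: PowerShell is on PATH and the cleanup is run non-interactively.
CLEANUP_CMD = [
    "powershell", "-NoProfile", "-Command",
    "Get-ContainerNetwork | Remove-ContainerNetwork",
]


def finishes_within(cmd, timeout_s):
    """Run cmd and return True if it exits within timeout_s seconds.

    Returns False if the timeout expires, which is treated as a hang.
    """
    proc = subprocess.Popen(
        cmd, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL
    )
    try:
        proc.wait(timeout=timeout_s)
        return True
    except subprocess.TimeoutExpired:
        # Best effort: do not wait for the child again, since a process
        # stuck in the kernel may never exit.
        proc.kill()
        return False
```

On an affected node, `finishes_within(CLEANUP_CMD, timeout_s=120)` would be expected to return `False`; the 120-second limit is an arbitrary illustrative value.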

Other symptoms of the bug:

- after the reboot, `Get-VMSwitch | Get-VMSwitchExtension -Name "vRouter*"` reports that the extension is enabled but not running:

        Id : 56553588-1538-4BE6-B8E0-CB46402DC205
        Name : vRouter forwarding extension
        Vendor : OpenContrail team
        Version : 23.29.23.449
        ExtensionType : Forwarding
        ParentExtensionId :
        ParentExtensionName :
        SwitchId : 464d0af9-1a8b-4a81-a981-67c281e87ef0
        SwitchName : Layered Ethernet1
        Enabled : True
        Running : False
        CimSession : CimSession: .
        ComputerName : DS-TB1
        IsDeleted : False

- the kernel debugger reports that the VMSwitch protocol drivers failed to bind to the adapter (shown by the `!vswitch` WinDbg command):

        VSWITCH ffffba815fa48000
            Name: 714626e9-a78f-41f8-9dbe-77a15dc91962
            Friendly Name: Layered Ethernet1
            RefCount: 2 Show references

            Extensibility
            ProtocolState: 4 (VmsExtPtStateBindAdapterFailed)
            MiniportState: 2 (VmsExtMpStateHalted)
            DeviceId: {DA463AF6-AC4A-4EB9-97F5-DAEF147504E3}
            Protocol Open: ffffba8166df34f0 !ndiskd.mopen ffffba8166df34f0
            IovPreferred: 0
            SwitchEmbeddedTeaming: FALSE

            vPort List: Count (1)
            Type Address Id Type IsValidation MonitorMode FriendlyName
            VPORT ffffba815fa45000 0x0000 Default N None Layered Ethernet1
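
The first symptom can be checked mechanically. The following is a minimal Python sketch, not part of the original report, that parses list-formatted `Get-VMSwitchExtension` output and flags extensions that are enabled but not running. The function names `parse_list_output` and `find_stalled_extensions` are hypothetical, and the parser assumes the `Key : Value` layout shown in this report, with blank lines between records.

```python
def parse_list_output(text):
    """Parse PowerShell list-style output ('Key : Value' per line) into dicts.

    Records are assumed to be separated by blank lines.
    """
    records, current = [], {}
    for line in text.splitlines():
        if not line.strip():
            # A blank line closes the current record, if any.
            if current:
                records.append(current)
                current = {}
            continue
        # Split on the first colon only, so values may contain colons.
        key, sep, value = line.partition(":")
        if sep:
            current[key.strip()] = value.strip()
    if current:
        records.append(current)
    return records


def find_stalled_extensions(text):
    """Return names of extensions with Enabled=True but Running=False."""
    return [
        rec.get("Name", "<unnamed>")
        for rec in parse_list_output(text)
        if rec.get("Enabled") == "True" and rec.get("Running") == "False"
    ]


# Abbreviated copy of the output quoted in this report.
SAMPLE = """\
Name : vRouter forwarding extension
SwitchName : Layered Ethernet1
Enabled : True
Running : False
"""

if __name__ == "__main__":
    print(find_stalled_extensions(SAMPLE))
```

An empty result would mean that no extension in the captured output shows the enabled-but-not-running state.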

Tags: windows
OpenContrail Admin (ci-admin-f) wrote : [Review update] master

Review in progress for https://review.opencontrail.org/47862
Submitter: Dariusz Sosnowski (<email address hidden>)

OpenContrail Admin (ci-admin-f) wrote : A change has been merged

Reviewed: https://review.opencontrail.org/47862
Committed: http://github.com/Juniper/contrail-vrouter/commit/0be1b0a79ff9773c6f07cd6df8fdee5fb44c326a
Submitter: Zuul v3 CI (<email address hidden>)
Branch: master

commit 0be1b0a79ff9773c6f07cd6df8fdee5fb44c326a
Author: Dariusz Sosnowski <email address hidden>
Date: Tue Nov 27 13:03:28 2018 +0100

Fix vRouter Extension failing to initialize after reboot

When a container is running on Contrail Windows compute node and
the node is rebooted, extension fails to initialize. As a consequence
container networks cannot be removed.

Failure is reported by vRouter's overlying drivers. They did not
receive any PnP events from underlying drivers. This PR adds required
event forwarding to vRouter Extension code.

Change-Id: Id7a5813440a894d49c49a8638fe99b2e466516ba
Closes-Bug: #1805124

Changed in opencontrail:
status: New → Fix Committed