healthmonitor did not take effect

Bug #1607309 reported by Teri Lu

Affects: octavia
Status: New
Importance: High
Assigned to: Unassigned

Bug Description

After creating a health monitor and adding it to a pool, shutting down one member does not set that member's state to inactive once the maximum number of failed retries has been reached for the instance.

Test steps:
1. Create a health monitor and add it to the pool.
neutron lbaas-healthmonitor-create --delay 5 --max-retries 4 --timeout 3 --type PING --pool 8e94a810-1e5c-48ec-b943-6ea134fdaa1d
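
Judging from the template output captured in the later comments, these parameters would be expected to map onto haproxy check directives roughly as follows (a sketch; the member line is illustrative, and as the comments show, nothing was actually rendered for the PING type):

    timeout check 3                  # from --timeout 3
    server <member-id> 20.0.0.5:80 weight 1 check inter 5s fall 4 rise 4   # inter from --delay, fall/rise from --max-retries

With delay 5 and max-retries 4, a dead member should therefore be marked down after roughly 20 seconds of failed checks.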

2. Show lb status
stack@LB-dev-lbaas01:~$ neutron lbaas-loadbalancer-status 52e25644-ace3-48b4-bb3f-a7cbdd745b87
{
    "loadbalancer": {
        "name": "",
        "provisioning_status": "ACTIVE",
        "listeners": [
            {
                "name": "",
                "provisioning_status": "ACTIVE",
                "pools": [
                    {
                        "name": "pool2",
                        "provisioning_status": "ACTIVE",
                        "healthmonitor": {
                            "provisioning_status": "ACTIVE",
                            "type": "PING",
                            "id": "90c11d9c-3e79-4fad-9e95-6a13da9ed2f1",
                            "name": ""
                        },
                        "members": [
                            {
                                "name": "s2",
                                "provisioning_status": "ACTIVE",
                                "address": "20.0.0.10",
                                "protocol_port": 80,
                                "id": "13c1660a-5c2e-4fb2-a301-1d8187c48c8a",
                                "operating_status": "ONLINE"
                            },
                            {
                                "name": "s1",
                                "provisioning_status": "ACTIVE",
                                "address": "20.0.0.5",
                                "protocol_port": 80,
                                "id": "f78b34f4-41da-4e33-a34b-5ef46bd19a80",
                                "operating_status": "ONLINE"
                            },
                            {
                                "name": "s1",
                                "provisioning_status": "ACTIVE",
                                "address": "20.0.0.9",
                                "protocol_port": 80,
                                "id": "0b8b2db1-4a1b-4320-bd3f-e79a065fce2c",
                                "operating_status": "ONLINE"
                            }
                        ],
                        "id": "8e94a810-1e5c-48ec-b943-6ea134fdaa1d",
                        "operating_status": "ONLINE"
                    }
                ],
                "l7policies": [],
                "id": "12af923e-8248-475c-ab33-e82352d1f19b",
                "operating_status": "ONLINE"
            }
        ],
        "pools": [
            {
                "name": "pool2",
                "provisioning_status": "ACTIVE",
                "healthmonitor": {
                    "provisioning_status": "ACTIVE",
                    "type": "PING",
                    "id": "90c11d9c-3e79-4fad-9e95-6a13da9ed2f1",
                    "name": ""
                },
                "members": [
                    {
                        "name": "s2",
                        "provisioning_status": "ACTIVE",
                        "address": "20.0.0.10",
                        "protocol_port": 80,
                        "id": "13c1660a-5c2e-4fb2-a301-1d8187c48c8a",
                        "operating_status": "ONLINE"
                    },
                    {
                        "name": "s1",
                        "provisioning_status": "ACTIVE",
                        "address": "20.0.0.5",
                        "protocol_port": 80,
                        "id": "f78b34f4-41da-4e33-a34b-5ef46bd19a80",
                        "operating_status": "ONLINE"
                    },
                    {
                        "name": "s1",
                        "provisioning_status": "ACTIVE",
                        "address": "20.0.0.9",
                        "protocol_port": 80,
                        "id": "0b8b2db1-4a1b-4320-bd3f-e79a065fce2c",
                        "operating_status": "ONLINE"
                    }
                ],
                "id": "8e94a810-1e5c-48ec-b943-6ea134fdaa1d",
                "operating_status": "ONLINE"
            }
        ],
        "id": "52e25644-ace3-48b4-bb3f-a7cbdd745b87",
        "operating_status": "ONLINE"
    }
}

3. Shut down member 20.0.0.5.

4. Curl the VIP; 20.0.0.5 no longer serves any requests.
for i in {1..10}; do echo --$i----;sudo ip netns exec qdhcp-1ceb689b-7f81-4820-a26b-02dc276b8c1c curl 20.0.0.13;done
--1----
Welcome to 20.0.0.10
--2----
Welcome to 20.0.0.9
--3----
Welcome to 20.0.0.10
--4----
Welcome to 20.0.0.9
--5----
Welcome to 20.0.0.10
--6----
Welcome to 20.0.0.9
--7----
Welcome to 20.0.0.10
--8----
Welcome to 20.0.0.9
--9----
Welcome to 20.0.0.10
--10----
Welcome to 20.0.0.9

5. Ping the member IP; it is not reachable.
sudo ip netns exec qdhcp-1ceb689b-7f81-4820-a26b-02dc276b8c1c ping 20.0.0.5
PING 20.0.0.5 (20.0.0.5) 56(84) bytes of data.
From 20.0.0.2 icmp_seq=1 Destination Host Unreachable
From 20.0.0.2 icmp_seq=2 Destination Host Unreachable
From 20.0.0.2 icmp_seq=3 Destination Host Unreachable

6. Show the load balancer status again; the member's operating_status is still ONLINE instead of becoming inactive.

stack@LB-dev-lbaas01:~$ neutron lbaas-loadbalancer-status 52e25644-ace3-48b4-bb3f-a7cbdd745b87
{
    "loadbalancer": {
        "name": "",
        "provisioning_status": "ACTIVE",
        "listeners": [
            {
                "name": "",
                "provisioning_status": "ACTIVE",
                "pools": [
                    {
                        "name": "pool2",
                        "provisioning_status": "ACTIVE",
                        "healthmonitor": {
                            "provisioning_status": "ACTIVE",
                            "type": "PING",
                            "id": "90c11d9c-3e79-4fad-9e95-6a13da9ed2f1",
                            "name": ""
                        },
                        "members": [
                            {
                                "name": "s2",
                                "provisioning_status": "ACTIVE",
                                "address": "20.0.0.10",
                                "protocol_port": 80,
                                "id": "13c1660a-5c2e-4fb2-a301-1d8187c48c8a",
                                "operating_status": "ONLINE"
                            },
                            {
                                "name": "s1",
                                "provisioning_status": "ACTIVE",
                                "address": "20.0.0.5",
                                "protocol_port": 80,
                                "id": "f78b34f4-41da-4e33-a34b-5ef46bd19a80",
                                "operating_status": "ONLINE"
                            },
                            {
                                "name": "s1",
                                "provisioning_status": "ACTIVE",
                                "address": "20.0.0.9",
                                "protocol_port": 80,
                                "id": "0b8b2db1-4a1b-4320-bd3f-e79a065fce2c",
                                "operating_status": "ONLINE"
                            }
                        ],
                        "id": "8e94a810-1e5c-48ec-b943-6ea134fdaa1d",
                        "operating_status": "ONLINE"
                    }
                ],
                "l7policies": [],
                "id": "12af923e-8248-475c-ab33-e82352d1f19b",
                "operating_status": "ONLINE"
            }
        ],
        "pools": [
            {
                "name": "pool2",
                "provisioning_status": "ACTIVE",
                "healthmonitor": {
                    "provisioning_status": "ACTIVE",
                    "type": "PING",
                    "id": "90c11d9c-3e79-4fad-9e95-6a13da9ed2f1",
                    "name": ""
                },
                "members": [
                    {
                        "name": "s2",
                        "provisioning_status": "ACTIVE",
                        "address": "20.0.0.10",
                        "protocol_port": 80,
                        "id": "13c1660a-5c2e-4fb2-a301-1d8187c48c8a",
                        "operating_status": "ONLINE"
                    },
                    {
                        "name": "s1",
                        "provisioning_status": "ACTIVE",
                        "address": "20.0.0.5",
                        "protocol_port": 80,
                        "id": "f78b34f4-41da-4e33-a34b-5ef46bd19a80",
                        "operating_status": "ONLINE"
                    },
                    {
                        "name": "s1",
                        "provisioning_status": "ACTIVE",
                        "address": "20.0.0.9",
                        "protocol_port": 80,
                        "id": "0b8b2db1-4a1b-4320-bd3f-e79a065fce2c",
                        "operating_status": "ONLINE"
                    }
                ],
                "id": "8e94a810-1e5c-48ec-b943-6ea134fdaa1d",
                "operating_status": "ONLINE"
            }
        ],
        "id": "52e25644-ace3-48b4-bb3f-a7cbdd745b87",
        "operating_status": "ONLINE"
    }
}

Revision history for this message
Teri Lu (lujsh-e) wrote :

Configured a health monitor, but it does not show up in the configuration file.

1) Show LBaaS status; the health monitor exists.
neutron lbaas-loadbalancer-status c6adc23e-e8ef-408b-8859-30c5f74c608d
{
    "loadbalancer": {
        "name": "teri_created",
        "provisioning_status": "ACTIVE",
        "listeners": [
            {
                "name": "",
                "provisioning_status": "ACTIVE",
                "pools": [
                    {
                        "name": "",
                        "provisioning_status": "ACTIVE",
                        "healthmonitor": {
                            "provisioning_status": "ACTIVE",
                            "type": "PING",
                            "id": "e1436e3d-3d38-4061-b809-1ed719e7e89e",
                            "name": ""
                        },
                        "members": [
                            {
                                "name": "",
                                "provisioning_status": "ACTIVE",
                                "address": "20.0.0.10",
                                "protocol_port": 443,
                                "id": "60729729-c578-44b0-ba3a-28023eae480d",
                                "operating_status": "ONLINE"
                            },

2) Inspect the configuration file in the amphora; there is no health monitor.
# Configuration for teri_created
global
    daemon
    user nobody
    group nogroup
    log /dev/log local0
    log /dev/log local1 notice
    stats socket /var/lib/octavia/7c4408c3-0b13-460d-a28d-b96cb4f5c607.sock mode 0666 level user

defaults
    log global
    retries 3
    option redispatch
    timeout connect 5000
    timeout client 50000
    timeout server 50000

peers 7c4408c30b13460da28db96cb4f5c607_peers
    peer jOVtMRyFppkTmH6vacf8iJm5sA4 20.0.0.16:1025
    peer AjoHi411fVQTNUnI_8YS5LNu5KU 20.0.0.15:1025

frontend 7c4408c3-0b13-460d-a28d-b96cb4f5c607
    option tcplog
    bind 20.0.0.19:443
    mode tcp
    default_backend 3118a96b-7ef0-4706-b7b1-cfd23b9b270d

backend 3118a96b-7ef0-4706-b7b1-cfd23b9b270d
    mode tcp
    balance roundrobin
    server 60729729-c578-44b0-ba3a-28023eae480d 20.0.0.10:443 weight 1
    server 3e4473af-3517-48f2-b458-7e82c2e2bd9a 20.0.0.12:443 weight 1
    server 8ac13435-4c51-46eb-a2a5-2ed253d2362a 20.0.0.4:443 weight 1
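
For comparison, a correctly rendered health monitor should add explicit check options to this backend, roughly of the following shape (a sketch modeled on the template output in the next comment; the exact directives depend on the monitor type):

    timeout check <timeout>
    option httpchk GET /
    http-check expect rstatus 200
    server <member-id> <address>:<port> weight 1 check inter <delay>s fall <max-retries> rise <max-retries>

None of these lines were emitted here.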

Revision history for this message
Teri Lu (lujsh-e) wrote :

The health monitor does not support PING; please refer to https://bugs.launchpad.net/neutron/+bug/1426151.

I tried an HTTPS health monitor with HAProxy, and the amphora sent TCP health-check requests to all members on port 443. I then shut down one member, but LBaaS did not detect that the member was down.

1) Show the LBaaS configuration. I created a health monitor and added it to the pool: timeout 3, max-retries 3, delay 5.
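
The monitor was presumably created with the same client syntax as in the original report (a sketch; the pool argument is a placeholder):

    neutron lbaas-healthmonitor-create --delay 5 --max-retries 3 --timeout 3 --type HTTPS --pool <pool-id>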

ubuntu@amphora-62c2cdac-6f17-451e-b9a8-9965dcc533ca:~$ sudo vi /var/lib/octavia/d8387360-1880-423a-94d3-28d060b05225/haproxy.cfg
sudo: unable to resolve host amphora-62c2cdac-6f17-451e-b9a8-9965dcc533ca
# Configuration for teri_https
global
    daemon
    user nobody
    group nogroup
    log /dev/log local0
    log /dev/log local1 notice
    stats socket /var/lib/octavia/d8387360-1880-423a-94d3-28d060b05225.sock mode 0666 level user

defaults
    log global
    retries 3
    option redispatch
    timeout connect 5000
    timeout client 50000
    timeout server 50000

peers d83873601880423a94d328d060b05225_peers
    peer uj63XrtAl7p8iQXY3EGM1NzVmCQ 20.0.0.6:1025
    peer 8f0Dqs8CsXr0beItRvt1iN7cVrM 20.0.0.16:1025

frontend d8387360-1880-423a-94d3-28d060b05225
    option tcplog
    bind 20.0.0.15:443
    mode tcp
    default_backend 0bae11e7-fff8-4496-a1ce-43db96b21ca7

backend 0bae11e7-fff8-4496-a1ce-43db96b21ca7
    mode tcp
    balance leastconn
    timeout check 3
    option httpchk GET /
    http-check expect rstatus 200
    option ssl-hello-chk
    server 55286990-66a1-4270-93b2-d6adac9415b0 20.0.0.12:443 weight 30 check inter 5s fall 3 rise 3
    server c7ce4502-1190-4542-b139-a9b6bab6ed4a 20.0.0.10:443 weight 50 check inter 5s fall 3 rise 3
    server ebb2aa6e-fcdd-43c0-ab86-7ba0c077ce16 20.0.0.4:443 weight 100 check inter 5s fall 3 rise 3
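
To see what haproxy itself thinks of each member, the stats socket declared in the global section above can be queried from inside the amphora (a sketch; assumes socat is installed and that "show stat" is permitted at the socket's user level):

    # the per-server status column should read DOWN once fall=3 checks have failed
    echo "show stat" | sudo socat stdio /var/lib/octavia/d8387360-1880-423a-94d3-28d060b05225.sock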

2) Shut down member 20.0.0.4 and wait about 30 minutes. Showing the LBaaS status, this member is still active.

stack@lb-dev-sh-ovn-controller:~$ neutron lbaas-loadbalancer-status teri_https
{
    "loadbalancer": {
        "name": "teri_https",
        "provisioning_status": "ACTIVE",
        "listeners": [
            {
                "name": "",
                "provisioning_status": "ACTIVE",
                "pools": [
                    {
                        "name": "pool1",
                        "provisioning_status": "ACTIVE",
                        "healthmonitor": {
                            "provisioning_status": "ACTIVE",
                            "type": "HTTPS",
                            "id": "5347a24f-9ad2-4923-b804-4734555e5ed7",
                            "name": ""
                        },
                        "members": [
                            {
                                "name": "",
                                "provisioning_status": "ACTIVE",
                                "address": "20.0.0.10",
                                "protocol_port": 443,
                                "id": "c7ce4502-1190-4542-b139-a9b6bab6ed4a",
                                "operating_status": "ONLINE"
                            },
                            {
                                "name": "",
                                "provisioning_status": "ACTIVE...

Revision history for this message
Michael Johnson (johnsom) wrote :

The health monitor is working correctly in octavia (per your captured output). The issue here is that the CLI request made with the neutron client queries the neutron database for the status information, and that database is not being updated correctly.

I am going to mark this as a duplicate, as the issue is actually in neutron-lbaas and is being tracked there. See: https://bugs.launchpad.net/neutron/+bug/1548774
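
One way to confirm the split is to compare the two databases directly (a sketch; the table and column names below are the usual octavia and neutron-lbaas schema names and should be treated as assumptions):

    # octavia's own view of the member, expected to reflect the failed checks
    mysql octavia -e "SELECT id, operating_status FROM member;"
    # neutron-lbaas's copy, which stays ONLINE because the status sync is broken
    mysql neutron -e "SELECT id, operating_status FROM lbaas_members;"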

Changed in octavia:
importance: Undecided → High