Comment 39 for bug 1817936

OpenStack Infra (hudson-openstack) wrote : Fix merged to integ (r/stx.2.0)

Reviewed: https://review.opendev.org/683818
Committed: https://git.openstack.org/cgit/starlingx/integ/commit/?id=457b4747e92255fa99340fa7cb1d7204873fb3c5
Submitter: Zuul
Branch: r/stx.2.0

commit 457b4747e92255fa99340fa7cb1d7204873fb3c5
Author: Sun Austin <email address hidden>
Date: Sun Sep 22 19:33:04 2019 +0800

    Fix Periodic message loss between VIM and Openstack REST APIs

    Set net.ipv4.tcp_tw_reuse=0 so that DNAT conntrack no longer marks
    packets as invalid, and remove the customized ephemeral port range.

    The probe connection exchanges the following packets before it
    enters the TIME_WAIT state.
    Probe connection
    controller            pod            TCP FLAG   SEQ         ACK
    controller:50538 ---> endpoint:9292  SYN        2707980036  0
    controller:50538 <--- endpoint:9292  SYN ACK    1599414185  2707980037
    controller:50538 ---> endpoint:9292  ACK        2707980037  1599414186
    controller:50538 ---> endpoint:9292  FIN ACK    2707980037  1599414186
    controller:50538 <--- endpoint:9292  ACK        1599414186  2707980038
    controller:50538 <--- endpoint:9292  FIN ACK    1599414186  2707980038
    controller:50538 ---> endpoint:9292  ACK        2707980038  1599414187

    The curl command's connection, which reuses the same source port
    50538, looks like this:
    controller           pod            TCP FLAG   SEQ         ACK
    controller:50538 --> service:9292   SYN        2917708674  0
    controller:50538 --> endpoint:9292  SYN        2917708674  0
    controller:24479 <-- endpoint:9292  SYN ACK    2742336307  2917708675
    controller:50538 <-- endpoint:9292  SYN ACK    2742336307  2917708675
    controller:50538 --> service:9292   ACK        2707980038  1599414187
    controller:50538 --> service:9292   ACK        2707980038  1599414187
    controller:50538 --> service:9292   ACK(DROP)  2707980038  1599414187

    The SEQ and ACK of the last ACK (controller:50538 --> service:9292)
    are the same as those of the probe connection's final ACK before
    TIME_WAIT.
    According to
    https://github.com/torvalds/linux/blob/v3.10/net/ipv4/tcp_ipv4.c#L2002 ,
    the lookup only checks (destination IP, destination port, source IP,
    source port). Because the SEQ/ACK values are not correct for the new
    connection, the packet is marked invalid and then dropped.
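
The mismatch can be checked arithmetically from the two traces. The following is a minimal sketch in Python, assuming only the standard TCP rule that a SYN consumes one sequence number. The function names are hypothetical and the numbers are copied from the traces in this message; it is an illustration, not code from the fix.

```python
# Values copied from the packet traces in this commit message.
PROBE_LAST_ACK = (2707980038, 1599414187)     # (SEQ, ACK) of the probe's final ACK
CURL_SYN_SEQ = 2917708674                     # SEQ of the curl connection's SYN
CURL_SYNACK_SEQ = 2742336307                  # SEQ of the endpoint's SYN ACK
CURL_OBSERVED_ACK = (2707980038, 1599414187)  # (SEQ, ACK) of the dropped ACK

MOD = 2 ** 32  # TCP sequence numbers wrap at 32 bits


def expected_handshake_ack(syn_seq, synack_seq):
    """Return the (SEQ, ACK) a correct third handshake packet would carry."""
    return ((syn_seq + 1) % MOD, (synack_seq + 1) % MOD)


def diagnose(observed, expected, stale):
    """Classify an observed ACK against the new and the old connection."""
    if observed == expected:
        return "valid"
    if observed == stale:
        return "stale: matches the old TIME_WAIT connection"
    return "invalid"


expected = expected_handshake_ack(CURL_SYN_SEQ, CURL_SYNACK_SEQ)
print("expected (SEQ, ACK):", expected)
print("observed (SEQ, ACK):", CURL_OBSERVED_ACK)
print(diagnose(CURL_OBSERVED_ACK, expected, PROBE_LAST_ACK))
```

A correct third handshake packet would carry SEQ 2917708675 and ACK 2742336308, whereas the observed packet repeats the probe connection's final values, which is consistent with the drop shown in the trace.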

    If tcp_tw_reuse is disabled, the source port used for the nova-api
    request will never be the same as the one the pod probe is using,
    so the issue should go away.
    The ephemeral port range is set back to the CentOS default to avoid
    ephemeral port exhaustion.
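
One way to confirm the two settings on a running host is to read them back from procfs. This is a hedged sketch in Python: the helper names `read_sysctl` and `check_settings` are hypothetical, and it assumes the usual Linux layout in which `net.ipv4.tcp_tw_reuse` maps to `/proc/sys/net/ipv4/tcp_tw_reuse`. It reports the port range without asserting what the distribution default is.

```python
from pathlib import Path


def read_sysctl(name, root="/proc/sys"):
    """Read a sysctl value by dotted name, e.g. net.ipv4.tcp_tw_reuse."""
    return Path(root, *name.split(".")).read_text().strip()


def check_settings(root="/proc/sys"):
    """Report whether tcp_tw_reuse is disabled and the current port range."""
    tw_reuse = int(read_sysctl("net.ipv4.tcp_tw_reuse", root))
    # ip_local_port_range holds two whitespace-separated integers: low high.
    low, high = map(int, read_sysctl("net.ipv4.ip_local_port_range", root).split())
    return {"tw_reuse_disabled": tw_reuse == 0, "port_range": (low, high)}


if __name__ == "__main__" and Path("/proc/sys/net/ipv4/tcp_tw_reuse").exists():
    print(check_settings())
```

The `root` parameter exists only so the check can be pointed at a copy of the tree, for example in a test.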

    Closes-Bug: 1817936

    Change-Id: I9a39e3b439633ffb82ff05814935eba9ae892eda
    Signed-off-by: Sun Austin <email address hidden>