Activity log for bug #1661105

Date Who What changed Old value New value Message
2017-02-01 22:00:45 rick jones bug added bug
2017-02-01 22:00:45 rick jones attachment added A screenshot of a tcptrace time-seqeunce chart. https://bugs.launchpad.net/bugs/1661105/+attachment/4811946/+files/candidate.png
2017-02-01 22:08:52 rick jones description
Old value:
While troubleshooting some problems various, it became known to the submitter that Octavia seems to want to have its images built with a number of non-default sysctl settings for TCP. In particular:

sysctl-write-value net.ipv4.tcp_timestamps 0
sysctl-write-value net.ipv4.tcp_ecn 0
sysctl-write-value net.ipv4.tcp_sack 0
sysctl-write-value net.ipv4.tcp_dsack 0

Disabling tcp_timestamps and tcp_sack severely cripple's TCP's ability to recover from non-trivial losses in a given window. Coupled with Octavia using a 50 second client and server timeout, such non-trivial losses result in a situation like that shown in candidate.png.

While candidate.png isn't from an actual Octavia/haproxy connection, it is from one where Octavia's desired sysctl settings have "leaked out" onto the client system. The non-trivial packet loss gets recovered only after retransmission timeouts, and without SACK and timestamps, that RTO cannot reset, so it continues to grow. Ultimately it hits the haproxy client/server timeout setting, at which point haproxy determines insufficient forward progress and terminates the connection with extreme prejudice. The client then sees a connection reset by peer error.

Octavia should stop disabling Selective ACKnowledgement and Timestamps.

New value:
While troubleshooting some problems various, it became known to the submitter that Octavia seems to want to have its images built with a number of non-default sysctl settings for TCP. In particular:

sysctl-write-value net.ipv4.tcp_timestamps 0
sysctl-write-value net.ipv4.tcp_ecn 0
sysctl-write-value net.ipv4.tcp_sack 0
sysctl-write-value net.ipv4.tcp_dsack 0

Disabling tcp_timestamps and tcp_sack severely cripples TCP's ability to recover from non-trivial losses in a given window. Coupled with Octavia using a 50 second client and server timeout, such non-trivial losses result in a situation like that shown in candidate.png.

While candidate.png isn't from an actual Octavia/haproxy connection, it is from one where Octavia's desired sysctl settings have "leaked out" onto the client system. The non-trivial packet loss gets recovered only after retransmission timeouts, and without SACK and timestamps, that RTO cannot reset, so it continues to grow. Ultimately it hits the haproxy client/server timeout setting, at which point haproxy determines insufficient forward progress and terminates the connection with extreme prejudice. The client then sees a connection reset by peer error.

Octavia should stop disabling Selective ACKnowledgement and Timestamps.
2017-02-01 22:35:14 Michael Johnson octavia: importance Undecided High
2017-02-01 22:35:19 Michael Johnson octavia: status New Triaged
2017-02-01 22:40:53 Michael Johnson octavia: assignee Michael Johnson (johnsom)
2017-02-01 22:43:41 OpenStack Infra octavia: status Triaged In Progress
2017-02-02 09:17:47 OpenStack Infra octavia: status In Progress Fix Released
2017-02-14 19:16:37 OpenStack Infra tags in-stable-newton
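
The failure mode in the bug description (an RTO that keeps growing until it runs into the 50 second haproxy timeout) can be illustrated with a rough sketch. This is not taken from the bug report or from Octavia. It assumes a starting RTO of 0.2 seconds, a doubling of the RTO after every retransmission timeout, and a 120 second ceiling, and it assumes the RTO is never reset between timeouts, as the description says happens without SACK and timestamps. The function name and parameters are hypothetical.

```python
def rto_waits_until_idle_timeout(initial_rto=0.2, idle_timeout=50.0, max_rto=120.0):
    """Return the successive RTO waits, ending with the first wait that is
    at least as long as the idle timeout.

    All defaults are illustrative assumptions, not values from the bug
    report, except idle_timeout, which mirrors the 50 second haproxy
    client/server timeout mentioned there.
    """
    if initial_rto <= 0:
        raise ValueError("initial_rto must be positive")
    if max_rto < idle_timeout:
        # The RTO would be capped below the idle timeout and never reach it.
        raise ValueError("max_rto must be at least idle_timeout")

    waits = []
    rto = initial_rto
    while True:
        waits.append(rto)
        if rto >= idle_timeout:
            return waits
        # Assumed exponential backoff: each timeout doubles the next wait.
        rto = min(rto * 2, max_rto)


if __name__ == "__main__":
    waits = rto_waits_until_idle_timeout()
    for i, wait in enumerate(waits, start=1):
        print(f"timeout {i}: wait {wait:.1f}s")
```

Under these assumptions the ninth consecutive timeout waits 51.2 seconds, which is longer than the 50 second proxy timeout, so the proxy would give up before that retransmission is sent. A different starting RTO only shifts how many timeouts it takes.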