19.04 beta openssh-client broken pipe

Bug #1822370 reported by FranksMCB on 2019-03-29
This bug affects 1 person
Affects            Status        Importance  Assigned to   Milestone
openssh (Debian)   Fix Released  Unknown
openssh (Ubuntu)                 Critical    Colin Watson
Disco                            Critical    Colin Watson

Bug Description

New versions of openssh (7.8 and later, as in Ubuntu 19.04) are reported to trigger a connection issue:

   packet_write_wait: Connection to x.x.x.x port 22: Broken pipe

In most cases this seems to affect VMware based environments, as there is a bug in how their implementation handles the new QoS markings on the traffic.

Until this is resolved by VMware, the workarounds for now are:

Configure your client to use the old defaults permanently in /etc/ssh/ssh_config:
# You might want to limit this to your VMware based systems instead of "Host *"
Host *
    IPQoS lowdelay throughput

Or per command via:
 $ ssh -o IPQoS="lowdelay throughput" user@host

IPQoS takes two values: the first applies to interactive sessions, the second to non-interactive ones.
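
To limit the workaround to VMware based systems rather than "Host *", something like the following could be used. This is only a sketch: the function name add_ipqos_override and the host pattern in the usage comment are made up for illustration, and it assumes a per-user config file such as ~/.ssh/config.

```shell
# Append an IPQoS override for one host pattern to an ssh client config file.
# Usage: add_ipqos_override CONFIG_FILE HOST_PATTERN
# (hypothetical helper, not part of openssh)
add_ipqos_override() {
    config_file=$1
    host_pattern=$2
    # Skip if this exact Host line already exists, so reruns do not duplicate it.
    if [ -f "$config_file" ] && grep -qxF "Host $host_pattern" "$config_file"; then
        return 0
    fi
    {
        printf 'Host %s\n' "$host_pattern"
        printf '    IPQoS lowdelay throughput\n'
    } >> "$config_file"
}

# Example with a hypothetical pattern for VMware guests:
#   add_ipqos_override "$HOME/.ssh/config" "*.vmlab.example.com"
```

Note that ssh uses the first value it finds for each option, so a block appended at the end of the file has no effect if an earlier matching block already sets IPQoS.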

---- original report ----

Upgraded to Xubuntu 19.04 beta from 18.10.

Affected package: openssh-client

When trying to ssh into another system, I get the following error:

packet_write_wait: Connection to x.x.x.x port 22: Broken pipe

The problem occurs consistently when trying to connect to various systems.

I can confirm that I was able to ssh prior to the upgrade, and that I can ssh into these systems from other systems.

I can also use PuTTY on this system to ssh into these boxes.

ProblemType: Bug
DistroRelease: Ubuntu 19.04
Package: openssh-client 1:7.9p1-9
ProcVersionSignature: Ubuntu 5.0.0-8.9-generic 5.0.1
Uname: Linux 5.0.0-8-generic x86_64
ApportVersion: 2.20.10-0ubuntu23
Architecture: amd64
CurrentDesktop: XFCE
Date: Fri Mar 29 13:36:38 2019
InstallationDate: Installed on 2018-11-14 (135 days ago)
InstallationMedia: Xubuntu 18.10 "Cosmic Cuttlefish" - Release amd64 (20181017.2)
ProcEnviron:
 LANGUAGE=en_US
 PATH=(custom, no user)
 XDG_RUNTIME_DIR=<set>
 LANG=en_US.UTF-8
 SHELL=/bin/bash
RelatedPackageVersions:
 ssh-askpass N/A
 libpam-ssh N/A
 keychain N/A
 ssh-askpass-gnome N/A
SSHClientVersion: OpenSSH_7.9p1 Ubuntu-9, OpenSSL 1.1.1b 26 Feb 2019
SourcePackage: openssh
UpgradeStatus: Upgraded to disco on 2019-03-29 (0 days ago)


Seth Arnold (seth-arnold) wrote :

Hello,

Are there any messages in dmesg that look related? Can you ping those hosts? Do you get ssh banners if you run:

echo "" | nc x.x.x.x 22

?

Thanks

Running the echo command against those hosts gives me:

SSH-2.0-OpenSSH_7.6p1 Ubuntu-4ubuntu0.3
Protocol mismatch.

I can ping those hosts

If I use the linux version of putty on the system I can ssh into those hosts

Not seeing anything related in dmesg

On 3/29/19 2:33 PM, Seth Arnold wrote:
> Hello,
>
> Are there any messages in dmesg that look related? Can you ping those
> hosts? Do you get ssh banners if you run:
>
> echo "" | nc x.x.x.x 22
>
> ?
>
> Thanks
>


Maybe the keepalive defaults got changed?
All past references to the issue refer to some sort of keepalive to avoid the issue.
For example [1]

Be aware that some suggestions in [1] configure the server, while your issue is on the client side (at least that is where the upgrade happened).

You could also run your failing ssh connection with debug enabled; sometimes a message helps to identify the issue:
$ ssh -vvv x.x.x.x

You could check if the defaults changed by comparing your old and new setup with -G like:
$ ssh -G x.x.x.x
That will report the configs used.
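
The comparison of the two -G dumps could be scripted roughly like this. It is a sketch only: diff_ssh_defaults is a made-up helper name, and ssh.old / ssh.new are assumed to be dumps saved on the old and the new release.

```shell
# Diff two saved "ssh -G" dumps, ignoring line order.
# Usage: diff_ssh_defaults OLD_DUMP NEW_DUMP
# (hypothetical helper; exits 1 when the dumps differ, like diff)
diff_ssh_defaults() {
    old_sorted=$(mktemp)
    new_sorted=$(mktemp)
    sort "$1" > "$old_sorted"
    sort "$2" > "$new_sorted"
    # Capture diff's exit status without tripping "set -e" in callers.
    status=0
    diff "$old_sorted" "$new_sorted" || status=$?
    rm -f "$old_sorted" "$new_sorted"
    return "$status"
}

# On each release (x.x.x.x is the placeholder used above):
#   ssh -G x.x.x.x > ssh.old     # on 18.10
#   ssh -G x.x.x.x > ssh.new     # on 19.04
#   diff_ssh_defaults ssh.old ssh.new
```

Sorting first keeps options that merely moved position out of the diff, so only changed values show up.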

I compared 18.10 and 19.04 and found these differences:
$ diff ssh.old ssh.new
3a4
> addkeystoagent false
36d36
< useprivilegedport no
47,49c47,50
< hostkeyalgorithms <email address hidden>,<email address hidden>,<email address hidden>,<email address hidden>,<email address hidden>,ecdsa-sha2-nistp256,ecdsa-sha2-nistp384,ecdsa-sha2-nistp521,ssh-ed25519,rsa-sha2-512,rsa-sha2-256,ssh-rsa
< hostbasedkeytypes <email address hidden>,<email address hidden>,<email address hidden>,<email address hidden>,<email address hidden>,ecdsa-sha2-nistp256,ecdsa-sha2-nistp384,ecdsa-sha2-nistp521,ssh-ed25519,rsa-sha2-512,rsa-sha2-256,ssh-rsa
< kexalgorithms curve25519-sha256,<email address hidden>,ecdh-sha2-nistp256,ecdh-sha2-nistp384,ecdh-sha2-nistp521,diffie-hellman-group-exchange-sha256,diffie-hellman-group16-sha512,diffie-hellman-group18-sha512,diffie-hellman-group-exchange-sha1,diffie-hellman-group14-sha256,diffie-hellman-group14-sha1
---
> hostkeyalgorithms <email address hidden>,<email address hidden>,<email address hidden>,<email address hidden>,<email address hidden>,<email address hidden>,<email address hidden>,ecdsa-sha2-nistp256,ecdsa-sha2-nistp384,ecdsa-sha2-nistp521,ssh-ed25519,rsa-sha2-512,rsa-sha2-256,ssh-rsa
> hostbasedkeytypes <email address hidden>,<email address hidden>,<email address hidden>,<email address hidden>,<email address hidden>,<email address hidden>,<email address hidden>,ecdsa-sha2-nistp256,ecdsa-sha2-nistp384,ecdsa-sha2-nistp521,ssh-ed25519,rsa-sha2-512,rsa-sha2-256,ssh-rsa
> kexalgorithms curve25519-sha256,<email address hidden>,ecdh-sha2-nistp256,ecdh-sha2-nistp384,ecdh-sha2-nistp521,diffie-hellman-group-exchange-sha256,diffie-hellman-group16-sha512,diffie-hellman-group18-sha512,diffie-hellman-group14-sha256,diffie-hellman-group14-sha1
> casignaturealgorithms ecdsa-sha2-nistp256,ecdsa-sha2-nistp384,ecdsa-sha2-nistp521,ssh-ed25519,rsa-sha2-512,rsa-sha2-256,ssh-rsa
52c53
< pubkeyacceptedkeytypes <email address hidden>,<email address hidden>,<email address hidden>,<email address hidden>,<email address hidden>,ecdsa-sha2-nistp256,ecdsa-sha2-nistp384,ecdsa-sha2-nistp521,ssh-ed25519,rsa-sha2-512,rsa-sha2-256,ssh-rsa
---
> pubkeyacceptedkeytypes <email address hidden>,ecdsa-...


Changed in openssh (Ubuntu):
status: New → Incomplete

Interesting and maybe related

https://communities.vmware.com/thread/590825
https://github.com/vmware/open-vm-tools/issues/287
https://bugzilla.redhat.com/show_bug.cgi?id=1624437

The TL;DR of those is that VMware has issues with the AF21 QoS flag.
Does your client run in VMWare by chance?

Even if not it might be another part of the network setup between your client and server that reacts to the same change.

Please check whether one of the older default values resolves the issue:
  $ ssh -o IPQoS=throughput user@host
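
For background on the AF21 flag: it is a DSCP code point, and DSCP occupies the upper six bits of the old TOS byte in the IP header. The sketch below (dscp_to_tos is a made-up helper name) shows the byte values that end up on the wire for the new defaults, which as far as I know are af21 for interactive and cs1 for non-interactive sessions. For comparison, the old lowdelay and throughput values are the TOS bytes 0x10 and 0x08.

```shell
# Convert a DSCP code point (0-63) to the 8-bit TOS/DS byte.
# DSCP sits in the upper six bits, so the byte is the code point shifted left by 2.
# (hypothetical helper for illustration)
dscp_to_tos() {
    printf '0x%02x\n' $(( $1 << 2 ))
}

dscp_to_tos 18   # AF21 (DSCP 18) -> 0x48
dscp_to_tos 8    # CS1  (DSCP 8)  -> 0x20
```

These are the values to look for in the tos field of a packet capture when checking which setting a connection actually uses.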

I checked for the reason of this default change and found that the man page documents new defaults between cosmic and disco.
The upstream change is:
  https://anongit.mindrot.org/openssh.git/commit/?id=5ee8448ad7c306f05a9f56769f95336a8269f379

There has been no follow-on change to that yet, as far as I can tell from git.

I'd appreciate it if you could test with the QoS options (which can also be set in /etc/ssh/ssh_config if you want to make them permanent). Maybe the world isn't ready for the new defaults yet and we might have to hold them back a release?

@cjwatson - I subscribed you as you usually look after ssh(d) - have you heard about that issue before or have any special recommendation already?

FranksMCB (franksmcb) wrote :

Thanks for the responses Christian.

This client does indeed run in VMware, VMware Player 15

Running it using "ssh -o IPQoS=throughput user@host", it functions
correctly on multiple hosts.

On 4/1/19 4:30 AM, Christian Ehrhardt  wrote:
> Interesting and maybe related
>
> https://communities.vmware.com/thread/590825
> https://github.com/vmware/open-vm-tools/issues/287
> https://bugzilla.redhat.com/show_bug.cgi?id=1624437
>
> The TL;DR of those is that VMWare would have issues with the AF21 QoS flag.
> Does your client run in VMWare by chance?
>
>
> Even if not it might be another part of the network setup between your client and server that reacts to the same change.
>
> Please give the:
> $ ssh -o IPQoS=throughput user@host
> a check (older default) if that resolves the issue.
>
>
> I checked for the reason of this default change and found:
> I found that the man page has this update about the defaults between cosmic/disco.
> The upstream change is:
> https://anongit.mindrot.org/openssh.git/commit/?id=5ee8448ad7c306f05a9f56769f95336a8269f379
>
> There as no follow on change to that yet as far as I can tell from git.
>
> I'd appreciate if you could do the testing with the QoS options (which
> can also be set in /etc/ssh/ssh_config if you want to make them
> permanent. Maybe the world isn't ready for the new defaults yet and we
> might have to hold them back a release?
>
> ** Bug watch added: github.com/vmware/open-vm-tools/issues #287
> https://github.com/vmware/open-vm-tools/issues/287
>
> ** Bug watch added: Red Hat Bugzilla #1624437
> https://bugzilla.redhat.com/show_bug.cgi?id=1624437
>

@Frank - that is good for you and thanks for confirming my assumptions.
But it is unfortunate for openssh :-/

Until resolved, as a landing page (I'll put this in the description as well), the summary of the workarounds for now is:
You can configure this permanently in /etc/ssh/ssh_config:
Host *
    IPQoS lowdelay throughput
Or per command via:
 $ ssh -o IPQoS=throughput user@host
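
To confirm which IPQoS values a given invocation would actually use, the -G output can be filtered. This is a sketch that assumes your ssh prints an "ipqos" line in its -G dump; effective_ipqos is a made-up helper name.

```shell
# Print the IPQoS values from "ssh -G" output supplied on stdin.
# (hypothetical helper; assumes the dump contains a line starting with "ipqos")
effective_ipqos() {
    awk 'tolower($1) == "ipqos" { $1 = ""; sub(/^ /, ""); print }'
}

# Usage, with user@host as the placeholder from above:
#   ssh -G user@host | effective_ipqos
#   ssh -G -o IPQoS=throughput user@host | effective_ipqos
```

Running it once without and once with the override shows whether the config file or the -o option is taking effect.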

---

Per the other bugs and discussions this seems to affect any VMware backed hosts, which I guess most commonly would be:
- VMware based computing centers
- vagrant users through the vmware backend (see [6])
- home users of VMware Player

We will have to make our decision for Ubuntu the same way Fedora did [1] for themselves.
They decided to keep the change and be broken on VMware for now.

I'd rather not break users right away with the soon-to-be-released 19.04.
What are the opinions on reverting [2] for 19.04 for now, to give VMware a chance to fix this in their products?

We could then get in contact with VMware (I can do that) and explain that this is only a stopgap action.
Without such a discussion I'm not sure about this, as [3][4] are community discussions rather than bug reports, and [5] is misfiled because this isn't an open-vm-tools bug.
But I'd also make clear that we intend to drop that revert in 19.10 and later.

I'm polling a few people for opinions on this change (it is easy to do, but hard to decide).

[1]: https://bugzilla.redhat.com/show_bug.cgi?id=1624437#c8
[2]: https://anongit.mindrot.org/openssh.git/commit/?id=5ee8448ad7c306f05a9f56769f95336a8269f379
[3]: https://communities.vmware.com/message/2803219
[4]: https://communities.vmware.com/thread/590825
[5]: https://github.com/vmware/open-vm-tools/issues/287
[6]: https://github.com/hashicorp/vagrant/issues/10730

Changed in openssh (Ubuntu):
status: Incomplete → Triaged
importance: Undecided → Critical
tags: added: server-next
description: updated
Colin Watson (cjwatson) wrote :

I don't want to revert this as that just takes the pressure off VMware; ultimately this is their bug that they need to fix.

Colin was kind enough to also find this reference [1] in Debian.
It asks for the same revert, though not because of VMware: it seems the new defaults also conflict with "iptables -m tos".

I'll bring it up with VMware. If the change also conflicts with some iptables options, that would be one more reason to revert it for 19.04, but none of us has looked deeper into it yet.

[1]: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=923879

I filed this bug at Debian as well => https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=926229

Changed in openssh (Debian):
status: Unknown → New

FYI - Up for discussion in the scope of Debian-Buster at [1].

[1]: https://lists.debian.org/debian-devel/2019/04/msg00010.html

Just to be prepared I added an MP for this at [1].
But I agree with cjwatson that we should (if possible) either revert this in both Buster and Disco or in neither, so as not to split up the behavior of different releases even more.

Nevertheless, one might want to peek at the change and/or play with the PPA [2].
Therefore it is worth having that ready here on the bug.

[1]: https://code.launchpad.net/~paelzer/ubuntu/+source/openssh/+git/openssh/+merge/365396
[2]: https://launchpad.net/~paelzer/+archive/ubuntu/bug-1822370-openssh-qos-defaults

John Savanyo (jsavanyo) wrote :

I created internal VMware bug 2319367 to track looking into this.

John Savanyo (jsavanyo) wrote :

Can someone clarify which VMware products this is known to affect (vSphere/ESXi or Workstation/Fusion) and which versions? Also, what virtual NIC is used in the VM?

FranksMCB (franksmcb) wrote :

This is Workstation and Player 15. I do not have the ability to test on ESXi.

I am using VMnet8

On 4/2/19 1:30 PM, John Savanyo wrote:
> Can someone clarify, what VMware products is this know to affect
> (vSphere/ESXi or Workstation/Fusion) and what versions? Also what
> virtual NIC is used in the VM?
>

John Savanyo (jsavanyo) wrote :

The internal VMware bug 2319367 I created was closed as a duplicate of bug 2275007

No feedback on the Debian ML [1] yet, how do we go on with this knowing that our Freeze is in 8 days?

@CJwatson - are you going to drive that in Debian according to Feedback (or the lack thereof) to your mail and then sync it to Disco in time?
For now I'll assume so - no offense please, I just need to know in which field the ball-of-action is right now.

[1]: https://lists.debian.org/debian-devel/2019/04/msg00010.html

Changed in openssh (Ubuntu):
assignee: nobody → Colin Watson (cjwatson)
John Savanyo (jsavanyo) wrote :

VMware bug 2319367 was closed as a duplicate of bug 2275007, which was closed as a duplicate of bug 2201049. The good news is that bug 2201049 is fixed in a future unreleased version of Workstation and Fusion. I'm not aware of a plan to backport the fix to a maintenance patch yet. I will ask.

Colin Watson (cjwatson) wrote :

@paelzer: A day is a bit soon to be stressing about lack of feedback, I think. We have time.

Colin Watson (cjwatson) wrote :

@jsavanyo: Thanks for following up! The iptables-related part of the discussion may yet yield an openssh packaging change; we'll see.

Changed in openssh (Debian):
status: New → Fix Released
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package openssh - 1:7.9p1-10

---------------
openssh (1:7.9p1-10) unstable; urgency=medium

  * Temporarily revert IPQoS defaults to pre-7.8 values until issues with
    "iptables -m tos" and VMware have been fixed (closes: #923879, #926229;
    LP: #1822370).

 -- Colin Watson <email address hidden> Mon, 08 Apr 2019 11:13:04 +0100

Changed in openssh (Ubuntu Disco):
status: Triaged → Fix Released