nfs logging causes unstable system: kernel: xs_tcp_setup_socket: connect returned unhandled error -1

Bug #2060037 reported by Göran Törnqvist
This bug affects 2 people
Affects: linux (Ubuntu)
Status: Confirmed
Importance: Undecided
Assigned to: Kleber Sacilotto de Souza

Bug Description

We hit an issue where, when an NFS server is unreachable and clients keep trying to reconnect to it, the syslog and kern.log files are flooded with errors like:

kernel: xs_tcp_setup_socket: connect returned unhandled error -1

Approximately 30,000 errors per second are logged. This consumes disk space rapidly and fills the disk in a matter of minutes. The only solution appears to be to reboot the NFS servers.
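
To put a number on the flood, something like the following could be used to count matching lines per second in kern.log. This is only a rough sketch: it assumes the traditional syslog layout where the first 15 characters of each line are a timestamp with one-second resolution, and the function name `count_per_second` is made up for illustration.

```python
import sys
from collections import Counter
from typing import Iterable

# Message text as it appears in this report.
NEEDLE = "xs_tcp_setup_socket: connect returned unhandled error"


def count_per_second(lines: Iterable[str]) -> Counter:
    """Count matching log lines per timestamp.

    Assumption: traditional syslog format, where the first 15 characters
    are a timestamp such as "Mar 30 12:34:56" (one-second resolution).
    """
    counts: Counter = Counter()
    for line in lines:
        if NEEDLE in line:
            counts[line[:15]] += 1
    return counts


if __name__ == "__main__" and len(sys.argv) > 1:
    # Pass the log file path as the first argument, e.g. /var/log/kern.log.
    with open(sys.argv[1], errors="replace") as fh:
        for stamp, n in count_per_second(fh).most_common(5):
            print(stamp, n)
```

Run against a kern.log, it prints the five busiest seconds and how many of these errors each one contains.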

The NFS server is running Longhorn 1.5.3, and the client mount options look like:

... type nfs4 (rw,relatime,vers=4.1,rsize=1048576,wsize=1048576,namlen=255,softerr,softreval,noresvport,proto=tcp,timeo=600,retrans=5,sec=sys,clientaddr=10.125.92.24,local_lock=none,addr=10.43.81.150)

We did not experience this issue before Longhorn introduced "soft" NFS mount options.
We have a SUSE/Longhorn support ticket (01053865) for this, and they suggested that we report it here.
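
For anyone wanting to check whether a client has NFS mounts using these options, a small sketch along the following lines might help. It assumes the usual /proc/mounts layout (device, mount point, filesystem type, options, ...) and treats `soft`, `softerr` and `softreval` as the "soft" options of interest; the helper name `soft_nfs_mounts` is hypothetical.

```python
from typing import Iterable, List, Tuple

# Options treated as "soft" here; softerr and softreval appear in the
# mount line quoted in this report, soft is the closely related variant.
SOFT_OPTIONS = {"soft", "softerr", "softreval"}


def soft_nfs_mounts(mount_lines: Iterable[str]) -> List[Tuple[str, List[str]]]:
    """Return (mount point, soft options) for NFS mounts using soft options.

    Assumption: lines follow the /proc/mounts layout:
    device, mount point, fs type, comma-separated options, dump, pass.
    """
    found = []
    for line in mount_lines:
        fields = line.split()
        if len(fields) < 4 or fields[2] not in ("nfs", "nfs4"):
            continue
        soft = sorted(SOFT_OPTIONS.intersection(fields[3].split(",")))
        if soft:
            found.append((fields[1], soft))
    return found
```

On a client this could be called as `soft_nfs_mounts(open("/proc/mounts"))` to list the affected mount points.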

References/Possibly related to:
https://git.launchpad.net/~ubuntu-kernel/ubuntu/+source/linux/+git/jammy/commit/?h=Ubuntu-5.15.0-91.101&id=98f930fb6b46b128b72f5635925ec97f2f875d72

***************************************************

# cat /proc/version_signature
Ubuntu 5.15.0-101.111-generic 5.15.143

# lsb_release -rd
Description: Ubuntu 22.04.4 LTS
Release: 22.04

Göran Törnqvist (gorantornqvist-conoa) wrote :
Changed in linux (Ubuntu):
assignee: nobody → Kleber Sacilotto de Souza (kleber-souza)
Launchpad Janitor (janitor) wrote :

Status changed to 'Confirmed' because the bug affects multiple users.

Changed in linux (Ubuntu):
status: New → Confirmed
Jorin Vermeulen (xorinzor) wrote :

This error is happening to me too; however, I'm running Raspberry Pi OS (Debian 12 Bookworm) rather than Ubuntu, also with Longhorn.
Given that Ubuntu is based on Debian, it might be an issue within Debian.

Göran Törnqvist (gorantornqvist-conoa) wrote :

The Longhorn bug is now fixed (https://github.com/longhorn/longhorn/issues/8345) and the fix was released in 1.5.5.
We still encounter this logging issue, but only for a few minutes when the longhorn-nfs pods restart, so the impact is much smaller.
