AIX 7.3 NFS client frequently returns an EIO error to an application when reading or writing to a file that has been locked with fcntl() on a Ubuntu 20.04 NFSV4 server

Bug #2042363 reported by bugproxy
12
This bug affects 1 person
Affects Status Importance Assigned to Milestone
linux (Ubuntu)
New
Undecided
Ubuntu on IBM Power Systems Bug Triage

Bug Description

---Problem Description---
AIX 7.3 NFS client frequently returns an EIO error to an application when reading or writing to a file that has been locked with fcntl(). NFS server is Ubuntu 20.04.6 LTS, GNU/Linux 5.4.0-139-generic x86_64. The problem does not appear to affect other combinations of NFS client (including AIX 7.2) with this NFS server.

The AIX team have indicated that the cause of the EIO is triggered by the NFS server returning a BAD_SEQID error which leads to the AIX NFS client incorrectly zeroing the stateid, which then leads to the NFS server returning a BAD_STATEID error and the NFS client then returns the EIO error. The AIX team would like to understand why the BAD_SEQID has been returned.

---uname output---
Linux duckseason 5.4.0-156-generic #173-Ubuntu SMP Tue Jul 11 07:25:22 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux

Machine Type = VMware ESXi Server 7.0 4 x Intel(R) Xeon(R) Gold 6348H CPU @ 2.30GHz

---Steps to Reproduce---
 We cannot offer a simple way to recreate the problem as it involves IBM MQ running on two primary machines (AIX) using the Ubuntu server for it's HA NFSv4 storage.

However, we can provide any requested trace or dumps from any or all of the involved machines.

bugproxy (bugproxy)
tags: added: architecture-x8664 bugnameltc-203937 severity-high targetmilestone-inin2004
Changed in ubuntu:
assignee: nobody → Ubuntu on IBM Power Systems Bug Triage (ubuntu-power-triage)
affects: ubuntu → linux (Ubuntu)
Revision history for this message
Frank Heimes (fheimes) wrote :

Hi, reading the bug description one of the first things that caught my attention is that the kernel (5.4.0-156) seems to be a bit outdated, since the latest (aot) is 5.4.0.166.
Would you mind retrying this with the latest kernel (ideally actually with the latest userspace, for example after having done a 'apt update' and 'apt full-upgrade'), since there will be a difference of hundreds of patches (also upstream stable) between these.

On top I believe it would probably very helpful to have rpcdebug enabled for the NFS Server, like:
rpcdebug -m nfsd -s all
rpcdebug -m nlm -s all
rpcdebug -m rpc -s all

Btw. it would also be interesting to know it this also happens with a bare-metal install of the NFS server, means without having VMware in between (avoiding any potential flaws with VMware virtual network components, like virtual switches.).

I think technically this is not a Launchpad bug for the Ubuntu on IBM Power project, since here Ubuntu runs on amd64 (and on VMware), but we may still try to figure out what's going on.

Revision history for this message
bugproxy (bugproxy) wrote : Comment bridged from LTC Bugzilla

------- Comment From <email address hidden> 2023-11-14 05:51 EDT-------
Ok,
I will arrange for an update/upgrade to the latest kernel in the next week or so and turn on those rpcdebug settings when testing the issue. Thanks

Revision history for this message
bugproxy (bugproxy) wrote : Trace of NFSv4 server

------- Comment on attachment From <email address hidden> 2023-12-06 04:38 EDT-------

AIX NFS client returns ESTALE at approximately 11:17:24.

Revision history for this message
bugproxy (bugproxy) wrote : Comment bridged from LTC Bugzilla

------- Comment From <email address hidden> 2024-01-30 16:28 EDT-------
Is there an update from Canonical please? Can we provide any further information or trace to help diagnose this issue ?

Revision history for this message
Frank Heimes (fheimes) wrote :

I did a screening of the traces, but couldn't really find suspicious entries.
I'm now looking for a someone else's view and opinion...

Revision history for this message
bugproxy (bugproxy) wrote :

------- Comment From <email address hidden> 2024-02-13 16:49 EDT-------
(In reply to comment #14)
> I did a screening of the traces, but couldn't really find suspicious entries.
> I'm now looking for a someone else's view and opinion...

Thanks Frank! I hope you found someone else to help you. Please let us know if y'all find anything of interest or you need additional information from us.

Revision history for this message
bugproxy (bugproxy) wrote :

------- Comment From <email address hidden> 2024-03-11 03:51 EDT-------
Has there been any further progress on this issue from the Linux NFS development team ?

The AIX NFS development team will not progress the problem from their side until they understand why the BAD_SEQID errors has been returned from the Linux NFSv4 server.

Revision history for this message
bugproxy (bugproxy) wrote :

------- Comment From <email address hidden> 2024-03-19 13:53 EDT-------
Is there an update from Canonical please? Can we provide any further information or trace to help diagnose this issue ?

Revision history for this message
GuoqingJiang (guoqingjiang) wrote :

Per below from the trace file

Nov 30 11:13:40 duckseason kernel: [1291756.354728] nfsd_dispatch: vers 4 proc 1
Nov 30 11:13:40 duckseason kernel: [1291756.354731] svc: server 000000007c7e7536, pool 0, transport 000000003fd86d34, inuse=3
Nov 30 11:13:40 duckseason kernel: [1291756.354732] process_renew(6554b87b/4ab45507): starting
Nov 30 11:13:40 duckseason kernel: [1291756.354734] svc: tcp_recv 000000003fd86d34 data 1 conn 0 close 0
Nov 30 11:13:40 duckseason kernel: [1291756.354736] svc: socket 000000003fd86d34 recvfrom(0000000003fecffb, 4) = -11
Nov 30 11:13:40 duckseason kernel: [1291756.354737] RPC: TCP recv_record got -11
Nov 30 11:13:40 duckseason kernel: [1291756.354737] RPC: TCP recvfrom got EAGAIN

we can see NFS server return -11 (EAGAIN), which can be executed from from the path,

svc_recv -> svc_handle_xprt
            -> xprt->xpt_ops->xpo_recvfrom
               svc_tcp_recvfrom
               -> svc_recvfrom
                  -> sock_recvmsg which probably triggers sock_recvmsg_nosec -> ... -> tcp_recvmsg

As mentioned in recvfrom manpage,

ERRORS
       The recvfrom() function shall fail if:
       EAGAIN or EWOULDBLOCK
              The socket's file descriptor is marked O_NONBLOCK and no data is
              waiting to be received; or MSG_OOB is set and no out-of-band
              data is available and either the socket's file descriptor is
              marked O_NONBLOCK or the socket does not support blocking to
              await out-of-band data.

I am not sure if 7.3 NFS client opened non-blocking socket and no data on that socket to be read.
So I would like to check if 7.3 client sent something different compared with 7.2 client which caused server returned BAD_SEQID to AIX 7.3 client.

Please also collect relevant trace log from server side when connecting with 7.2 client, then we can investigate the difference between good one and bad one.

If possible, maybe you can try with the latest 5.4 stable (5.4.274) and upstream version (6.9-rc4).

Revision history for this message
bugproxy (bugproxy) wrote :

------- Comment From <email address hidden> 2024-04-17 05:37 EDT-------
Hi,
I have run the same MQ HA test on a working(AIX 7.2) and non working(AIX 7.3) pair of machines with the same Linux NVSv4 Ubuntu 20.04 (kernel updated to 5.4.0-176-generic). I used the same RPC debug trace options as previously.

Here are the details of the uploaded tar.gz file called CASE_TS012306761.tar.gz. I do have other trace files if you need them.

AIX 7.3 test was between machine amaliada and partner adamsongrunter on 16/04/2024 starting at 14:01 and failing at 14:04:07:971116 with a failure to write a file on the shared NFSv4 server.

There is a partial syslog from the Linux NVSv4 server duckseason covering that period called TS012306761/syslog_16042024_amaliada_primary_adamsongrunter_partner_both_aix73.log

Covering the same period, there are a set of tcpdump files in subfolder TS012306761/tcpdumps_amaliada_primary_admasongrunter_partner_both_aix73/

----------------
AIX 7.2 test was between primary machine adia and partner amberjack on 17/04/2024 starting at 09:09:48 and ending successfully at 09:16:29.

There is a partial syslog from the linux NVSv4 server duckseason covering that period called TS012306761/syslog_17042024_adia_primary_amberjack_partner_both_aix72.log

Covering the same period, there are a set of tcpdump files in subfolder TS012306761/tcpdumps_adia_primary_amberjack_partner_both_aix72/

We are happy to supply further info to help resolve this issue. Thanks for your help and advice.

Revision history for this message
bugproxy (bugproxy) wrote : Syslog and tcpdump file covering a working AIX 7.2 and non-working AIX 7.3 pair of machines using the same NFSv4 server

------- Comment (attachment only) From <email address hidden> 2024-04-17 06:10 EDT-------

Revision history for this message
bugproxy (bugproxy) wrote :

------- Comment (attachment only) From <email address hidden> 2024-04-17 06:11 EDT-------

Revision history for this message
bugproxy (bugproxy) wrote : Syslog for AIX 7.2 primary and partner

------- Comment (attachment only) From <email address hidden> 2024-04-17 06:12 EDT-------

Revision history for this message
bugproxy (bugproxy) wrote : tcpdump for aix 7.3 testing part 1 of 5

------- Comment (attachment only) From <email address hidden> 2024-04-17 06:14 EDT-------

Revision history for this message
bugproxy (bugproxy) wrote : tcpdump for aix 7.3 testing part 2 of 5

------- Comment (attachment only) From <email address hidden> 2024-04-17 06:15 EDT-------

Revision history for this message
bugproxy (bugproxy) wrote : tcpdump for aix 7.3 testing part 3 of 5

------- Comment (attachment only) From <email address hidden> 2024-04-17 06:15 EDT-------

Revision history for this message
bugproxy (bugproxy) wrote : tcpdump for aix 7.3 testing part 4 of 5

------- Comment (attachment only) From <email address hidden> 2024-04-17 06:16 EDT-------

Revision history for this message
bugproxy (bugproxy) wrote : tcpdump for aix 7.3 testing part 5 of 5

------- Comment (attachment only) From <email address hidden> 2024-04-17 06:16 EDT-------

Revision history for this message
bugproxy (bugproxy) wrote : tcpdump for aix 7.2 testing part 1 of 4

------- Comment (attachment only) From <email address hidden> 2024-04-17 06:17 EDT-------

Revision history for this message
bugproxy (bugproxy) wrote : tcpdump for aix 7.2 testing part 2 of 4

------- Comment (attachment only) From <email address hidden> 2024-04-17 06:18 EDT-------

Revision history for this message
bugproxy (bugproxy) wrote : tcpdump for aix 7.2 testing part 3 of 4

------- Comment (attachment only) From <email address hidden> 2024-04-17 06:18 EDT-------

Revision history for this message
bugproxy (bugproxy) wrote : tcpdump for aix 7.2 testing part 4 of 4

------- Comment (attachment only) From <email address hidden> 2024-04-17 06:19 EDT-------

Revision history for this message
bugproxy (bugproxy) wrote : Comment bridged from LTC Bugzilla

------- Comment From <email address hidden> 2024-04-17 13:11 EDT-------
Here are the details of the IP addresses for the tcpdump files

AIX 7.3
9.20.120.112 name = amaliada.v6.hursley.ibm.com -- Primary
9.20.120.153 name = adamsongrunter.v6.hursley.ibm.com ? Partner

AIX 7.2
9.20.120.127 name = adia.v6.hursley.ibm.com -- Primary
9.20.121.46 name = amberjack.v6.hursley.ibm.com ? Partner

NFSv4 server
9.20.32.85 name = duckseason.hursley.ibm.com

Revision history for this message
GuoqingJiang (guoqingjiang) wrote :

Sorry, I can't distinguish which parts of logs in the attachments (#comment11, #comment12 and #comment13) are belong to the connection from working 7.2 and non-working 7.3. All the attachments have "TCP recvfrom got EAGAIN" which should from the connection for 7.3.

$ grep "TCP recvfrom got EAGAIN" syslog_16042024_amaliada_primary_adamsongrunter_partner_both_aix73_part1.log -r|wc -l
213127
$ grep "TCP recvfrom got EAGAIN" syslog_16042024_amaliada_primary_adamsongrunter_partner_both_aix73_part2.log -r|wc -l
226005
$ grep "TCP recvfrom got EAGAIN" syslog_17042024_adia_primary_amberjack_partner_both_aix72.log -r|wc -l
20233

May I suggest to collect those logs in two separated files? One from 7.2 and another from 7.3 instead of mix them together.

Not an network expert, but I see some NFS RENEW ops packets between 9.20.32.85 (server) and 9.20.120.127 (7.2 client) in tcp_dump17_04_2024_09H_10M, but no such RENEW packets for 9.20.32.85 (server) and 9.20.120.112 (7.3 client) in tcpdump16_04_2024_14H_03M. Given NFS4 is a stateful fs which is based on leases, without client send an operation to renew the lease, it is possible for server to return EAGAIN. And please check if 7.3 client is not same as 7.2 client regarding lease renewing.

Revision history for this message
GuoqingJiang (guoqingjiang) wrote :
Revision history for this message
GuoqingJiang (guoqingjiang) wrote :
Revision history for this message
bugproxy (bugproxy) wrote :

------- Comment From <email address hidden> 2024-04-18 11:00 EDT-------
Hi,
Thanks for looking at the logs. I agree that both syslogs (AIX 7.2 and 7.3) contain multiple "TCP recvfrom got EAGAIN" entries.

To clarify the information in the attachments, I have updated the comment for the attachments of the syslog for the non-working AIX 7.3 pair (amaliada and adamsongrunter. The syslog is in two parts because the Bugzilla file uploader couldn't handle such a large file. It is NOT mixed with the AIX 7.2 syslog.

The syslog for the working pair AIX 7.2 (adia and amberjack) was smaller, so it is a single file attachment.

I will ask the IBM AIX support team about the NFS Client lease renewal in both AIX 7.2 and 7.3.

Revision history for this message
bugproxy (bugproxy) wrote :

------- Comment From <email address hidden> 2024-04-24 02:47 EDT-------
Hi,
We have had a reply from the AIX support team (see below) explaining the RENEW requests. Do you have any further questions for the AIX support team or can we supply any further diagnostic information to progress this issue ?

Regarding the comment/question about RENEW reqs from Linux:

RENEW is done when half the lease period has elapsed without any state operation going on.

Here, the Linux server's lease is 90 seconds. From an NFS perspective, there is nothing strange or remarkable about the pattern of RENEW reqs sent in the good & bad cases. If there is a specific question about renews in the latest captures let us know.

Revision history for this message
GuoqingJiang (guoqingjiang) wrote :

Per comment#23, the ip from AIX 7.2 client are:

9.20.120.127 name = adia.v6.hursley.ibm.com -- Primary
9.20.121.46 name = amberjack.v6.hursley.ibm.com ? Partner

And I searched the trace again with above ips, looks socket 00000000cc6f0db2 is created between 9.20.120.127 and nfs server, however it can also return EAGAIN.

duckseason kernel: [13254.724411] svc: socket 00000000cc6f0db2 sendto([000000008485f39d 72... ], 72) = 72 (addr 9.20.120.127, port=1022)
...
duckseason kernel: [13254.724734] svc: socket 00000000cc6f0db2(inet 00000000c831762e), busy=0
duckseason kernel: [13254.724759] svc: server 00000000728e82a2, pool 0, transport 00000000cc6f0db2, inuse=2
duckseason kernel: [13254.724761] svc: tcp_recv 00000000cc6f0db2 data 1 conn 0 close 0
duckseason kernel: [13254.724765] svc: socket 00000000cc6f0db2 recvfrom(00000000b6708704, 4) = 4
duckseason kernel: [13254.724766] svc: TCP record, 168 bytes
duckseason kernel: [13254.724769] svc: socket 00000000cc6f0db2 recvfrom(0000000057dbced3, 4096) = 168
duckseason kernel: [13254.724771] svc: TCP final record (168 bytes)
duckseason kernel: [13254.724775] svc: svc_authenticate (1)
duckseason kernel: [13254.724779] svc: server 00000000ee62a401, pool 0, transport 00000000cc6f0db2, inuse=3
duckseason kernel: [13254.724780] svc: tcp_recv 00000000cc6f0db2 data 1 conn 0 close 0
duckseason kernel: [13254.724783] svc: socket 00000000cc6f0db2 recvfrom(00000000b6708704, 4) = -11

And it is same for socket 000000003497acd5 which is used between 9.20.121.46 and nfs server.

duckseason kernel: [13254.802249] svc: socket 000000003497acd5 sendto([0000000086e5a045 72... ], 72) = 72 (addr 9.20.121.46, port=1020)
...
duckseason kernel: [13254.802533] svc: socket 000000003497acd5(inet 0000000072c9551d), busy=0
duckseason kernel: [13254.802571] svc: server 00000000728e82a2, pool 0, transport 000000003497acd5, inuse=2
duckseason kernel: [13254.802573] svc: tcp_recv 000000003497acd5 data 1 conn 0 close 0
duckseason kernel: [13254.802578] svc: socket 000000003497acd5 recvfrom(0000000077f9cf7c, 4) = 4
duckseason kernel: [13254.802579] svc: TCP record, 164 bytes
duckseason kernel: [13254.802583] svc: socket 000000003497acd5 recvfrom(0000000057dbced3, 4096) = 164
duckseason kernel: [13254.802585] svc: TCP final record (164 bytes)
duckseason kernel: [13254.802590] svc: svc_authenticate (1)
duckseason kernel: [13254.802596] svc: server 00000000ee62a401, pool 0, transport 000000003497acd5, inuse=3
duckseason kernel: [13254.802597] svc: tcp_recv 000000003497acd5 data 1 conn 0 close 0
duckseason kernel: [13254.802599] svc: socket 000000003497acd5 recvfrom(0000000077f9cf7c, 4) = -11

But since aix 7.2 client can work with the same server according to bug description, I am curious why 7.2 client also return EAGAIN which is same as 7.3 client, what am I missing?

Some questions/suggestion:

1. Did aix 7.3 nfs client work with previous kernel? If so, run "git bisect" to find which commit caused the issue.
2. Is it possible to try with latest 5.4 stable kernel as suggested in comment#1? Also try latest upstream kernel (6.9-rc5 at this time) as well.
3. Does increase lease time make difference?

To post a comment you must log in.