Comment 10 for bug 2034701

Hajo Locke (hajo-locke) wrote :

I found a workaround, but I am still convinced that this is some kind of bug.
I think I should explain our typical system setup for better understanding.

The typical field of application is failover systems. Overall we run very little software, and the systems have minimal load.
We have 2 servers in a cluster realised with pacemaker/corosync. They manage a haproxy resource and a floating IP. We do this on different Ubuntu releases, from 18.04 through 20.04 to 22.04.

Our systems are joined to Windows Active Directory with SSSD (System Security Services Daemon) (https://schroeffu.ch/2019/09/linux-active-directory-ldap-ssh-login-mit-sssd-und-realmd/), so it is possible to log in with our AD credentials.

The last component is the Altiris server management suite agent (formerly Symantec, now Broadcom), which runs with root privileges and helps to manage our computer landscape. And this is where I located the problem.

Every evening the agent runs a bash script which I wrote 3 years ago. It is a small script of 90 lines: it collects some data, mounts a Windows file share, and finally uploads some small files before unmounting the share. Nothing special, it takes around 5 seconds to complete, but this seems to be where the problem is.
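
For readers who want a picture of what a job of this shape looks like, here is a minimal sketch of the collect / mount / upload / unmount sequence. It is not the actual 90-line script. The share name, mount point, credentials file, the collected fields and the DRY_RUN switch are all hypothetical placeholders, and the use of mount -t cifs is an assumption about how the Windows share is mounted.

```shell
#!/usr/bin/env bash
# Illustrative sketch only, NOT the real script from this report.
# Every name below (share, mount point, credentials file) is a hypothetical placeholder.

SHARE="${SHARE:-//fileserver.example.com/reports}"   # hypothetical Windows file share
MOUNTPOINT="${MOUNTPOINT:-/mnt/report-share}"         # hypothetical mount point
CREDS="${CREDS:-/root/.smbcredentials}"               # hypothetical credentials file
DRY_RUN="${DRY_RUN:-1}"                               # 1 = only print the commands

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        printf 'DRY-RUN: %s\n' "$*"
    else
        "$@"
    fi
}

# Collect a few harmless facts about the host (stand-in for the real data).
collect_data() {
    printf 'host=%s\n' "$(uname -n)"
    printf 'kernel=%s\n' "$(uname -r)"
}

nightly_job() {
    local tmpfile rc
    tmpfile="$(mktemp)" || return 1
    collect_data > "$tmpfile"

    run mkdir -p "$MOUNTPOINT"
    # Assumption: the share is mounted via the kernel CIFS client.
    if ! run mount -t cifs "$SHARE" "$MOUNTPOINT" -o "credentials=$CREDS"; then
        rm -f "$tmpfile"
        return 1
    fi

    run cp "$tmpfile" "$MOUNTPOINT/"
    rc=$?
    # Always try to unmount, even if the upload failed.
    run umount "$MOUNTPOINT"
    rm -f "$tmpfile"
    return "$rc"
}

nightly_job
```

With the default DRY_RUN=1 the sketch only prints the mkdir, mount, cp and umount commands it would run, so it can be tried without root privileges or a real share. The point of the sketch is only that the job briefly mounts and unmounts a CIFS share once per run.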

As far as I can see, every affected server shows in syslog the lines about a kernel bug that I uploaded on 2023-09-11 (#4). In some cases something unexpected happens and triggers this bug. This has been happening since 5.15.0-83-generic.
The system gets unstable: high load without running processes, and every command takes forever to complete. Mostly we had to do a VMware hard stop, because even "reboot -f" failed. I have already uploaded some logs.
I deactivated this job and the problems disappeared. I was not able to trigger the problem by running the script manually. While the job was active, every morning we had a bunch of servers in this state between life and death.
So I cannot confirm a change on our side; I still think this is a newly introduced bug.
I would like to hear from you, please tell me your opinion on this case. It is strange that I report a bug with a documented kernel error and no one gets back to me.

Thanks,
Hajo