Potential race condition in oils_ctl.sh and restart_sip command
Affects | Status | Importance | Assigned to | Milestone | |
---|---|---|---|---|---|
Evergreen |
New
|
Undecided
|
Unassigned |
Bug Description
We run a oils_ctl.sh every morning at 2:05 am to restart our SIPServer processes. This serves as a way to make sure that new accounts and configuration changes are picked up in a timely manner without disrupting library service.
There appears to have been a race condition in the restart on one of our servers this morning. Below are log entries that show the SIPServer handling a request at 2:04:59, the cron job running at 2:05:01, the SIPServer supposedly shutting down, and the new process failing to start because the port is still in use. I am not certain how much of a role the request at 2:04:59 plays in this, but I suspect that it does play a role because this sequence of events almost never happens.
Oct 2 02:04:59 sip3 SIPServer.pm: read_SIP_packet, INPUT MSG: '9900002.00'
Oct 2 02:04:59 sip3 SIPServer.pm: read_SIP_packet, INPUT MSG: '9900002.00'
Oct 2 02:04:59 sip3 SIPServer.pm: INPUT MSG: '9900002.00'
Oct 2 02:04:59 sip3 SIPServer.pm: Sip::MsgType:
Oct 2 02:04:59 sip3 SIPServer.pm: Sip::MsgType:
Oct 2 02:04:59 sip3 SIPServer.pm: Sip::MsgType:
Oct 2 02:04:59 sip3 SIPServer.pm: OUTPUT MSG: '98YYYYYN000003
Oct 2 02:04:59 sip3 SIPServer.pm: SIP processing duration 0.000 : 9900002.00
Oct 2 02:05:01 sip3 CRON[23371]: (opensrf) CMD ($EG_BIN_
Oct 2 02:05:01 sip3 acs-server[10418]: 2021/10/02-02:05:01 Server closing!
Oct 2 02:05:02 sip3 acs-server[23377]: 2021/10/02-02:05:02 SIPServer (type Net::Server:
Oct 2 02:05:02 sip3 acs-server[23377]: Resolved [*]:6001 to [::]:6001, IPv6
Oct 2 02:05:02 sip3 acs-server[23377]: Not including resolved host [0.0.0.0] IPv4 because it will be handled by [::] IPv6
Oct 2 02:05:02 sip3 acs-server[23377]: Binding to TCP port 6001 on host :: with IPv6
Oct 2 02:05:02 sip3 acs-server[23377]: 2021/10/02-02:05:02 Can't connect to TCP port 6001 on :: [Address already in use]#012 at line 64 in file /usr/share/
Oct 2 02:05:02 sip3 acs-server[23377]: 2021/10/02-02:05:02 Server closing!
summary: |
- Potential race condition in oils_ctl.sh and restar_sip command + Potential race condition in oils_ctl.sh and restart_sip command |
description: | updated |
tags: | added: performance sip |