Autorenewal Events Get Stuck in collected and reacting states
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
Evergreen | Confirmed | Undecided | Unassigned |
3.5 | Won't Fix | Undecided | Unassigned |
3.6 | Won't Fix | Undecided | Unassigned |
3.7 | Won't Fix | Undecided | Unassigned |
Bug Description
Evergreen 3.5.3 (and 3.2.10)
OpenSRF 3.2.1
PostgreSQL 9.6
We occasionally have autorenewal events get stuck in the collected or reacting state. It is intermittent and appears to be random. I'm attaching the results of a query that counts, for the past 6 months, the autorenewal events by date and state where the state was not complete, error, or invalid. You can see that it doesn't happen all that often.
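For anyone who wants to run a similar check, here is a minimal sketch of that kind of count. It is not the attached query. It runs against an in-memory SQLite table so that it is self-contained; the `event` table, its `reactor`, `state`, and `run_time` columns, and the `Circ::AutoRenew` reactor name are simplifying assumptions for illustration, so adapt the names to your actual action trigger schema before trusting the numbers.

```python
import sqlite3

# States the report treats as finished; anything else is counted as unfinished.
FINISHED_STATES = ("complete", "error", "invalid")

# Hypothetical, simplified stand-in for the action trigger event table.
SCHEMA = """
CREATE TABLE event (
    id INTEGER PRIMARY KEY,
    reactor TEXT NOT NULL,
    state TEXT NOT NULL,
    run_time TEXT NOT NULL  -- ISO 8601 timestamp, e.g. '2021-01-05 03:00:00'
);
"""

# Count unfinished events per day and state for one reactor since a cutoff.
QUERY = """
SELECT date(run_time), state, count(*)
FROM event
WHERE reactor = ?
  AND state NOT IN (?, ?, ?)
  AND run_time >= ?
GROUP BY date(run_time), state
ORDER BY 1, 2;
"""


def make_demo_db(rows):
    """Build an in-memory database holding (reactor, state, run_time) rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.executemany(
        "INSERT INTO event (reactor, state, run_time) VALUES (?, ?, ?)", rows
    )
    return conn


def count_unfinished(conn, reactor, since):
    """Return (date, state, count) tuples for events not in a finished state."""
    return conn.execute(QUERY, (reactor, *FINISHED_STATES, since)).fetchall()
```

Calling `count_unfinished(conn, "Circ::AutoRenew", "2020-07-01")` on such a database returns one row per date and state, which makes the occasional stuck collected or reacting events easy to spot.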
I have encountered issues with ejabberd's max_stanza_size and these events in the past. Our max_stanza_size is set to 10MB on the utility server that runs the action trigger runner. (I think I've mentioned action triggers in the chunking/bundling bug.)
Here is some IRC discussion about this issue that was the impetus for me opening this bug:
Changed in evergreen:
status: New → Confirmed
Changed in evergreen:
assignee: nobody → Jason Stephenson (jstephenson)
Changed in evergreen:
assignee: Jason Stephenson (jstephenson) → nobody
We see this issue in 3.4.4. Currently, we have a workaround that checks for these failures and addresses them. It would be nice not to need a workaround.