juju shouldn't let txn-queues grow out of control

Bug #1778907 reported by Junien F
This bug affects 1 person
Affects         Status        Importance  Assigned to    Milestone
Canonical Juju  Fix Released  High        John A Meinel
2.3             Fix Released  High        John A Meinel

Bug Description

Hi,

We got hit by a bug today where the txn-resumer was unable to resume a transaction (because of a bug in the code).

In less than 2 days, the txn-queue for a few documents grew to over 25k. We thankfully caught the problem relatively fast and were able to deal with it, but agents now need to process these 25k txns.

If an operation fails, juju shouldn't let txn-queues grow out of control because of constant retries.

Thanks

Junien F (axino)
summary: - juju shouldn't let txn-queues growing out of control
+ juju shouldn't let txn-queues grow out of control
John A Meinel (jameinel) wrote : Re: [Bug 1778907] [NEW] juju shouldn't let txn-queues growing out of control

What version of Juju was running? We have put some changes in place to
avoid letting the Queue grow too large. I believe we had a release bug
where early 2.3 releases weren't getting all of the patches for our
dependencies.

John
=:->

On Wed, Jun 27, 2018, 16:30 Junien Fridrick <email address hidden>
wrote:

> Public bug reported:
>
> Hi,
>
> We got hit by a bug today where the txn-resumer was unable to resume a
> transaction (because of a bug in the code).
>
> In less than 2 days, the txn-queue for a few documents grew to over 25k.
> We thankfully caught the problem relatively fast and were able to deal
> with it, but agents now need to process these 25k txns.
>
> If an operation fails, juju shouldn't let txn-queues grow out of control
> because of constant retries.
>
> Thanks

Junien F (axino) wrote :

Ah yes sorry, this was a 2.3.7 controller.

John A Meinel (jameinel) wrote : Re: [Bug 1778907] Re: juju shouldn't let txn-queues grow out of control

So we do have this patch:
https://github.com/juju/juju/blob/develop/patches/max_txn_queue_length_pr463.diff

which in theory enforces MaxTxnQueueLength (default of 1000) by
immediately removing a txn if it would be the 1001st txn in a queue.
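
For illustration only, the capping behaviour described above can be sketched in Go. This is not the code from the patch: `enqueue`, `errQueueFull` and the string tokens are hypothetical names made up for this sketch, and only the limit of 1000 comes from the comment above. The idea is that once a queue already holds the maximum number of entries, a further txn is rejected instead of being appended.

```go
package main

import (
	"errors"
	"fmt"
)

// maxTxnQueueLength mirrors the default of 1000 mentioned in the thread.
const maxTxnQueueLength = 1000

// errQueueFull is a hypothetical error used only in this sketch.
var errQueueFull = errors.New("txn-queue is full")

// enqueue appends token to queue unless the queue already holds maxLen
// entries, in which case the queue is returned unchanged with an error.
func enqueue(queue []string, token string, maxLen int) ([]string, error) {
	if len(queue) >= maxLen {
		return queue, errQueueFull
	}
	return append(queue, token), nil
}

func main() {
	var queue []string
	rejected := 0
	// Simulate 25k retries piling up against a single document.
	for i := 0; i < 25000; i++ {
		var err error
		queue, err = enqueue(queue, fmt.Sprintf("txn-%d", i), maxTxnQueueLength)
		if err != nil {
			rejected++
		}
	}
	fmt.Println(len(queue), rejected) // prints: 1000 24000
}
```

With a cap like this, a failing operation that keeps retrying is bounded at 1000 queued txns per document rather than the 25k seen in the report.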

However, looking at the source for 2.3.7 and 2.3.8 I don't see that patch
in the source tarball:
https://launchpad.net/juju/+download

I *do* see this patch having been applied:

https://github.com/juju/juju/blob/develop/patches/mgo_server_abended_255.diff

I do see the max_txn_queue_length patch applied in the 2.4 series.

And now that I've dug into it, I remember there was a bug that the name of
the file was ".patch" instead of ".diff" in the 2.3 branch, so it wasn't
getting applied.
It looks like that still hasn't been fixed in the 2.3 series, so I'll go do
that now.
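
A minimal sketch of how a suffix mismatch can silently skip a patch, assuming the build step only picks up files whose names end in ".diff" (the thread does not show the actual build script). `splitBySuffix` and the ".patch" file name below are hypothetical; the two ".diff" names are taken from the links above.

```go
package main

import (
	"fmt"
	"path/filepath"
)

// splitBySuffix separates file names into those that a "*.diff"-only
// patch step would apply and those it would skip without any error.
func splitBySuffix(names []string) (applied, skipped []string) {
	for _, name := range names {
		if filepath.Ext(name) == ".diff" {
			applied = append(applied, name)
		} else {
			skipped = append(skipped, name)
		}
	}
	return applied, skipped
}

func main() {
	names := []string{
		"mgo_server_abended_255.diff",
		// Hypothetical name: the same patch saved with the wrong suffix.
		"max_txn_queue_length_pr463.patch",
	}
	applied, skipped := splitBySuffix(names)
	fmt.Println("applied:", applied)
	fmt.Println("skipped:", skipped)
}
```

Because nothing fails when a file is skipped, a check that reports any non-matching file in the patches directory would surface this kind of mistake early.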

On Thu, Jun 28, 2018 at 9:08 AM, Junien Fridrick <email address hidden> wrote:

> Ah yes sorry, this was a 2.3.7 controller.

John A Meinel (jameinel) wrote :

This should already be in all of the 2.4 releases, so I'm marking it fixed there.

Changed in juju:
assignee: nobody → John A Meinel (jameinel)
importance: Undecided → High
status: New → Fix Released