Comment 9 for bug 788590

Revision history for this message
Scott Ritchie (sritchie) wrote : Re: [Bug 788590] Re: Lots of UserGameProfiles created simultaneously

I tried using transactions back when we had owned relationships just
on the use class and they didn't work.

On 30/05/2011, at 5:29 PM, Matt Giuca <email address hidden> wrote:

> Sorry I didn't reply to this until now.
>
> Transactions *are* *the* (only) way to work around issues where race
> conditions are causing things to go wrong where they would be fine
> if executed in sequence. Any time you see a pattern like this:
> 1. Check if X exists,
> 2. If not, create X and write it to the datastore.
>
> you know you need transactions, because otherwise, the following
> situation can always arise:
>
> 1. Thread 1 checks if X exists -- it does not,
> 2. Thread 2 checks if X exists -- it does not,
> 3. Thread 1 creates a new object X, and writes it to the datastore,
> 4. Thread 2 creates a new object X, and writes it to the datastore.
>
> This results in duplicate objects called X. No amount of clever code
> can
> prevent this from happening, because any two lines of code you write
> can
> have another thread come along and do something in the middle. Only
> transactions, being a database primitive, can prevent this.
>
>> Isn't transactions used in cases such as u want multiple operations
>> happening OR none at all.
>> This means, that if u have five operations you want happening and
>> and the
>> third fails, the transaction would roll back the first succeeded
>> operations
>> and throw and error.
>
> Yes, that's what a transaction does. But having multiple operations
> succeed or fail together is exactly the property that we are
> interested
> in here. The way to do the above operation correctly, in a transaction
> is:
>
> 1. Begin a transaction,
> 2. Check if X exists,
> 3. If not, create X and write it to the datastore.
> 4. Commit the transaction.
>
> This transaction has two operations (check and create), and either
> both
> of them will succeed, or none of them will (and we will have a
> ConcurrentModificationException). Now the above sequence looks like
> this:
>
> 1. Thread 1 begins a transaction,
> 2. Thread 1 checks if X exists -- it does not,
> 3. Thread 2 begins a transaction,
> 4. Thread 2 checks if X exists -- it does not,
> 5. Thread 1 creates a new object X, and writes it to the datastore,
> 6. Thread 2 creates a new object X, and writes it to the datastore,
> 7. Thread 1 commits the transaction, and succeeds,
> 8. Thread 2 commits the transaction and fails with a
> ConcurrentModificationException.
>
> Now this was bad for thread 2 (the user will see an error and may have
> to refresh). But importantly, the database integrity is intact. We
> usually don't want to expose the user to that error, so the last thing
> we can do in the transaction is catch that exception, and then retry
> the
> whole transaction. We repeat it a certain number of times (say, 3),
> and
> this is usually enough. In that case, after step 8 there, we will have
> Thread 2 go and do the whole process again, but this time succeed.
> If we
> retry 3 times and still can't do it, we show the user the exception
> and
> accept that this will happen very rarely. It is OK for the user to see
> the occasional Internal Server Error, but it is not OK to put the
> datastore into an inconsistent state.
>
> So, that's what we *should* be doing basically everywhere in the
> program. Unfortunately, as Scott found out, App Engine transactions
> are
> quite limited and often don't work. I've been (coincidentally, on
> another project) learning a lot about App Engine transactions this
> weekend. I still haven't quite gotten my head around the rules, but
> they
> have a thing called an "entity group", which we aren't using at the
> moment. You can only typically apply transactions to entities which
> are
> in an entity group. So we won't be able to easily integrate
> transactions
> into MUGLE. It would likely require a datastore redesign, which is a
> major thing, especially now that we have launched (though not
> impossible).
>
> Having said that, there is another, less correct but better than
> what we
> have now, way of working around it. If you make an object's primary
> key
> include everything that uniquely identifies that object, then by
> definition, you can't have duplicates. For example, if the UGP key was
> actually a string concatenating "userid:gameid", then it would be
> architecturally impossible to have two UGPs for the same user/game
> combination. That doesn't make it right however. Without transactions,
> this could still be possible.
>
> 1. Thread 1 checks if X exists -- it does not,
> 2. Thread 2 checks if X exists -- it does not,
> 3. Thread 1 creates a new object X, setting property foo and writing
> it to the datastore,
> 4. Thread 2 creates a new object X, setting property bar and writing
> it to the datastore.
>
> At the end, there are no duplicate Xs, but there is still a problem:
> the
> property "foo" is missing. What *should* have happened, if the threads
> executed in sequence, was that Thread 2 picked up the existing X (with
> "foo" set), and wrote "bar". However, because Thread 2 didn't
> realise it
> existed, it overwrote it with a new object that did not contain "foo".
> Transactions are the only way to fix this.
>
> --
> You received this bug notification because you are subscribed to
> MUGLE.
> https://bugs.launchpad.net/bugs/788590
>
> Title:
> Lots of UserGameProfiles created simultaneously
>
> Status in Mars Programming Language:
> Invalid
> Status in Melbourne University Game-based Learning Environment:
> Triaged
>
> Bug description:
> For every "real" user game profile created, there are about 10
> completely blank ones, without even a user.