Comment 7 for bug 788590

Revision history for this message
Matt Giuca (mgiuca) wrote :

Sorry I didn't reply to this until now.

Transactions *are* *the* (only) way to work around issues where race conditions are causing things to go wrong where they would be fine if executed in sequence. Any time you see a pattern like this:
1. Check if X exists,
2. If not, create X and write it to the datastore.

you know you need transactions, because otherwise, the following situation can always arise:

1. Thread 1 checks if X exists -- it does not,
2. Thread 2 checks if X exists -- it does not,
3. Thread 1 creates a new object X, and writes it to the datastore,
4. Thread 2 creates a new object X, and writes it to the datastore.

This results in duplicate objects called X. No amount of clever code can prevent this from happening, because any two lines of code you write can have another thread come along and do something in the middle. Only transactions, being a database primitive, can prevent this.

> Isn't transactions used in cases such as u want multiple operations
> happening OR none at all.
> This means, that if u have five operations you want happening and and the
> third fails, the transaction would roll back the first succeeded operations
> and throw and error.

Yes, that's what a transaction does. But having multiple operations succeed or fail together is exactly the property that we are interested in here. The way to do the above operation correctly, in a transaction is:

1. Begin a transaction,
2. Check if X exists,
3. If not, create X and write it to the datastore.
4. Commit the transaction.

This transaction has two operations (check and create), and either both of them will succeed, or none of them will (and we will have a ConcurrentModificationException). Now the above sequence looks like this:

1. Thread 1 begins a transaction,
2. Thread 1 checks if X exists -- it does not,
3. Thread 2 begins a transaction,
4. Thread 2 checks if X exists -- it does not,
5. Thread 1 creates a new object X, and writes it to the datastore,
6. Thread 2 creates a new object X, and writes it to the datastore,
7. Thread 1 commits the transaction, and succeeds,
8. Thread 2 commits the transaction and fails with a ConcurrentModificationException.

Now this was bad for thread 2 (the user will see an error and may have to refresh). But importantly, the database integrity is intact. We usually don't want to expose the user to that error, so the last thing we can do in the transaction is catch that exception, and then retry the whole transaction. We repeat it a certain number of times (say, 3), and this is usually enough. In that case, after step 8 there, we will have Thread 2 go and do the whole process again, but this time succeed. If we retry 3 times and still can't do it, we show the user the exception and accept that this will happen very rarely. It is OK for the user to see the occasional Internal Server Error, but it is not OK to put the datastore into an inconsistent state.

So, that's what we *should* be doing basically everywhere in the program. Unfortunately, as Scott found out, App Engine transactions are quite limited and often don't work. I've been (coincidentally, on another project) learning a lot about App Engine transactions this weekend. I still haven't quite gotten my head around the rules, but they have a thing called an "entity group", which we aren't using at the moment. You can only typically apply transactions to entities which are in an entity group. So we won't be able to easily integrate transactions into MUGLE. It would likely require a datastore redesign, which is a major thing, especially now that we have launched (though not impossible).

Having said that, there is another, less correct but better than what we have now, way of working around it. If you make an object's primary key include everything that uniquely identifies that object, then by definition, you can't have duplicates. For example, if the UGP key was actually a string concatenating "userid:gameid", then it would be architecturally impossible to have two UGPs for the same user/game combination. That doesn't make it right however. Without transactions, this could still be possible.

1. Thread 1 checks if X exists -- it does not,
2. Thread 2 checks if X exists -- it does not,
3. Thread 1 creates a new object X, setting property foo and writing it to the datastore,
4. Thread 2 creates a new object X, setting property bar and writing it to the datastore.

At the end, there are no duplicate Xs, but there is still a problem: the property "foo" is missing. What *should* have happened, if the threads executed in sequence, was that Thread 2 picked up the existing X (with "foo" set), and wrote "bar". However, because Thread 2 didn't realise it existed, it overwrote it with a new object that did not contain "foo". Transactions are the only way to fix this.