Comment 6 for bug 979227

Khai Do (zaro0508) wrote :

I tested the utf8 conversion using a MySQL dump of the review.o.o data. The existing tables use the 'latin1_general_cs' collation. This is a case-sensitive collation that would convert cleanly to utf8_general_cs, but utf8_general_cs is NOT available in MySQL (http://bugs.mysql.com/bug.php?id=65830). Converting to utf8_general_ci will not work because some tables contain values that differ only by case, so re-inserting the data into the database fails with duplicate entry errors. The other option is to convert to the utf8_bin collation, which converts without errors. However, an ORDER BY clause under utf8_bin returns rows in an unnatural order (https://stackoverflow.com/questions/5526334/what-effects-does-using-a-binary-collation-have), so I'm not sure whether it's safe to use utf8_bin.
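
To illustrate the two problems, here is a rough Python sketch, not something run against the actual review.o.o data. It uses str.lower() as a stand-in for a case-insensitive collation (only an approximation, since utf8_general_ci also ignores accents) and sorts by raw UTF-8 bytes as a stand-in for utf8_bin. The function names and the sample values are made up for illustration.

```python
from collections import defaultdict


def ci_collisions(values):
    """Group values that become equal once case is ignored.

    Assumption: str.lower() approximates a case-insensitive collation;
    the real utf8_general_ci rules are broader (e.g. accents).
    """
    groups = defaultdict(list)
    for v in values:
        groups[v.lower()].append(v)
    # Only groups with more than one member would violate a unique key.
    return [g for g in groups.values() if len(g) > 1]


def binary_order(values):
    """Sort by raw UTF-8 bytes, roughly what ORDER BY does under a binary collation."""
    return sorted(values, key=lambda v: v.encode("utf-8"))


def case_insensitive_order(values):
    """Sort ignoring case, falling back to the raw value to break ties."""
    return sorted(values, key=lambda v: (v.lower(), v))


# Hypothetical sample values, not taken from the dump.
names = ["alice", "Bob", "Alice", "carol"]

print(ci_collisions(names))           # [['alice', 'Alice']]
print(binary_order(names))            # ['Alice', 'Bob', 'alice', 'carol']
print(case_insensitive_order(names))  # ['Alice', 'alice', 'Bob', 'carol']
```

The first result shows the kind of value pair that would trigger a duplicate entry error on a unique column under a case-insensitive collation. The second shows how byte-wise ordering places every uppercase name ahead of every lowercase one.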