mysql special characters. utf8mb4 collation

Bug #1714868 reported by Cecilia Vela Gurovic on 2017-09-04
14
This bug affects 2 people
Affects Status Importance Assigned to Milestone
Mahara
Wishlist
Unassigned

Bug Description

Bug reported in forum post
https://mahara.org/interaction/forum/topic.php?id=8034&offset=0&limit=10#post32267

MySQL database as we have in Mahara does not support special characters.

In master branch and mysql DB, I added a text block to a page with text
 "Some characters, like '𝑅' or emoticons '😊', produce the error."

when displaying the page I had this error:
 [WAR] 72 (blocktype/lib.php:685) unserialize(): Error at offset 18 of 48 bytes
 (complete error log in http://paste.wgtn.cat-it.co.nz/15069e#JTBigIqmZagCFwOVF58qww)

in the DB block_instance table, the data it is trying to unserialize
"a:5:{s:4:"text";s:74:"<p>Some characters, like '"
looks like the it ets cut off on the special character position when is saved in the DB

Moodle has a fix for same problem:

"UTF-8 is a character encoding that most websites use. It encodes each of the 1,112,064 valid code points. To store all of this information, four bytes is required. The most popular values are in the three byte region. MySQL by default only uses a three byte encoding and so values in the four byte range (eg. Asian characters and Emojis) can not be stored.
MySQL does provide full four byte UTF-8 support, but it requires certain database settings to be configured."

https://docs.moodle.org/31/en/MySQL_full_unicode_support#Steps_to_upgrade
https://github.com/moodle/moodle/blob/master/admin/cli/mysql_collation.php

I tried using this approach by installing Mahara with utf8mb4

create database `mahara-master` character set utf8mb4 collate utf8mb4_unicode_ci

If we create the database with this char set, we wont be able to install Mahara, as it only supports utf8. Even if we remove this restriction, we will have this error:

mysqli error: [1071: Specified key was too long; max key length is 3072 bytes]

could be similar to this bug: https://bugs.launchpad.net/mahara/+bug/547437

Robert Lyon (robertl-9) on 2017-10-18
Changed in mahara:
milestone: none → 18.04.0
Robert Lyon (robertl-9) wrote :

I've started a patch for this - patch 8160 - and so far just allows for install of Mahara without error when using mysql created by:

 create database `maharadb` DEFAULT CHARACTER SET utf8mb4 DEFAULT COLLATE utf8mb4_unicode_ci;

The mysqli error "[1071: Specified key was too long; max key length is 3072 bytes]" was due to trying to create the 'client_connections_institution' table with a unique index for 4 varchar fields.

But key length would be (255 chars * 4bits) * 4 columns) = 4080 bytes and so too big
- Currently is (255 chars * 3bits) * 4 columns) = 3060 bytes so just fits

But as we do checks in PHP to make sure that every 'name' is unique we don't need the unique index on 4 fields, just the 'name' field.

Changed in mahara:
status: New → In Progress
importance: Undecided → Medium

Reviewed: https://reviews.mahara.org/8160
Committed: https://git.mahara.org/mahara/mahara/commit/5c8a16c517ace72b0d7587bb4b4b938fea998dcc
Submitter: Robert Lyon (<email address hidden>)
Branch: master

commit 5c8a16c517ace72b0d7587bb4b4b938fea998dcc
Author: Robert Lyon <email address hidden>
Date: Thu Oct 19 09:50:41 2017 +1300

Bug 1714868: Allow mysql to be utf8mb4

The changes should allow ithe install of either a utf8 or utf8mb4 site

behatnotneeded

Change-Id: I23479c50f2dc2ac197f244cf723a60e79f88c561
Signed-off-by: Robert Lyon <email address hidden>

Robert Lyon (robertl-9) on 2017-12-14
Changed in mahara:
status: In Progress → Fix Committed
tags: added: nominatedfeature
Changed in mahara:
importance: Medium → Wishlist
Robert Lyon (robertl-9) on 2018-04-05
Changed in mahara:
status: Fix Committed → Fix Released
To post a comment you must log in.
This report contains Public information  Edit
Everyone can see this information.

Other bug subscribers