Add utf8mb4 charset hint to database documentation#315

Open
da2x wants to merge 1 commit into jacobwb:master from da2x:patch-2

da2x commented Dec 2, 2021

Contributor

utf8 is an alias for utf8mb3 in MySQL and MariaDB.
Some emoji need 4 bytes in UTF-8, which utf8mb3 cannot store, so recommend utf8mb4.
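
To illustrate the byte counts behind this recommendation, here is a minimal sketch. Python is used only for illustration and is not part of this patch; the sample characters and the `utf8_length` helper are hypothetical names chosen for the example.

```python
# Count how many bytes UTF-8 needs for a few sample characters.
# The samples and the helper name are illustrative, not from the project.
samples = {
    "A": "A",                # U+0041, ASCII
    "e-acute": "\u00e9",     # U+00E9, Latin-1 Supplement
    "euro": "\u20ac",        # U+20AC, inside the Basic Multilingual Plane
    "grinning": "\U0001F600",  # U+1F600, outside the Basic Multilingual Plane
}


def utf8_length(ch: str) -> int:
    """Return the number of bytes the character occupies when encoded as UTF-8."""
    return len(ch.encode("utf-8"))


byte_lengths = {name: utf8_length(ch) for name, ch in samples.items()}

for name, n in byte_lengths.items():
    print(f"{name}: {n} byte(s)")
```

Any character that reports 4 bytes here would not fit in a column limited to 3 bytes per character.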

jacobwb commented Dec 2, 2021

Owner

Is there any reason not to also use utf8mb4 as the default in secrets.php?
I would like to support all emoji by default, unless there's a good reason not to.

da2x commented Dec 2, 2021

Contributor Author

SQLite, PostgreSQL, and others handle the full 1–4 bytes per character under utf8, as per the Unicode standard. MySQL wanted to save RAM back in the day and settled on utf8 meaning at most 3 bytes per character instead, which is why you need to specify utf8mb4 to get full Unicode support. MariaDB inherited this legacy from MySQL. The other database defaults in the secrets file are for SQLite.
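
As a rough illustration of the 3-byte limit, the sketch below checks whether a string would fit in a utf8mb3 column. It assumes only that utf8mb3 is limited to code points up to U+FFFF (the Basic Multilingual Plane). The function name `fits_in_utf8mb3` is hypothetical, and Python is used only for illustration; it is not code from this project.

```python
# Sketch: decide whether text can be stored in a 3-byte-per-character
# charset such as utf8mb3. Assumption: utf8mb3 covers only code points
# up to U+FFFF, which are exactly the ones UTF-8 encodes in 1 to 3 bytes.
BMP_MAX = 0xFFFF


def fits_in_utf8mb3(text: str) -> bool:
    """Return True if every character in text is within the BMP."""
    return all(ord(ch) <= BMP_MAX for ch in text)


plain = "caf\u00e9 costs 3 \u20ac"       # accented letter and euro sign
with_emoji = "thanks \U0001F600"          # contains a 4-byte emoji

print(fits_in_utf8mb3(plain))
print(fits_in_utf8mb3(with_emoji))
```

A check like this only shows which input is affected; the fix on MySQL/MariaDB is still to declare utf8mb4.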

So, yeah. Do you want to default to the MySQL legacy workaround, or to the databases that have followed the Unicode standard without introducing issues for their users? The ambiguity is why I put it in the documentation. It's a common issue, and you might end up breaking 4-byte emoji. But that's kind of what you get when choosing MySQL/MariaDB.
