BINARY/BLOB should usually be used instead of CHAR utf8; this stores the bytes without any checking. Alas, MySQL's collations treat case folding and accent stripping as equivalent. It started with just 'utf8', but the 'standard' is becoming 'utf8mb4'. Try a different browser: Chrome works; Firefox is broken.
1950's -- A character was (sometimes) represented in only 5 bits, on "paper tape"; no lowercase; had to "shift" to switch between letters and digits.
The 8-bit "byte" was invented and was coming into common usage (re: IBM 360).
1970's -- 7-bit ASCII becomes common -- that limits you to English. 7-bit ASCII was also wasting a bit of the omnipresent "byte".
1990's -- The computer world realizes that there are other people in the world and embarks on Unicode and UTF8.
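To make the jump from 7-bit ASCII to UTF8 concrete, here is a small Python sketch that counts how many bytes a few sample characters need. The sample characters and the helper names (`utf8_width`, `fits_ascii`, `fits_3_byte_utf8`) are illustrative choices, not anything from MySQL. The 3-byte limit used for the 'utf8' check is background knowledge about MySQL's 'utf8' (as opposed to 'utf8mb4'), not something stated above, so treat it as an assumption to verify against your server version.

```python
# Sample characters chosen for illustration only.
SAMPLES = ["A", "é", "€", "😀"]


def utf8_width(ch: str) -> int:
    """Return how many bytes one character occupies when encoded as UTF-8."""
    return len(ch.encode("utf-8"))


def fits_ascii(ch: str) -> bool:
    """True if the character can be represented in 7-bit ASCII."""
    try:
        ch.encode("ascii")
        return True
    except UnicodeEncodeError:
        return False


def fits_3_byte_utf8(ch: str) -> bool:
    """True if the character needs at most 3 bytes in UTF-8.

    Assumption: this mirrors the limit of MySQL's 'utf8' charset;
    characters needing 4 bytes would require 'utf8mb4'.
    """
    return utf8_width(ch) <= 3


if __name__ == "__main__":
    for ch in SAMPLES:
        print(ch, utf8_width(ch), fits_ascii(ch), fits_3_byte_utf8(ch))
```

Only the plain letter fits in ASCII; the emoji is the one sample that needs a fourth byte.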
("UTF" = "Unicode Transformation Format") Meanwhile, MySQL is born, but has enough problems without worrying about character sets. You can put any kind of bytes, representing anything, into a VARCHAR. MySQL 4.1 introduced the concept of "character set" and "collation". If you had legacy data or legacy code, you probably did not notice that you were messing things up when you upgraded.
I will focus on utf8 and utf8mb4, but if you choose to do otherwise, keep reading; most of this discussion can still be adapted to the charset of your choice. Declare most CHAR/TEXT columns in all tables as CHARSET utf8.
For collation, probably the best 'overall' collation is utf8mb4_unicode_520_ci. The xxx_unicode_520_ci collations are based on UCA 5.2.0 weight keys: https:// Collations without the "520" are based on the older UCA 4.0.0. utf8_general_ci is the default for utf8, so you may accidentally get this. It is a little faster than utf8_unicode_ci and works OK for many situations. CHAR/TEXT utf8 with utf8_bin validates (on INSERT) that the bytes comprise valid utf8, but does not do anything useful for comparisons other than exact (no case folding, etc.) equality.
SET NAMES can be invoked by your language just like other non-SELECT commands: mysqli_query(), do(), execute(), etc. The column's CHARACTER SET need not agree with SET NAMES; if they differ, a conversion will be performed for you. If your characters exist in both encodings, the conversion will be transparent. For Java/JSP: something about 'filters'. 'root', and any other user from , so that technique is not 'perfect'.
If that acute-e (é) shows up as A-tilde and Copyright (Ã©), then there may be an issue with the browser.
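The acute-e symptom described above can be reproduced outside the browser. The following Python sketch assumes the usual cause: the two UTF-8 bytes of é are being read as two separate latin1 characters. The function names `misread_as_latin1` and `repair` are hypothetical helpers for illustration; they are not part of MySQL or any client library.

```python
def misread_as_latin1(text: str) -> str:
    """Encode text as UTF-8, then (wrongly) decode those bytes as latin1.

    This imitates a reader that assumes one byte per character.
    """
    return text.encode("utf-8").decode("latin-1")


def repair(garbled: str) -> str:
    """Reverse the mistake: recover the original bytes, then decode as UTF-8.

    Only works if the garbled text really came from a single
    UTF-8-read-as-latin1 mix-up; otherwise it raises an error.
    """
    return garbled.encode("latin-1").decode("utf-8")


if __name__ == "__main__":
    garbled = misread_as_latin1("é")
    print(garbled)          # A-tilde followed by the copyright sign
    print(repair(garbled))  # back to the acute-e
```

If your stored data looks like the garbled form even in a client known to handle utf8 correctly, the mix-up probably happened before the browser, for example between SET NAMES and the column's CHARACTER SET.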