EB Forum

Posted: **Thu Dec 05, 2019 5:25 pm**

tl;dr: it appears the character encoding wasn't migrated properly behind the scenes, and this is garbling text throughout the site.

Hey all, I've looked through many posts to see if this issue has come up. I found one from 2012, but it's likely unrelated to the changes the site is currently going through.
(Relevant post: viewtopic.php?f=8&t=102369&p=1723172&hi ... r#p1723172)

There are loads of examples of this. As a developer myself, I know the issue lies with character encoding.

Examples:

"Studio Ghibli KÅkyÅ KyokushÅ« 13 Stout - Howl's Moving Castle"
https://expressobeans.com/public/detail.php/175065

Itâ€™s the Easter Beagle, Charlie Brown! 13 Whalen - Standard
https://expressobeans.com/public/detail.php/167375

Canâ€™t Stop Me Now 19 Zhang - 1st
https://expressobeans.com/public/detail.php/275047

April Foolsâ€™ Special 19 STOT21stCplanB - 1st
https://expressobeans.com/public/detail.php/275486

ã“ã®æ˜Ÿã®ã„ãã‚‚ã®é”ã®é¢¨æ™¯(Landscape of this Planet's - 1st
https://expressobeans.com/public/detail.php/263897

Terrible Dream (æã‚ã—ã„å¤¢) 16 Goto - 1st
https://expressobeans.com/public/detail.php/235718

Inversion and Overlap (åè»¢ã¨é‡è¤‡) 17 Shimoda - 1st
https://expressobeans.com/public/detail.php/252761

Details:

I know for a fact that the title of the Tom Whalen Easter Beagle print should contain a single quotation mark, but there are several quotation mark characters it could be:

APOSTROPHE: '
Unicode Character 'LEFT SINGLE QUOTATION MARK' (U+2018): ‘
Unicode Character 'RIGHT SINGLE QUOTATION MARK' (U+2019): ’

(As I'm writing this, you'll notice that the forum itself displays these newly supported characters correctly:)

Expected Result: "It’s the Easter Beagle, Charlie Brown! 13 Whalen - Standard"
Actual Result: "Itâ€™s the Easter Beagle, Charlie Brown! 13 Whalen - Standard"
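
For anyone who wants to reproduce this locally, here is a minimal Python sketch of what I think is happening. It assumes the title was stored as UTF-8 and then read back as Windows-1252 somewhere along the way; I don't know the site's actual stack, and `garble` and `repair` are hypothetical helper names made up for illustration.

```python
# Hypothetical helpers for illustration; this is not the site's actual code.

def garble(text: str, wrong_codec: str = "cp1252") -> str:
    """Encode as UTF-8, then misread those bytes as a single-byte code page."""
    return text.encode("utf-8").decode(wrong_codec)


def repair(text: str, wrong_codec: str = "cp1252") -> str:
    """Undo garble(): recover the original bytes, then decode them as UTF-8."""
    return text.encode(wrong_codec).decode("utf-8")


expected = "It’s the Easter Beagle, Charlie Brown! 13 Whalen - Standard"
actual = garble(expected)

print("’".encode("utf-8").hex())   # UTF-8 bytes of U+2019
print(actual)                      # the garbled title
print(repair(actual) == expected)  # whether the round trip is reversible here
```

The reverse trip works for this title, but probably not for every record: UTF-8 bytes that Windows-1252 leaves undefined (0x8D, for example) can't be decoded this way, which may be why some of the Japanese titles above look like they are missing characters.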

If you plug those values into the website listed below, you'll see the encoding flip-flops that could have happened along the way:
http://string-functions.com/encodingerror.aspx

Displaying 5 results
utf-8 (65001, Unicode (UTF-8)) -> windows-1250 (1250, Central European (Windows))
utf-8 (65001, Unicode (UTF-8)) -> Windows-1252 (1252, Western European (Windows))
utf-8 (65001, Unicode (UTF-8)) -> windows-1254 (1254, Turkish (Windows))
utf-8 (65001, Unicode (UTF-8)) -> windows-1256 (1256, Arabic (Windows))
utf-8 (65001, Unicode (UTF-8)) -> windows-1258 (1258, Vietnamese (Windows))

These 5 encodings map the three bytes involved here (0xE2 0x80 0x99, the UTF-8 encoding of ’) to the same characters, so they're interchangeable for this example.
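
A quick way to check that claim, as a sketch using Python's built-in codecs (the `cp125x` names are Python's aliases for the Windows code pages listed above, and `CODE_PAGES`, `decoded`, and `all_same` are just names I picked):

```python
# Decode the UTF-8 bytes of U+2019 with each of the five candidate code pages
# and compare the results.
CODE_PAGES = ["cp1250", "cp1252", "cp1254", "cp1256", "cp1258"]

quote_bytes = "\u2019".encode("utf-8")  # RIGHT SINGLE QUOTATION MARK

decoded = {cp: quote_bytes.decode(cp) for cp in CODE_PAGES}
for cp, text in decoded.items():
    print(cp, repr(text))

# True if every code page produced the same garbled string.
all_same = len(set(decoded.values())) == 1
print(all_same)
```

This only shows that the five are indistinguishable for this one character. Other garbled titles might narrow down which code page was really involved.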

---

Solution:

So that leaves us with the main question: do admins understand this issue, and is there a plan to fix it?

Edit: Relevant post: https://devblog.songkick.com/the-great- ... 73f2ec631d

Database Migration - Character Encoding Issues