Database Migration - Character Encoding Issues

Art Expert
Posts: 2690
Joined: Tue May 01, 2012 5:11 pm
Location: San Diego, CA

Thu Dec 05, 2019 5:25 pm

tl;dr: it appears the character encoding wasn't migrated properly behind the scenes, and this is garbling text throughout the app.

Hey all, I've looked through many posts to see if this issue has come up. I found one from back in 2012, but it's likely unrelated to the changes the site is currently going through.
(Relevant post: viewtopic.php?f=8&t=102369&p=1723172&hi ... r#p1723172)

There are loads of examples of this. As a developer myself, I'm confident the issue lies with character encoding. Here are a few titles as they currently display:

"Studio Ghibli Kōkyō Kyokushū 13 Stout - Howl's Moving Castle"

It’s the Easter Beagle, Charlie Brown! 13 Whalen - Standard

Can’t Stop Me Now 19 Zhang - 1st

April Fools’ Special 19 STOT21stCplanB - 1st

この星のいきもの達の風景(Landscape of this Planet's - 1st

Terrible Dream (恐ろしい夢) 16 Goto - 1st

Inversion and Overlap (反転と重複) 17 Shimoda - 1st

I know for a fact that the title of the Tom Whalen Easter Beagle print should contain a single quotation mark, but there are several quotation mark characters:
Unicode Character 'LEFT SINGLE QUOTATION MARK' (U+2018): ‘
Unicode Character 'RIGHT SINGLE QUOTATION MARK' (U+2019): ’
(As I'm writing this you'll notice you can see the newly supported characters:)
Expected Result: "It’s the Easter Beagle, Charlie Brown! 13 Whalen - Standard"
Actual Result: "It’s the Easter Beagle, Charlie Brown! 13 Whalen - Standard"
If you plug those values into the website listed below you'll see the encoding flip-flops that happened along the way:
Displaying 5 results
utf-8 (65001, Unicode (UTF-8)) -> windows-1250 (1250, Central European (Windows))
utf-8 (65001, Unicode (UTF-8)) -> Windows-1252 (1252, Western European (Windows))
utf-8 (65001, Unicode (UTF-8)) -> windows-1254 (1254, Turkish (Windows))
utf-8 (65001, Unicode (UTF-8)) -> windows-1256 (1256, Arabic (Windows))
utf-8 (65001, Unicode (UTF-8)) -> windows-1258 (1258, Vietnamese (Windows))
These 5 code pages map the three bytes involved (0xE2 0x80 0x99, the UTF-8 form of U+2019) to the same characters, so they're interchangeable for this example.



So that leaves us with the main question: do admins understand this issue, and is there a plan to fix it?
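
On the fix side, one possible approach, purely as a sketch: if the text was mis-decoded as Windows-1252, reversing that step recovers the original. The function name `repair` is hypothetical, and Windows-1252 as the wrong codec is my assumption. Titles where bytes were already lost along the way (some of the Japanese ones look that way) can't be recovered like this and are returned unchanged.

```python
def repair(text: str) -> str:
    """Undo a UTF-8 -> Windows-1252 mis-decode, or return text unchanged."""
    try:
        return text.encode("cp1252").decode("utf-8")
    except (UnicodeEncodeError, UnicodeDecodeError):
        # Not this kind of garbling, or bytes were lost along the way.
        return text


print(repair("Can’t Stop Me Now 19 Zhang - 1st"))
print(repair("Howl's Moving Castle"))  # plain ASCII passes through untouched
```

Text that is already correct fails one of the two conversions and is left alone, so running this over a mixed table shouldn't double-convert good rows. Anything like this would still want a backup and a dry run first.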

Edit: Relevant post: ... 73f2ec631d
Creation88 wrote:no top tier artists: no stout, no moss, no taylor, no horkey, no ansin