Database Migration - Character Encoding Issues
Posted: Thu Dec 05, 2019 5:25 pm
tl;dr: it appears the character encoding wasn't migrated properly behind the scenes, and this is garbling text throughout the app.
Hey all, I've looked through many posts to see if this issue has come up. I found one from 2012, but it's likely unrelated to the changes the site is currently going through.
(Relevant post: viewtopic.php?f=8&t=102369&p=1723172&hi ... r#p1723172)
There are loads of examples of this. As a developer myself, I know the issue lies with character encoding.
Examples:
Details:"Studio Ghibli KÅkyÅ KyokushÅ« 13 Stout - Howl's Moving Castle"
https://expressobeans.com/public/detail.php/175065
It’s the Easter Beagle, Charlie Brown! 13 Whalen - Standard
https://expressobeans.com/public/detail.php/167375
Can’t Stop Me Now 19 Zhang - 1st
https://expressobeans.com/public/detail.php/275047
April Fools’ Special 19 STOT21stCplanB - 1st
https://expressobeans.com/public/detail.php/275486
ã“ã®æ˜Ÿã®ã„ãã‚‚ã®é”ã®é¢¨æ™¯(Landscape of this Planet's - 1st
https://expressobeans.com/public/detail.php/263897
Terrible Dream (æã‚ã—ã„夢) 16 Goto - 1st
https://expressobeans.com/public/detail.php/235718
Inversion and Overlap (å転ã¨é‡è¤‡) 17 Shimoda - 1st
https://expressobeans.com/public/detail.php/252761
I know for a fact that the title of the Tom Whalen Easter Beagle print should contain a single quotation mark, but there are several quotation mark types:
(As I'm writing this you'll notice you can see the newly supported characters:)
Unicode Character 'APOSTROPHE' (U+0027): '
Unicode Character 'LEFT SINGLE QUOTATION MARK' (U+2018): ‘
Unicode Character 'RIGHT SINGLE QUOTATION MARK' (U+2019): ’
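To make the difference between those three characters concrete, here's a minimal Python sketch that prints each one's code point, Unicode name, and UTF-8 bytes. The `describe` helper is just a name I made up for illustration; the point is that the plain apostrophe is one byte while the curly quotes are three bytes each, which is why only the curly ones get mangled.

```python
import unicodedata

def describe(ch):
    """Return (code point, Unicode name, UTF-8 bytes as hex) for one character."""
    code_point = f"U+{ord(ch):04X}"
    utf8_hex = " ".join(f"{b:02x}" for b in ch.encode("utf-8"))
    return code_point, unicodedata.name(ch), utf8_hex

# The three quotation characters listed above, written as escapes
# so the source file itself can't be affected by encoding problems.
for ch in ["\u0027", "\u2018", "\u2019"]:
    code_point, name, utf8_hex = describe(ch)
    print(f"{code_point}  {name}: {utf8_hex}")
```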
If you plug those values into the website listed below you'll see the encoding flip-flops that happened along the way:
Expected Result: "It’s the Easter Beagle, Charlie Brown! 13 Whalen - Standard"
Actual Result: "Itâ€™s the Easter Beagle, Charlie Brown! 13 Whalen - Standard"
http://string-functions.com/encodingerror.aspx
These 5 encodings have roughly the same charmaps, so they're interchangeable for this example.
Displaying 5 results
utf-8 (65001, Unicode (UTF-8)) -> windows-1250 (1250, Central European (Windows))
utf-8 (65001, Unicode (UTF-8)) -> Windows-1252 (1252, Western European (Windows))
utf-8 (65001, Unicode (UTF-8)) -> windows-1254 (1254, Turkish (Windows))
utf-8 (65001, Unicode (UTF-8)) -> windows-1256 (1256, Arabic (Windows))
utf-8 (65001, Unicode (UTF-8)) -> windows-1258 (1258, Vietnamese (Windows))
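For anyone who'd rather reproduce this locally than use the site, here's a minimal Python sketch. It assumes the corruption is UTF-8 bytes being read back as one of the Windows code pages listed above; `misdecode` is a hypothetical helper, and the `cp125x` names are Python's spellings of those code pages, not anything taken from the site's codebase.

```python
# The title with U+2019 (RIGHT SINGLE QUOTATION MARK), written as an escape.
TITLE = "It\u2019s the Easter Beagle, Charlie Brown!"

# Python's codec names for windows-1250, -1252, -1254, -1256 and -1258.
CODECS = ["cp1250", "cp1252", "cp1254", "cp1256", "cp1258"]

def misdecode(text, codec):
    """Encode text as UTF-8, then (wrongly) decode the bytes with a single-byte codec."""
    return text.encode("utf-8").decode(codec)

results = {codec: misdecode(TITLE, codec) for codec in CODECS}
for codec, garbled in results.items():
    print(f"utf-8 -> {codec}: {garbled}")
```

All five code pages map the bytes E2 80 99 to the same three characters, which is why they're interchangeable for this particular title.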
---
Solution:
So that leaves us with the main question: do admins understand this issue, and is there a plan to fix it?
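I have no idea what the actual migration looked like, so take this as a rough sketch only: if the stored text really is UTF-8 that was misread as Windows-1252, reversing the mistake per string could look something like the Python below. `repair_mojibake` is a hypothetical helper, and it leaves a string untouched whenever the round trip fails, so already-correct titles aren't damaged.

```python
def repair_mojibake(text, wrong_codec="cp1252"):
    """Undo a 'UTF-8 read as wrong_codec' mix-up; return text unchanged if that fails.

    Assumption: the damage was a single wrong decode. Text that was
    mangled twice, or that lost bytes along the way, won't be recovered.
    """
    try:
        return text.encode(wrong_codec).decode("utf-8")
    except (UnicodeEncodeError, UnicodeDecodeError):
        # Either the text contains characters the code page can't represent,
        # or the recovered bytes aren't valid UTF-8: leave it alone.
        return text

garbled = "It\u00e2\u20ac\u2122s the Easter Beagle, Charlie Brown!"
print(repair_mojibake(garbled))
```

One caveat: Python's cp1252 codec leaves a few bytes (0x81, for example) undefined, so some of the Japanese titles above may not round-trip this way and would need a more forgiving approach.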
Edit: Relevant post: https://devblog.songkick.com/the-great- ... 73f2ec631d