Knowing the basics of Unicode, and of encodings such as UTF-8, UTF-16, Big5 and GB2312, inside out makes you a better developer. To that end, I am curating a list of articles to help you get started.

This will be a growing list.


The Absolute Minimum Every Software Developer Absolutely, Positively Must Know About Unicode and Character Sets (No Excuses!) – One of the best introductory articles on Unicode


Python 2.x is not the friendliest introduction to Unicode, so here is some further reading:

Unicode In Python, Completely Demystified – Confused about UnicodeEncodeError and UnicodeDecodeError? Not for much longer

Making Sense of Python Unicode – More Python examples that work with Unicode
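
If you have not run into those two errors yet, here is a minimal sketch of how they typically come up. It is my own illustration rather than something taken from the articles above, and it assumes Python 3, where text (`str`) and bytes are separate types; Python 2.x behaves differently because it converts between the two implicitly. The sample string and the variable names are made up for the example.

```python
# Assumes Python 3. "caf\u00e9" is the word "café"; the last character is not ASCII.
text = "caf\u00e9"

# Encoding turns text into bytes. UTF-8 can represent every Unicode character.
utf8_bytes = text.encode("utf-8")

# ASCII cannot represent U+00E9, so encoding to it raises UnicodeEncodeError.
try:
    text.encode("ascii")
    encode_error = None
except UnicodeEncodeError as exc:
    encode_error = exc

# Decoding turns bytes back into text. Guessing the wrong codec for bytes
# that are not valid in it raises UnicodeDecodeError.
try:
    utf8_bytes.decode("ascii")
    decode_error = None
except UnicodeDecodeError as exc:
    decode_error = exc

# Decoding with the codec that was used for encoding gets the text back.
round_trip = utf8_bytes.decode("utf-8")

print(utf8_bytes)
print(type(encode_error).__name__, type(decode_error).__name__)
print(round_trip == text)
```

The rule of thumb this is meant to show: decode bytes as soon as they enter your program, work with text internally, and encode again only on the way out.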

Other Topics

Unicode Collation Algorithm – How locale-aware string comparisons work
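
To see why a collation algorithm is needed at all, here is a small sketch, again assuming Python 3. Sorting strings the default way compares raw code points, which puts accented and uppercase letters in surprising places. The `crude_sort_key` helper is hypothetical and is not the Unicode Collation Algorithm; it only strips accents and case to hint at the kind of ordering a real, locale-aware implementation provides.

```python
import unicodedata


def crude_sort_key(s):
    """Hypothetical helper: a rough approximation, NOT the UCA.

    Decomposes characters (NFD), drops combining marks such as accents,
    and case-folds what is left.
    """
    decomposed = unicodedata.normalize("NFD", s)
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return stripped.casefold()


# "\u00c9clair" is "Éclair" with an accented capital E.
words = ["zebra", "apple", "\u00c9clair", "eclair"]

# Default sort compares code points, so U+00C9 lands after "z".
by_code_point = sorted(words)

# With the crude key, both spellings of "eclair" sit between "apple" and "zebra".
by_crude_key = sorted(words, key=crude_sort_key)

print(by_code_point)
print(by_crude_key)
```

A real implementation also has to handle language-specific rules (the same letters sort differently in different locales), which is exactly what the article covers.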