How to convert a string to Unicode?

Right, Unicode is funny that way. There are two ways to represent that character (and others). One is a single code point, and the other uses two or more code points. The former is “composed”, the latter “decomposed”. Both are valid, but they are a headache for those of us who have to deal with these things behind the scenes.
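To make that concrete, here is a small Python sketch. The character "é" is only my assumed example, not necessarily the one from your string. It builds the same visible character both ways and shows that the two strings are not equal when compared code point by code point.

```python
# Assumed example character: "é" (not necessarily the one in question).
composed = "\u00e9"      # single code point: LATIN SMALL LETTER E WITH ACUTE
decomposed = "e\u0301"   # "e" followed by COMBINING ACUTE ACCENT

# Both render as the same character, but the code point counts differ.
composed_length = len(composed)       # 1 code point
decomposed_length = len(decomposed)   # 2 code points

# A plain comparison treats them as different strings.
same_without_normalizing = composed == decomposed  # False

print(composed_length, decomposed_length, same_without_normalizing)
```

This is why a search or comparison can fail even though the two strings look identical on screen.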

I put out a Unicode Normalization package, if you’re interested. It will convert composed strings to decomposed and vice versa.
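For anyone who wants to see what that conversion looks like, here is a rough sketch using Python's standard `unicodedata` module. This is not my package or its API. The helper names `to_composed` and `to_decomposed` are made up for illustration, and NFC/NFD are the standard Unicode normalization forms for the composed and decomposed cases.

```python
import unicodedata


def to_decomposed(text: str) -> str:
    """Return text in NFD: precomposed characters split into base + combining marks."""
    return unicodedata.normalize("NFD", text)


def to_composed(text: str) -> str:
    """Return text in NFC: base + combining marks merged where a composed form exists."""
    return unicodedata.normalize("NFC", text)


# Hypothetical sample data: the same word stored in both forms.
word_composed = "caf\u00e9"      # ends with the single code point U+00E9
word_decomposed = "cafe\u0301"   # ends with "e" + U+0301

# After normalizing both sides to the same form, the comparison succeeds.
match_after_nfc = to_composed(word_composed) == to_composed(word_decomposed)
print(match_after_nfc)
```

Whichever form you pick, the usual approach is to normalize both strings to the same one before comparing or searching.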