UCS-2 vs UTF-16

I always used to get confused between UCS-2 and UTF-16. Which one’s the fixed-width encoding and which one’s the variable-length encoding that supports surrogate pairs?

Then I learnt this simple little mnemonic: you know that UTF-8 is variable-length encoded[1]. UTF = variable-length. Therefore UTF-16 is variable-length encoded, and therefore UCS-2 is fixed-length encoded. (Just don’t extend this mnemonic to UTF-32, which is fixed-width.)
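
If you'd like to see the difference rather than just remember it, here's a rough sketch in Python. The helper names `utf16_code_units` and `ucs2_code_unit` are made up for illustration; as far as I know Python doesn't ship a UCS-2 codec, so the second function just mimics the "one 16-bit unit per character, nothing above U+FFFF" rule by hand.

```python
def utf16_code_units(ch):
    """Return the 16-bit UTF-16 code units for a single character."""
    data = ch.encode("utf-16-be")  # big-endian, no byte order mark
    return [int.from_bytes(data[i:i + 2], "big") for i in range(0, len(data), 2)]


def ucs2_code_unit(ch):
    """Hypothetical UCS-2 encoder: exactly one 16-bit unit, or an error."""
    cp = ord(ch)
    if cp > 0xFFFF:
        # UCS-2 has no surrogate pairs, so anything above U+FFFF is out of reach.
        raise ValueError("U+%04X does not fit in UCS-2" % cp)
    return cp


# U+0041 fits in one unit under both encodings.
print([hex(u) for u in utf16_code_units("A")])           # ['0x41']
print(hex(ucs2_code_unit("A")))                          # 0x41

# U+1F600 needs a surrogate pair in UTF-16 and cannot be encoded in UCS-2.
print([hex(u) for u in utf16_code_units("\U0001F600")])  # ['0xd83d', '0xde00']
```

So a character in UTF-16 takes either one or two 16-bit units (that's the variable-length part), while UCS-2 always takes exactly one and simply gives up on anything above U+FFFF.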

Just thought I’d pass that trick on.

[1] I’m assuming you know what UTF-8 is, anyway. If you don’t, and you’re a programmer, you should probably learn sometime…
