Dear Maintainer,

I tried printing cat emoticons (U+1F638 to U+1F640). I got something else:

; /usr/lib/plan9/bin/unicode 1F638-1F640
1f638 1f639 1f63a 1f63b 1f63c 1f63d 1f63e 1f63f 1f640

The unicode(1plan9) tool seems to assume that code points fit in at most two bytes. The same range without the leading 1 gives:

; /usr/lib/plan9/bin/unicode F638-F640
f638 f639 f63a f63b f63c f63d f63e f63f f640

Further evidence:

; /usr/lib/plan9/bin/unicode 41-50
0041 A 0042 B 0043 C 0044 D 0045 E 0046 F 0047 G 0048 H 0049 I 004a J 004b K 004c L 004d M 004e N 004f O 0050 P

; /usr/lib/plan9/bin/unicode 10041-10050
10041 A 10042 B 10043 C 10044 D 10045 E 10046 F 10047 G 10048 H 10049 I 1004a J 1004b K 1004c L 1004d M 1004e N 1004f O 10050 P

; /usr/lib/plan9/bin/unicode 20041-20050
20041 A 20042 B 20043 C 20044 D 20045 E 20046 F 20047 G 20048 H 20049 I 2004a J 2004b K 2004c L 2004d M 2004e N 2004f O 20050 P

This is 𝐮𝐧𝐚𝐜𝐜𝐞𝐩𝐭𝐚𝐛𝐥𝐞.
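The pattern in these transcripts (U+10041 and U+20041 both rendered as "A") would be consistent with the character value being cut down to its low 16 bits before it is encoded. This is an assumption based on the output alone, not something checked against the unicode(1) source. A small sketch of that arithmetic, with low16 being a hypothetical helper name:

```sh
#!/bin/sh
# Hypothetical helper, for illustration only: show which code point is left
# if only the low 16 bits of the requested one survive.
# Argument: a code point in hex, without a "0x" or "U+" prefix.
low16() {
    printf '%04x\n' $((0x$1 & 0xFFFF))
}
```

Under that assumption, `low16 1F638` prints f638 and `low16 20041` prints 0041, which matches the characters that appear next to the five-digit labels above.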
I have written a replacement for unicode(1) in Bourne shell. It seems to do the right thing for astral plane characters:

--- snib ---
; unicode 1F638-1F640
1F638 😸 1F639 😹 1F63A 😺 1F63B 😻 1F63C 😼 1F63D 😽 1F63E 😾 1F63F 😿 1F640 🙀
--- snab ---

--- snob ---
; unicode 10041-10050
10041 𐁁 10042 𐁂 10043 𐁃 10044 𐁄 10045 𐁅 10046 𐁆 10047 𐁇 10048 𐁈 10049 𐁉 1004A 𐁊 1004B 𐁋 1004C 𐁌 1004D 𐁍 1004E 1004F 10050 𐁐
--- sneb ---
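The replacement script itself is not included in this report. As a rough idea of how a range printer that handles code points above U+FFFF can be written in plain sh, here is a minimal sketch. It is not the script that produced the output above; the function names utf8 and unicode_range are made up for this example, it prints one code point per line, and it does not reject surrogates or values above U+10FFFF.

```sh
#!/bin/sh
# Illustrative sketch only; utf8 and unicode_range are hypothetical names.

# utf8 N: write the UTF-8 encoding of code point N (decimal) to stdout.
utf8() {
    cp=$1
    if [ "$cp" -lt 128 ]; then
        # one byte: 0xxxxxxx
        set -- "$cp"
    elif [ "$cp" -lt 2048 ]; then
        # two bytes: 110xxxxx 10xxxxxx
        set -- $((192 | (cp >> 6))) $((128 | (cp & 63)))
    elif [ "$cp" -lt 65536 ]; then
        # three bytes: 1110xxxx 10xxxxxx 10xxxxxx
        set -- $((224 | (cp >> 12))) $((128 | ((cp >> 6) & 63))) \
               $((128 | (cp & 63)))
    else
        # four bytes: 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
        set -- $((240 | (cp >> 18))) $((128 | ((cp >> 12) & 63))) \
               $((128 | ((cp >> 6) & 63))) $((128 | (cp & 63)))
    fi
    # Emit each byte through a \0ddd octal escape, which %b understands.
    for b in "$@"; do
        printf '%b' "\\0$(printf '%o' "$b")"
    done
}

# unicode_range MIN MAX: print "HEX CHAR" for each code point in the range.
# MIN and MAX are hex numbers without a "0x" or "U+" prefix.
unicode_range() {
    i=$((0x$1))
    end=$((0x$2))
    while [ "$i" -le "$end" ]; do
        printf '%04X ' "$i"
        utf8 "$i"
        printf '\n'
        i=$((i + 1))
    done
}
```

With these definitions loaded, `unicode_range 1F638 1F640` prints the nine cat faces, one per line, each preceded by its code point. The bytes are computed by hand so that nothing depends on a printf that understands \U escapes.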