
Also known as 16-bit Unicode Transformation Format, Unicode Transformation Format – 16-bit, Unicode Transformation Format - 16-bit, UTF16, UTF_16
UTF-16 (16-bit Unicode Transformation Format) is a character encoding that supports all 1,112,064 valid code points of Unicode. The encoding is variable-length as code points are encoded with one or two code units. UTF-16 arose from an earlier obsolete fixed-width 16-bit encoding now known as UCS-2 (for 2-byte Universal Character Set), once it became clear that more than 216 (65,536) code points were needed, including most emoji and important CJK characters such as for personal and place names.
via Wikipedia infobox
via Wikidata sitelinks · CC0
Discovered by embedding cosine similarity (sentence-transformers MiniLM, 384-dim).