language with several interacting codified standard versions
A pluricentric language or polycentric language is a language with several codified standard forms, often corresponding to different countries. Many of the world's most-spoken languages are pluricentric, including Chinese in China, Taiwan, and Singapore; English in the United States, United Kingdom, Canada, Australia, New Zealand, Ireland, South Africa, India, Singapore, and elsewhere; and French in France, Canada, many African countries, and elsewhere.
The converse case is a monocentric language, which has only one formally standardized version. Examples include Japanese and Russian. In some cases, the different standards of a pluricentric language may be elaborated until they appear to be separate languages, e.g. Malaysian Malay and Indonesian (together termed Malay) or Hindi and Urdu (together termed Hindustani), while Serbo-Croatian is at an earlier stage of that process.