branch of the Indo-Iranian languages in the Indo-European language family
Indo-Aryan is a major language branch that includes Hindi, Bengali, Punjabi, and many other languages spoken across South Asia and beyond. Its languages are spoken by hundreds of millions of people today, and the branch offers insight into the history of languages across Europe and Asia and the connections among them.
AI-generated from the Wikipedia summary — may contain errors.
The Indo-Aryan languages (sometimes called the Indic languages) are a branch of the Indo-Iranian languages in the Indo-European language family. As of the early 21st century, they had about 800 million speakers, concentrated primarily east of the Indus River in South Asia: in eastern Pakistan, northern India, southern Nepal, Bangladesh, Sri Lanka, and the Maldives. Outside the Indian subcontinent, large immigrant and expatriate Indo-Aryan–speaking communities live in Northwestern Europe, Western Asia, North America, the Caribbean, Southeast Africa, Polynesia, and Australia, and several million speakers of Romani languages are concentrated primarily in Southeastern Europe. There are roughly 200 Indo-Aryan languages.
Proto-Indo-Aryan was very close to Vedic Sanskrit, though some of the later Prakrits retain features that had been lost from Vedic Sanskrit, showing that they descended separately from Proto-Indo-Aryan. The largest Indo-Aryan languages by number of first-language speakers are Hindustani (Hindi/Urdu) (c. 330 million), Bengali (242 million), Punjabi (about 150 million), Marathi (112 million), and Gujarati (60 million). A 2005 estimate placed the total number of native speakers of the Indo-Aryan languages at nearly 900 million. Other estimates are higher, suggesting as many as 1.5 billion speakers of Indo-Aryan languages.