Category
Corpora
Tatoeba
Tatoeba is a free collection of example sentences with translations, geared towards foreign language learners. It is available in more than 400 languages. Its name comes from the Japanese phrase "tatoeba" (例えば), meaning "for example". It is written and maintained by a community of volunteers through a model of open collaboration. Individual contributors are known as "Tatoebans". It is run by Association Tatoeba, a French non-profit organization funded through donations.
Corpus Scriptorum Historiae Byzantinae
monumental fifty-volume series of primary sources for the study of Byzantine history
speech corpus
speech audio files and text transcriptions
Thesaurus Linguae Graecae
digital corpus of pre-1500 Greek literature
British National Corpus
100-million-word text corpus of samples of written and spoken English from a wide range of sources
Brown Corpus
data set of American English texts from 1961
Corpus Coranicum
research project of the Berlin-Brandenburg Academy of Sciences and Humanities
Indigenous Tweets
website tracking people who use Twitter in indigenous and minority languages
Quranic Arabic Corpus
annotated linguistic resource
Corpus of Contemporary American English
corpus of American English containing more than 560 million words