
Figure: Most syntactic treebanks annotate variants of either phrase structure (left) or dependency structure (right).
In linguistics, a treebank is a parsed text corpus that annotates syntactic or semantic sentence structure. The construction of parsed corpora in the early 1990s revolutionized computational linguistics, which benefited from large-scale empirical data.
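
As a rough illustration of the two annotation styles, the following Python sketch encodes one toy sentence both as a bracketed phrase-structure tree and as a list of dependency relations. The sentence, the category labels, the relation names, and the helper functions `parse_brackets` and `leaves` are assumptions made for this example; they are not taken from any particular treebank or library.

```python
# Toy illustration only: the sentence and all labels are hypothetical.

def parse_brackets(text):
    """Parse a bracketed string into nested (label, children) tuples."""
    tokens = text.replace("(", " ( ").replace(")", " ) ").split()
    pos = 0

    def read():
        nonlocal pos
        assert tokens[pos] == "(", "a constituent must start with '('"
        pos += 1
        label = tokens[pos]
        pos += 1
        children = []
        while tokens[pos] != ")":
            if tokens[pos] == "(":
                children.append(read())       # nested constituent
            else:
                children.append(tokens[pos])  # word (leaf)
                pos += 1
        pos += 1  # consume the closing ")"
        return (label, children)

    return read()


def leaves(tree):
    """Return the words of a (label, children) tree in sentence order."""
    _, children = tree
    words = []
    for child in children:
        if isinstance(child, str):
            words.append(child)
        else:
            words.extend(leaves(child))
    return words


# Phrase structure: words are grouped into nested, labeled constituents.
PHRASE_STRUCTURE = "(S (NP (N John)) (VP (V loves) (NP (N Mary))))"

# Dependency structure: each word points to its head word.
# Fields: (index, word, head index, relation); head index 0 marks the root.
DEPENDENCIES = [
    (1, "John", 2, "nsubj"),
    (2, "loves", 0, "root"),
    (3, "Mary", 2, "obj"),
]

tree = parse_brackets(PHRASE_STRUCTURE)
words_from_tree = leaves(tree)
words_from_deps = [word for _, word, _, _ in DEPENDENCIES]
```

Both encodings cover the same three words; they differ in what they record about them. The phrase-structure version introduces intermediate nodes such as `NP` and `VP`, while the dependency version has no such nodes and stores only word-to-word relations.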