problem of identifying to which of a set of categories (sub-populations) a new observation belongs, on the basis of a training set of data containing observations (or instances) whose category membership is known
Discovered by embedding cosine similarity (sentence-transformers MiniLM, 384-dim).