Also known as cluster analysis
problem of identifying to which of a set of categories (sub-populations) a new observation belongs, on the basis of a training set of data containing observations (or instances) whose category membership is known
统计分类是机器学习非常重要的一个组成部分,它的目标是根据已知样本的某些特征,判断一个新的样本属于哪种已知的样本类。分类是监督学习的一个实例,根据已知训练集提供的样本,通过计算选择特征参数,建立判别函数以对样本进行的分类。与之相对的是無監督學習,例如聚类分析。
Abstract from DBpedia / Wikipedia · CC BY-SA
via Wikidata sitelinks · CC0
Discovered by embedding cosine similarity (sentence-transformers MiniLM, 384-dim).