Overfitting

Figure 1. The green line represents an overfitted model and the black line represents a regularized model. While the green line best follows the training data, it is too dependent on that data and is likely to have a higher error rate on new unseen data, illustrated by black-outlined dots, compared to the black line.

Figure 2. Noisy (roughly linear) data is fitted to a linear function and a polynomial function. Although the polynomial function is a perfect fit, the linear function can be expected to generalize better: If the two functions were used to ex


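Overfitting (Japanese: 過剰適合, kajō tekigō, or 過適合, katekigō), also called overtraining (過学習, kagakushū), is a state in statistics and machine learning in which a model has been fitted to its training data but fails to fit unseen data (test data); in other words, it fails to generalize. It stems from insufficient generalization ability. One cause is that the model is too complex and has too many degrees of freedom relative to the number of training examples, for instance because the statistical model has too many parameters to fit. An unreasonable, incorrect model can still fit the available data perfectly if it is too complex relative to that data. The opposite is underfitting (過少適合, kashō tekigō), also called undertraining (過小学習, kashō gakushū).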
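The effect of too many parameters relative to the number of training examples can be illustrated with a minimal sketch. It is not taken from the article: the true function `2x + 1`, the noise level, the sample sizes, the seed, and the helper name `fit_and_score` are all assumptions chosen for illustration. A degree-9 polynomial fitted to 10 noisy points has enough free parameters to pass through every training point, while a straight line does not.

```python
import numpy as np


def fit_and_score(degree, x_train, y_train, x_test, y_test):
    """Least-squares polynomial fit; returns (train MSE, test MSE).

    Hypothetical helper for this illustration only.
    """
    coeffs = np.polyfit(x_train, y_train, degree)
    train_mse = np.mean((np.polyval(coeffs, x_train) - y_train) ** 2)
    test_mse = np.mean((np.polyval(coeffs, x_test) - y_test) ** 2)
    return float(train_mse), float(test_mse)


def true_fn(x):
    # Assumed ground truth: a simple linear relationship.
    return 2.0 * x + 1.0


rng = np.random.default_rng(0)  # fixed seed, arbitrary choice
noise_std = 0.2                 # assumed noise level

# 10 noisy training points and 200 noisy, previously unseen test points.
x_train = np.linspace(0.0, 1.0, 10)
y_train = true_fn(x_train) + rng.normal(0.0, noise_std, size=x_train.shape)
x_test = np.linspace(0.0, 1.0, 200)
y_test = true_fn(x_test) + rng.normal(0.0, noise_std, size=x_test.shape)

# Degree 1 has 2 parameters; degree 9 has 10, one per training point.
results = {
    degree: fit_and_score(degree, x_train, y_train, x_test, y_test)
    for degree in (1, 9)
}

for degree, (train_mse, test_mse) in results.items():
    print(f"degree {degree}: train MSE = {train_mse:.3g}, test MSE = {test_mse:.3g}")
```

Under these assumptions the degree-9 fit should show a training error near zero but a larger test error than the linear fit, which is the pattern the definition describes: a good fit to the training data that does not carry over to unseen data.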

Abstract from DBpedia / Wikipedia · CC BY-SA