Category

Robust statistics

median
Figure: Calculating the median in data sets of odd (above) and even (below) observations.
The median of a set of numbers is the value separating the higher half from the lower half of a data sample, a population, or a probability distribution. For a data set, it may be thought of as the "middle" value. The basic feature of the median in describing data compared to the mean (often simply described as the "average") is that it is not skewed by a small proportion of extreme values, and therefore provides a better representation of the center. Median income, for example, may be a better way to descri
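To illustrate the claim that the median is not skewed by a small proportion of extreme values, here is a minimal Python sketch. The income figures and the helper name `summarize` are invented for illustration only and do not come from the source.

```python
from statistics import mean, median

# Hypothetical incomes, chosen only to illustrate the effect of one extreme value.
incomes = [30_000, 32_000, 35_000, 38_000, 40_000]
with_outlier = incomes + [5_000_000]


def summarize(values):
    """Return the mean and the median of a sequence of numbers."""
    return {"mean": mean(values), "median": median(values)}


baseline = summarize(incomes)      # mean 35000, median 35000
skewed = summarize(with_outlier)   # mean 862500, median 36500

print(baseline)
print(skewed)
```

Adding a single extreme value moves the mean from 35,000 to 862,500, while the median only moves from 35,000 to 36,500.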
outlier
Figure 1. Box plot of data from the Michelson–Morley experiment displaying four outliers in the middle column, as well as one outlier in the first column.
non-parametric statistics
branch of statistics that is not based solely on parametrized families of probability distributions
robust statistics
statistics with good performance for data drawn from a wide range of probability distributions
truncated mean
statistical measure of central tendency
RANSAC
statistical method
median absolute deviation
median of the absolute deviations from the median; a robust measure of the variability of a univariate sample of quantitative data
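The definition above translates almost directly into code. The following is a minimal sketch of the unscaled statistic; the function name `median_absolute_deviation` and the sample values are assumptions made for illustration.

```python
from statistics import median


def median_absolute_deviation(values):
    """Median of the absolute deviations of each value from the sample median."""
    values = list(values)
    center = median(values)
    return median(abs(x - center) for x in values)


# Hypothetical sample: the median is 2 and the absolute deviations are
# [1, 1, 0, 0, 2, 4, 7], whose median is 1.
sample = [1, 1, 2, 2, 4, 6, 9]
mad = median_absolute_deviation(sample)

# Replacing the largest value with a far more extreme one leaves the result unchanged.
mad_extreme = median_absolute_deviation([1, 1, 2, 2, 4, 6, 900])

print(mad, mad_extreme)
```

This sketch returns the raw statistic and applies no consistency scaling factor.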
winsorized mean
statistical measure of central tendency that takes the mean of a winsorized dataset
k-medoids
The k-medoids method is a classical partitioning technique of clustering that splits a data set of objects into k clusters, where the number k of clusters is assumed to be known a priori (which implies that the programmer must specify k before the execution of a k-medoids algorithm). The "goodness" of the given value of k can be assessed with methods such as the silhouette method. The name of the clustering method was coined by Leonard Kaufman and Peter J. Rousseeuw with their PAM (Partitioning Around Medoids) algorithm.
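As a rough illustration of partitioning around medoids with a fixed number of clusters, the sketch below uses a simple alternating heuristic: assign each object to its nearest medoid, then re-pick each cluster's medoid. This is not the PAM algorithm itself, which uses a swap-based search. The names `assign` and `k_medoids`, the choice of the first k objects as starting medoids, and the sample data are assumptions for illustration, and the sketch assumes the objects are distinct.

```python
def assign(points, medoids, distance):
    """Map each medoid index to the indices of the points nearest to it."""
    clusters = {m: [] for m in medoids}
    for i, p in enumerate(points):
        nearest = min(medoids, key=lambda m: distance(p, points[m]))
        clusters[nearest].append(i)
    return clusters


def k_medoids(points, k, distance, max_iter=100):
    """Alternating k-medoids heuristic (a simplification, not PAM)."""
    if not 1 <= k <= len(points):
        raise ValueError("k must be between 1 and the number of points")
    medoids = list(range(k))  # arbitrary deterministic start: the first k points
    clusters = assign(points, medoids, distance)
    for _ in range(max_iter):
        new_medoids = []
        for m, members in clusters.items():
            candidates = members or [m]  # keep the old medoid if a cluster is empty
            # The new medoid is the member with the smallest total distance to the rest.
            best = min(
                candidates,
                key=lambda c: sum(distance(points[c], points[j]) for j in candidates),
            )
            new_medoids.append(best)
        if set(new_medoids) == set(medoids):
            break  # medoids stopped changing
        medoids = new_medoids
        clusters = assign(points, medoids, distance)
    return medoids, clusters


# Hypothetical one-dimensional data with two obvious groups; k must be given up front.
data = [1.0, 2.0, 3.0, 10.0, 11.0, 12.0]
medoids, clusters = k_medoids(data, k=2, distance=lambda a, b: abs(a - b))
print(medoids, clusters)
```

On this data the heuristic settles on the objects at indices 1 and 4 (values 2.0 and 11.0) as medoids. Unlike a centroid, each medoid is always an actual member of the data set.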
Huber loss function
loss function used in robust regression
Dixon's Q test
criterion for identification and rejection of outliers
Hodges–Lehmann estimator
robust and nonparametric estimator of a population's location parameter
winsorising
Winsorizing or winsorization is the transformation of statistics by limiting extreme values in the statistical data to reduce the effect of possibly spurious outliers. It is named after the engineer-turned-biostatistician Charles P. Winsor (1895–1951). The effect is the same as clipping in signal processing.
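The clipping analogy can be sketched in a few lines of Python. This version is count-based: it replaces the k smallest and k largest values with the nearest retained values. The function names `winsorize` and `winsorized_mean`, the parameter `k`, and the sample data are assumptions for illustration; percentile-based variants are equally common.

```python
def winsorize(values, k):
    """Clip the k smallest and k largest values to the nearest retained values."""
    n = len(values)
    if k < 0 or 2 * k >= n:
        raise ValueError("k must satisfy 0 <= 2*k < len(values)")
    ordered = sorted(values)
    low, high = ordered[k], ordered[n - k - 1]
    # Clipping keeps the original order and the original sample size.
    return [min(max(x, low), high) for x in values]


def winsorized_mean(values, k):
    """Mean of the winsorized data set."""
    clipped = winsorize(values, k)
    return sum(clipped) / len(clipped)


# Hypothetical sample with one very large and one very small value.
sample = [100, 5, 6, 1, 8, 9, 7]
clipped = winsorize(sample, k=1)       # [9, 5, 6, 5, 8, 9, 7]
w_mean = winsorized_mean(sample, k=1)  # 49 / 7 == 7.0

print(clipped, w_mean)
```

Unlike trimming, which discards the extreme observations, winsorizing keeps the sample size unchanged and only moves the extreme values inward.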
Least absolute deviations
statistical optimality criterion
trimean
In statistics the trimean (TM), or Tukey's trimean, is a measure of a probability distribution's central tendency defined as a weighted average of the distribution's quartiles: TM = (Q1 + 2·Q2 + Q3) / 4, where Q2 is the median and Q1 and Q3 are the lower and upper quartiles.
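A minimal Python sketch of this weighted average, using weights 1, 2, 1 on the three quartiles. Quartile conventions differ between software packages; the "inclusive" method of `statistics.quantiles` used here is one assumed choice, and the function name `trimean` and the sample data are illustrative only.

```python
from statistics import quantiles


def trimean(values):
    """Weighted average of the quartiles: (Q1 + 2*Q2 + Q3) / 4."""
    q1, q2, q3 = quantiles(values, n=4, method="inclusive")
    return (q1 + 2 * q2 + q3) / 4


# Hypothetical sample with one extreme value; under the "inclusive" method
# the quartiles are 3, 5 and 7, so the trimean is 5.0.
sample = [1, 2, 3, 4, 5, 6, 7, 8, 100]
tm = trimean(sample)

print(tm)
```

Because only the quartiles enter the formula, the extreme value 100 has no effect on the result here.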