Figure: Several sets of (x, y) points, with the Pearson correlation coefficient of x and y for each set. The correlation reflects the noisiness and direction of a linear relationship (top row), but not the slope of that relationship (middle), nor many aspects of nonlinear relationships (bottom). N.B.: the figure in the center has a slope of 0, but in that case the correlation coefficient is undefined because the variance of Y is zero.
Correlation measures how closely two variables move together in a linear fashion, telling you both the strength and direction of their relationship. It matters because it helps you quickly identify whether two things tend to rise and fall together or move in opposite directions, though it won't tell you how steep that relationship is or whether the connection is actually caused by one variable affecting the other.
In statistics, correlation is a kind of statistical relationship between two random variables or bivariate data. Usually it refers to the degree to which a pair of variables are linearly related. More general relationships between variables are called associations: an association is the degree to which some of the variability of one variable can be accounted for by the other.
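The properties described above (sensitivity to direction but not slope, blindness to some nonlinear relationships, and an undefined value when one variable has zero variance) can be illustrated with a minimal sketch. The function name `pearson` and the sample data are hypothetical, chosen only for illustration; the function computes the sample Pearson correlation coefficient directly from its definition and returns `None` when the coefficient is undefined.

```python
import math


def pearson(xs, ys):
    """Sample Pearson correlation of two equal-length sequences.

    Returns None when either sequence has zero variance, because the
    coefficient is undefined in that case (division by zero).
    """
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    dx = [x - mean_x for x in xs]  # deviations of x from its mean
    dy = [y - mean_y for y in ys]  # deviations of y from its mean
    sxx = sum(d * d for d in dx)
    syy = sum(d * d for d in dy)
    if sxx == 0 or syy == 0:
        return None  # undefined: one variable does not vary
    sxy = sum(a * b for a, b in zip(dx, dy))
    return sxy / math.sqrt(sxx * syy)


# Illustrative data (assumed, not taken from the figure).
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
steep = [10 * x for x in xs]           # exact line, large positive slope
shallow = [0.1 * x for x in xs]        # exact line, small positive slope
falling = [-2 * x + 7 for x in xs]     # exact line, negative slope
flat = [3.0] * len(xs)                 # slope 0, zero variance in y
parabola = [(x - 3) ** 2 for x in xs]  # nonlinear, symmetric about x = 3

print(pearson(xs, steep), pearson(xs, shallow))  # both close to +1
print(pearson(xs, falling))                      # close to -1
print(pearson(xs, flat))                         # None (undefined)
print(pearson(xs, parabola))                     # 0 despite a clear dependence
```

The steep and shallow lines give the same coefficient because the slope cancels out in the normalization, while the parabola shows that a coefficient of zero does not rule out a strong nonlinear relationship.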