This dataset contains 11 observations generated by Francis Anscombe to
demonstrate that statistical summary measures alone cannot capture the full
relationship between two variables (here, x and y). Anscombe emphasized
the importance of visualizing data prior to calculating summary statistics.
Details
This dataset has a linear relationship between x and y, with a single
outlier.
Additionally, the following statistical summaries hold:
mean of x: 9
variance of x: 11
mean of y: 7.5
variance of y: 4.125
correlation between x and y: 0.816
linear regression between x and y: y = 3 + 0.5x
\(R^2\) for the regression: 0.67
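
As a rough check, the summaries above can be recomputed from the raw observations. The sketch below is in plain Python and is not part of the original documentation. The x and y values are not given on this page; those used here are the ones commonly published for Anscombe's third dataset (the linear one with a single outlier), and treating them as this dataset's contents is an assumption. The helper names `summarize` and `largest_residual` are hypothetical, chosen only for illustration.

```python
# Assumed data: the values commonly published for Anscombe's third dataset.
x = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]
y = [7.46, 6.77, 12.74, 7.11, 7.81, 8.84, 6.08, 5.39, 8.15, 6.42, 5.73]


def summarize(x, y):
    """Return sample summary statistics and the least-squares line for x, y."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    # Sums of squared deviations and cross-products about the means.
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    r = sxy / (sxx * syy) ** 0.5
    return {
        "mean_x": mean_x,
        "var_x": sxx / (n - 1),  # sample variance (n - 1 denominator)
        "mean_y": mean_y,
        "var_y": syy / (n - 1),
        "correlation": r,
        "intercept": mean_y - slope * mean_x,
        "slope": slope,
        "r_squared": r ** 2,  # equals R^2 for simple linear regression
    }


def largest_residual(x, y, intercept, slope):
    """Return the (x, y) pair farthest (vertically) from the fitted line."""
    return max(zip(x, y), key=lambda p: abs(p[1] - (intercept + slope * p[0])))


summary = summarize(x, y)
for name, value in summary.items():
    print(f"{name}: {value:.3f}")

outlier = largest_residual(x, y, summary["intercept"], summary["slope"])
print("largest residual at:", outlier)
```

Under these assumed values, the computed statistics agree with the listed ones to within about 0.01, and the point with the largest residual is the single outlier that the summaries alone do not reveal.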