Chapter 3: Describing Relationships

The chapter begins with scatterplots as the foundational tool for exploring bivariate data. Students learn to identify three features: direction (whether the variables tend to increase together, a positive association, or move in opposite directions, a negative association), form (whether the relationship is linear or curved), and strength (how tightly the points cluster around the underlying trend). Students must also recognize outliers, which are unusual observations that deviate from the overall pattern, and distinguish influential points, whose inclusion or removal substantially changes statistical conclusions.

The chapter then introduces correlation, r, as a standardized numerical summary of the strength and direction of the linear association between two quantitative variables. It ranges from -1 to +1, and it measures association only: no correlation, however strong, establishes causation on its own.

Building on this foundation, the chapter explores least-squares regression as a method for modeling linear relationships and making predictions. The slope is interpreted as the predicted change in the response variable for each one-unit increase in the explanatory variable, and the intercept as the predicted response when the explanatory variable equals zero. Residuals are a central concept: each residual is the difference between an observed value and the value predicted by the regression equation (observed minus predicted), and residual plots serve as diagnostic tools for judging whether a linear model fits the data adequately. The coefficient of determination, r², gives the proportion of variation in the response variable that is explained by the regression on the explanatory variable.

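
The quantities described above can be computed by hand on a small dataset. The following is a minimal sketch, not the chapter's own worked example: the data (hours studied versus exam score) and the function names `correlation` and `least_squares` are hypothetical, chosen only for illustration. It uses the standard textbook formulas, with the slope computed as r times the ratio of standard deviations and the line passing through the point of means.

```python
from statistics import mean, stdev

# Hypothetical illustrative data (not from the chapter):
# x = hours studied (explanatory), y = exam score (response).
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [52.0, 58.0, 61.0, 70.0, 74.0]


def correlation(x, y):
    """Correlation r: the average product of standardized values (divisor n - 1)."""
    n = len(x)
    x_bar, y_bar = mean(x), mean(y)
    s_x, s_y = stdev(x), stdev(y)  # sample standard deviations
    products = (
        ((xi - x_bar) / s_x) * ((yi - y_bar) / s_y) for xi, yi in zip(x, y)
    )
    return sum(products) / (n - 1)


def least_squares(x, y):
    """Return (slope, intercept) of the least-squares regression line."""
    slope = correlation(x, y) * stdev(y) / stdev(x)
    intercept = mean(y) - slope * mean(x)  # line passes through (x_bar, y_bar)
    return slope, intercept


r = correlation(x, y)
slope, intercept = least_squares(x, y)

# Residual = observed value minus predicted value.
predicted = [intercept + slope * xi for xi in x]
residuals = [yi - pi for yi, pi in zip(y, predicted)]

# r squared as the share of variation in y explained by the regression.
ss_residual = sum(e ** 2 for e in residuals)
ss_total = sum((yi - mean(y)) ** 2 for yi in y)
r_squared = 1 - ss_residual / ss_total

print(f"r = {r:.3f}, slope = {slope:.2f}, intercept = {intercept:.2f}")
print(f"r squared = {r_squared:.3f}")
print("residuals:", [round(e, 2) for e in residuals])
```

For a least-squares line, `r_squared` computed from the residuals agrees with `r ** 2`, and the residuals sum to zero up to rounding. Plotting `residuals` against `x` would give the residual plot used to check whether a linear model is adequate.
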
The chapter concludes by addressing key limitations: the danger of extrapolating beyond the range of the observed data, the distorting effect that outliers and high-leverage points can have on a regression line, and the need to check that a linear model is appropriate for a dataset before drawing conclusions.
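
Two of these limitations can be illustrated with a short sketch. The data, the added high-leverage point, and the helper names `fit_line` and `predict` are all hypothetical, made up for illustration rather than taken from the chapter. The code refits the line after adding a single point far from the others in x, and flags any prediction made outside the observed range of x as an extrapolation.

```python
from statistics import mean


def fit_line(x, y):
    """Return (slope, intercept) of the least-squares line for paired data."""
    x_bar, y_bar = mean(x), mean(y)
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    slope = s_xy / s_xx
    return slope, y_bar - slope * x_bar


def predict(x_new, x, y):
    """Return (prediction, is_extrapolation) for a new x value.

    is_extrapolation is True when x_new lies outside the observed x range,
    where the fitted line has no data to support it.
    """
    slope, intercept = fit_line(x, y)
    is_extrapolation = not (min(x) <= x_new <= max(x))
    return intercept + slope * x_new, is_extrapolation


# Hypothetical data: hours studied (x) versus exam score (y).
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [52.0, 58.0, 61.0, 70.0, 74.0]

# The same data plus one made-up high-leverage point far to the right.
x_with_point = x + [15.0]
y_with_point = y + [60.0]

slope_without, _ = fit_line(x, y)
slope_with, _ = fit_line(x_with_point, y_with_point)
print(f"slope without the point: {slope_without:.2f}")
print(f"slope with the point:    {slope_with:.2f}")

inside, inside_flag = predict(3.0, x, y)
outside, outside_flag = predict(20.0, x, y)
print(f"x = 3:  predicted {inside:.1f}, extrapolation: {inside_flag}")
print(f"x = 20: predicted {outside:.1f}, extrapolation: {outside_flag}")
```

In this made-up dataset the single added point pulls the slope from roughly 5.6 down to roughly 0.2, which is what makes it influential. The prediction at x = 20 comes out well above 100, an implausible value if the score is assumed to be out of 100, which shows why predictions far outside the observed range should not be trusted.
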