LESSON: Regression

Site: Mountain Heights Academy OER
Course: Introductory Statistics Q3
Date: Friday, 4 April 2025, 11:28 AM

Intro to Regression Lines

After determining that a linear correlation exists between two variables AND that the correlation is significant, the next step is to determine the equation for the line that best models the data.

A regression line, also known as a line of best fit, is the straight line that comes closest to all of the data points in a scatter plot.

Watch the video below to see an example of how the regression line is found:

Linear Equations

In algebra, you learned that the equation of any non-vertical straight line can be written in the form y = mx + b. This form is called slope-intercept form. The variable m represents the slope of the line, and the variable b represents the y-intercept (the y-value where the line crosses the y-axis).

For the line shown below, the equation in slope-intercept form is y = (1/2)x + 1, because the slope of the line is 1/2 and the y-intercept is 1.
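As a quick refresher on the algebra method, here is a minimal Python sketch that recovers the slope and y-intercept from two points on a line. The function name line_through and the two points (0, 1) and (2, 2) are chosen for illustration; both points lie on the line with slope 1/2 and y-intercept 1 described above.

```python
def line_through(p1, p2):
    """Return (m, b) for the non-vertical line through two points."""
    (x1, y1), (x2, y2) = p1, p2
    if x1 == x2:
        # A vertical line has no slope, so it cannot be written as y = mx + b.
        raise ValueError("vertical line has no slope-intercept form")
    m = (y2 - y1) / (x2 - x1)  # slope: rise over run
    b = y1 - m * x1            # y-intercept: solve y1 = m*x1 + b for b
    return m, b


m, b = line_through((0, 1), (2, 2))
print(f"y = {m}x + {b}")  # prints: y = 0.5x + 1.0
```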

The equation of a regression line follows a similar form: \( \hat{y}=mx + b \), where \( \hat{y} \), pronounced "y-hat", represents the predicted y-value. As before, m and b represent the slope and y-intercept, respectively.

The difference between the algebra methods for finding the equation of a line and the method for finding the equation of a regression line is that a regression line takes into account ALL of the data points (x, y) in the data set to determine the equation \( \hat{y}=mx + b \).
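To see what "using all the data points" can look like in practice, here is a minimal Python sketch of the standard least-squares formulas for the slope and y-intercept. The lesson itself does not spell out these formulas, so treat this as one common way to do it, not necessarily the method shown in the video. The function name least_squares_line and the five data points are made up for illustration.

```python
def least_squares_line(xs, ys):
    """Return (m, b) for the least-squares line y-hat = m*x + b."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need at least two (x, y) pairs of equal length")
    x_bar = sum(xs) / n  # mean of the x-values
    y_bar = sum(ys) / n  # mean of the y-values
    # Every data point contributes to both sums below.
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all x-values are equal, so the slope is undefined")
    m = sxy / sxx          # slope
    b = y_bar - m * x_bar  # the line passes through (x_bar, y_bar)
    return m, b


# Made-up example data, not taken from the lesson.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
m, b = least_squares_line(xs, ys)
print(f"y-hat = {m:.2f}x + {b:.2f}")  # prints: y-hat = 0.60x + 2.20
```

Notice that changing any single point in xs or ys changes both sums, and so changes the line. That is the sense in which the regression line depends on the whole data set.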

Calculate Regression Equations

It's time to turn things over to Google Sheets and let it do some of this work for us.

Watch this video to see how to use Google Sheets to draw the regression line for a scatter plot and how to find the equation of the regression line.

Calculating Predicted Values

Now that you know how to find the equation of a regression line, you may be asking yourself: how does it apply to my data set?

In this lesson video, you'll learn how to use the regression equation to calculate predicted values.
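Calculating a predicted value means substituting an x-value into \( \hat{y}=mx + b \). Here is a minimal Python sketch; the function name predict and the values m = 0.6 and b = 2.2 are hypothetical and stand in for whatever slope and y-intercept your own regression line has.

```python
def predict(m, b, x):
    """Return the predicted value y-hat = m*x + b for a given x."""
    return m * x + b


# Hypothetical regression line: y-hat = 0.6x + 2.2
m, b = 0.6, 2.2

for x in (2, 3.5, 5):
    y_hat = predict(m, b, x)
    print(f"x = {x}: predicted y = {y_hat:.2f}")
# prints:
# x = 2: predicted y = 3.40
# x = 3.5: predicted y = 4.30
# x = 5: predicted y = 5.20
```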