After studying and reviewing the concepts and practices of correlation and simple linear regression, solve the following exercise using Excel:
1. The following sample observations were randomly selected:
X: 5 3 6 3 4 4 6 8
Y: 13 15 7 12 13 11 9 5
2. Consider the following aspects when submitting your test:
o Graph the scatter plot or dispersion
o Determine the correlation coefficient.
o Determine the regression equation.
o Find the value of the dependent variable when X = 7.
o Is the regression model adequate? Justify your answer. Which test determines this?
3. Present your answers with the aspects learned so far about correlation and simple linear regression.
1. Graph the scatter plot or dispersion
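To graph the scatter plot in Excel, enter the X values in one column and the Y values in the adjacent column, select both columns, and choose Insert > Scatter. The points fall from the upper left to the lower right, which suggests a negative linear relationship between X and Y.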
2. Determine the correlation coefficient
The correlation coefficient measures the strength and direction of the linear relationship between two variables. It ranges from -1 to 1: a value of 1 indicates a perfect positive correlation, a value of -1 indicates a perfect negative correlation, and a value of 0 indicates no linear correlation.
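To determine the correlation coefficient in Excel, choose Data > Data Analysis > Correlation (the Analysis ToolPak add-in must be enabled), select the two data columns as the Input Range, and specify an Output Range. The CORREL worksheet function gives the same result.
The correlation coefficient will be displayed in the Output Range. For this data set, r is approximately -0.89, which indicates a strong negative linear relationship.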
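As a cross-check on the Excel output, the coefficient can also be computed directly from its definition. The Python sketch below is only an illustration outside Excel; the function name pearson_r is a hypothetical helper, not something the exercise prescribes.

```python
from math import sqrt

# Sample observations from the exercise.
x = [5, 3, 6, 3, 4, 4, 6, 8]
y = [13, 15, 7, 12, 13, 11, 9, 5]


def pearson_r(xs, ys):
    """Pearson correlation: r = Sxy / sqrt(Sxx * Syy)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of cross-products and squared deviations from the means.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(xs, ys))
    sxx = sum((a - mean_x) ** 2 for a in xs)
    syy = sum((b - mean_y) ** 2 for b in ys)
    return sxy / sqrt(sxx * syy)


r = pearson_r(x, y)
print(round(r, 2))  # -0.89
```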
3. Determine the regression equation
The regression equation is a mathematical equation that describes the relationship between two variables. It can be used to predict the value of the dependent variable (Y) given the value of the independent variable (X).
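To determine the regression equation in Excel, choose Data > Data Analysis > Regression, select the Y values as the Input Y Range and the X values as the Input X Range, and specify an Output Range.
The intercept and slope of the regression equation will be displayed in the Coefficients column of the output, in the rows labeled Intercept and X Variable 1.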
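The coefficients that Excel reports come from the least-squares formulas: the slope is b = Sxy / Sxx and the intercept is a = mean(Y) - b * mean(X). A minimal Python sketch of that calculation follows; fit_line is a hypothetical helper name used only for illustration.

```python
# Sample observations from the exercise.
x = [5, 3, 6, 3, 4, 4, 6, 8]
y = [13, 15, 7, 12, 13, 11, 9, 5]


def fit_line(xs, ys):
    """Least-squares fit of Y = a + bX; returns (a, b)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((p - mean_x) * (q - mean_y) for p, q in zip(xs, ys))
    sxx = sum((p - mean_x) ** 2 for p in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


a, b = fit_line(x, y)
print(f"Y = {a:.4f} + ({b:.4f})X")  # Y = 19.1198 + (-1.7425)X
```

The negative slope agrees with the downward trend of the points in the scatter plot.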
4. Find the value of the dependent variable when X = 7
To find the value of the dependent variable when X = 7, we can use the regression equation.
The regression equation for this data set is:
Y = 19.1198 - 1.7425X
Substituting X = 7 into the equation, we get:
Y = 19.1198 - 1.7425(7) = 6.92
Therefore, the predicted value of the dependent variable when X = 7 is approximately 6.92.
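The prediction can be reproduced from the sample data using the raw-sum form of the least-squares formulas, which is convenient for hand calculation. The sketch below also checks that X = 7 lies inside the observed range of X, since predictions outside that range are extrapolations; the names predict and within_range are hypothetical and chosen only for this illustration.

```python
# Sample observations from the exercise.
x = [5, 3, 6, 3, 4, 4, 6, 8]
y = [13, 15, 7, 12, 13, 11, 9, 5]

n = len(x)
sum_x, sum_y = sum(x), sum(y)
sum_xy = sum(p * q for p, q in zip(x, y))
sum_x2 = sum(p * p for p in x)

# Raw-sum form of the least-squares slope and intercept.
slope = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)
intercept = (sum_y - slope * sum_x) / n


def predict(x_new):
    """Predicted Y on the fitted line for a given X."""
    return intercept + slope * x_new


x_new = 7
within_range = min(x) <= x_new <= max(x)  # True means interpolation
y_hat = predict(x_new)
print(round(y_hat, 2), within_range)  # 6.92 True
```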
5. Is the regression model adequate? Justify your answer. Which test determines this?
To determine if the regression model is adequate, we can look at the R-squared value. The R-squared value is a measure of how well the regression model fits the data. It can range from 0 to 1, with a value of 1 indicating that the model perfectly fits the data.
The R-squared value for this data set is approximately 0.79. This indicates that the regression model explains about 79% of the variation in the dependent variable. This is a good fit, but it is not perfect.
Another way to assess the adequacy of the regression model is to look at the residual plot, which graphs the residuals against the independent variable. Each residual is the difference between an actual value of the dependent variable and the value predicted by the regression equation.
If the residual plot shows a random pattern, then the regression model is adequate. If the residual plot shows a non-random pattern, then the regression model is not adequate.
The residual plot for this data set does not show any obvious non-random patterns. This suggests that the regression model is adequate.
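The R-squared value and the residuals behind the residual plot can be recomputed from the sample data. The Python sketch below is an illustration rather than the Excel procedure itself: it fits the line, lists the residuals by X, and computes R-squared as 1 - SSE / SST.

```python
# Sample observations from the exercise.
x = [5, 3, 6, 3, 4, 4, 6, 8]
y = [13, 15, 7, 12, 13, 11, 9, 5]

n = len(x)
mean_x = sum(x) / n
mean_y = sum(y) / n
sxy = sum((p - mean_x) * (q - mean_y) for p, q in zip(x, y))
sxx = sum((p - mean_x) ** 2 for p in x)
slope = sxy / sxx
intercept = mean_y - slope * mean_x

# Residual = actual Y minus predicted Y.
residuals = [q - (intercept + slope * p) for p, q in zip(x, y)]

sse = sum(e ** 2 for e in residuals)      # unexplained variation
sst = sum((q - mean_y) ** 2 for q in y)   # total variation
r_squared = 1 - sse / sst
print(round(r_squared, 4))  # 0.7935

# Residuals ordered by X, as they would appear on a residual plot.
for p, e in sorted(zip(x, residuals)):
    print(p, round(e, 2))
```

With only eight observations, the residual listing gives limited evidence either way, so the absence of a visible pattern should be read cautiously.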
Conclusion
The regression model for this data set is adequate. It explains a good portion of the variation in the dependent variable and the residual plot does not show any obvious non-random patterns.
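The test that determines whether the regression model is adequate is the F-test, reported in the ANOVA table of the regression output. In simple linear regression, the F-test tests the null hypothesis that the slope coefficient is equal to zero, that is, that X has no linear relationship with Y. If the F-test is significant, we reject the null hypothesis and conclude that the model explains a significant share of the variation in Y.
The F-test is significant for this data set: F is approximately 23.06 with 1 and 6 degrees of freedom, which exceeds the 5% critical value of 5.99. This means that the regression model is adequate.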
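The F statistic can be reconstructed from the sums of squares as F = (SSR / 1) / (SSE / (n - 2)). The sketch below assumes a 5% significance level and hard-codes the critical value 5.99 for 1 and 6 degrees of freedom, taken from a standard F table rather than computed; both choices are assumptions made for illustration.

```python
# Sample observations from the exercise.
x = [5, 3, 6, 3, 4, 4, 6, 8]
y = [13, 15, 7, 12, 13, 11, 9, 5]

n = len(x)
mean_x = sum(x) / n
mean_y = sum(y) / n
sxy = sum((p - mean_x) * (q - mean_y) for p, q in zip(x, y))
sxx = sum((p - mean_x) ** 2 for p in x)
syy = sum((q - mean_y) ** 2 for q in y)

ssr = sxy ** 2 / sxx   # variation explained by the regression
sse = syy - ssr        # unexplained (error) variation
df_reg, df_err = 1, n - 2

f_stat = (ssr / df_reg) / (sse / df_err)

# Assumed table value: F critical for (1, 6) degrees of freedom at 5%.
F_CRIT_05 = 5.99
significant = f_stat > F_CRIT_05
print(round(f_stat, 2), significant)  # 23.06 True
```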