Regression and Correlation Methods: Correlation, ANOVA, and Least Squares

Correlation, ANOVA, and least squares regression are ways of assessing the possible association between two variables measured in the same sample. ANOVA, for instance, relates a normally distributed variable y to a categorical variable x. These techniques are special cases of linear regression methods. The purpose of this assignment is to demonstrate methods of regression and correlation analysis in which two different variables in the same sample are related.
The following are three important statistics, or methodologies, for using correlation and regression:
• Pearson’s correlation coefficient
• ANOVA
• Least squares regression analysis
In this assignment, solve problems related to these three methodologies.
Part 1: Pearson’s Correlation Coefficient
For the problem that demonstrates Pearson’s coefficient, you will use measures that represent characteristics of entire populations to describe disease in relation to some factor of interest, such as age; utilization of health services; or consumption of a particular food, medication, or other product. Pearson’s correlation measures the relationship between two continuous variables, usually expressed as numbers on an interval or ratio scale. An example is the relationship between weight in pounds and systolic blood pressure (the top number in a blood pressure reading). To describe a pattern of mortality from coronary heart disease (CHD) in year X, hypothetical death rates from ten states were correlated with per capita cigarette sales in dollars per month. Death rates were highest in states with the most cigarette sales, lowest in those with the least sales, and intermediate in the remainder. This observation contributed to the formulation of the hypothesis that cigarette smoking causes fatal CHD. The correlation coefficient, denoted by r, is the descriptive measure of association in correlational studies.
Table 1: Hypothetical Analysis of Cigarette Sales and Death Rates Caused by CHD
State Cigarette sales Death rate
1 102 5
2 149 6
3 165 6
4 159 5
5 112 3
6 78 2
7 112 5
8 174 7
9 101 4
10 191 6
Refer to the following video on how to calculate Pearson’s correlation coefficient using SPSS:
(https://www.youtube.com/watch?v=6EH5DSaCF_8)
Next, using SPSS:
• Calculate Pearson’s correlation coefficient.
• Create a two-way scatter plot.
In addition to the above:
• Explain the meaning of the resulting coefficient, paying particular attention to factors that affect the interpretation of this statistic, such as the normality of each variable.
• Provide a written interpretation of your results in APA format.
Refer to the Correlation and Regression Methods module to view resources related to regression analysis.

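Part 2: One-Way ANOVA
Consider hypothetical data on the blood pressure of 20 individuals, each classified as having high fat intake (less than 3 grams of total fat per serving) or low fat intake (less than 1 gram of saturated fat). In the table, fat intake is coded 1 for high and 0 for low.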
Table 2: Blood Pressure and Fat Intake
Individual Blood Pressure Fat Intake
1 135 1
2 130 1
3 135 1
4 128 0
5 121 0
6 133 0
7 145 1
8 137 1
9 148 1
10 134 0
11 150 0
12 121 0
13 117 1
14 128 1
15 121 0
16 124 1
17 132 0
18 121 0
19 120 0
20 124 0

Refer to the following video on how to calculate a one-way ANOVA using SPSS.
(https://www.youtube.com/watch?v=OEOeXpxSjf8)
Next, using SPSS:
• Calculate a one-way ANOVA to test the null hypothesis that the mean of each group is the same.
• Use the fat intake code as the grouping variable (high fat intake = 1; low fat intake = 0) and compare the results between groups.
• Calculate an F-test for an overall comparison of means to see whether any differences are significant.
In addition, in a Microsoft Word document, provide a written interpretation of your results in APA format.
Refer to the Correlation and Regression Methods module to view resources related to one-way ANOVA calculations.
Submission Details:
• Name your SPSS output file SU_PHE5020_W4_A2c_LastName_FirstInitial.spv.
• Name your document SU_PHE5020_W4_A2d_LastName_FirstInitial.doc.
• Submit your document to the Submissions Area by the due date assigned.
Part 3: Least Squares
The following are hypothetical data on the number of doctors per 100,000 inhabitants and the rate of prematurely delivered newborns per 100,000 for different countries of the world.
Table 3: Number of Doctors Versus the Rate of Prematurely Delivered Newborns
Country Doctors per 100,000 Early births per 100,000
1 3 92
2 5 88
3 5 85
4 6 86
5 7 89
6 7 75
7 7 70
8 8 68
9 8 69
10 10 50
11 12 45
12 12 41
13 15 38
14 18 35
15 19 30
16 23 6
Refer to the following video on how to calculate linear regression (aka least squares) using SPSS.
(https://www.youtube.com/watch?v=GQP47ijt4LI&t=559s)
Using SPSS:
• Apply least squares analysis to fit a regression line to the data.
• Calculate an F-test and a t-test to test for the significance of the regression.
• Test for goodness of fit using R².

Sample Solution

Part 1: Pearson’s Correlation Coefficient

Step-by-Step SPSS Guide:

  1. Enter the data: Open a new SPSS data file and enter the “Cigarette sales” and “Death rate” data from Table 1 into two separate columns.
  2. Calculate the coefficient: Go to Analyze > Correlate > Bivariate. Move both variables to the Variables box. Make sure Pearson is checked under Correlation Coefficients. Click OK.
  3. Create the scatter plot: Go to Graphs > Legacy Dialogs > Scatter/Dot. Select Simple Scatter and click Define. Move “Cigarette sales” to the X-axis and “Death rate” to the Y-axis. Click OK.

Interpretation:

  • Meaning of the Coefficient (r): The Pearson correlation coefficient, r, measures the strength and direction of a linear relationship between two variables. The value of r ranges from -1 to +1.
    • A value close to +1 indicates a strong positive linear relationship (as one variable increases, the other tends to increase).
    • A value close to -1 indicates a strong negative linear relationship (as one variable increases, the other tends to decrease).
    • A value close to 0 indicates a weak or no linear relationship.
  • Factors Affecting Interpretation: It’s important to remember that correlation does not imply causation. The strength of the correlation can also be affected by outliers, and the assumption of normality for both variables should be considered.
  • APA Format Interpretation: Your written interpretation should include:
    • A sentence stating the type of test conducted (e.g., “A Pearson’s correlation coefficient was calculated to assess the relationship between cigarette sales and death rates.”).
    • The value of the correlation coefficient, r, and the p-value.
    • A description of the strength and direction of the relationship (e.g., “There was a strong, positive correlation between the two variables, r = [insert value], p = [insert value]”).
    • A brief discussion of what this means in the context of the problem, while being careful not to claim causation.
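
As a cross-check on the SPSS output, the coefficient can also be computed by hand. The following is a minimal sketch in Python, which is not part of the assignment (the assignment asks for SPSS). The function name pearson_r and the variable names are illustrative choices, and the data are typed in from Table 1. The sketch also computes the usual t statistic for testing a zero population correlation on n - 2 degrees of freedom.

```python
from math import sqrt

# Table 1: cigarette sales and CHD death rate for the ten hypothetical states
sales = [102, 149, 165, 159, 112, 78, 112, 174, 101, 191]
deaths = [5, 6, 6, 5, 3, 2, 5, 7, 4, 6]


def pearson_r(x, y):
    """Pearson's r from sums of squares and cross-products (illustrative helper)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


r = pearson_r(sales, deaths)
df = len(sales) - 2                       # degrees of freedom for the test of r
t_stat = r * sqrt(df) / sqrt(1 - r ** 2)  # t statistic for H0: no linear correlation

print(f"r = {r:.3f}, t({df}) = {t_stat:.2f}")
```

If the table values are entered as shown, r should come out at roughly 0.83 with t(8) of roughly 4.1, which is above the two-sided 5% critical value of about 2.31. The SPSS output remains the figure to report; this sketch only confirms its order of magnitude.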

Part 2: One-Way ANOVA

Step-by-Step SPSS Guide:

  1. Enter the data: Open a new SPSS data file and enter the “Blood Pressure” and “Fat Intake” data from Table 2 into two columns.
  2. Define the groups: Go to Variable View and change the Fat Intake variable to have a label for each value (e.g., Value 0 = “Low Fat Intake,” Value 1 = “High Fat Intake”). This defines your groups.
  3. Calculate the ANOVA: Go to Analyze > Compare Means > One-Way ANOVA. Move “Blood Pressure” to the Dependent List box and “Fat Intake” to the Factor box. Click Options and check Descriptive and Homogeneity of variance test. Click Continue and then OK.

Interpretation:

  • Null Hypothesis: The null hypothesis for a one-way ANOVA is that the means of all groups are equal. In this case, it is that the mean blood pressure for the high fat intake group is the same as for the low fat intake group.
  • F-test and P-value: The F-statistic tests the overall comparison of means. A significant p-value (typically less than .05) indicates that you should reject the null hypothesis, meaning there is a statistically significant difference between the group means.
  • APA Format Interpretation:
    • State the purpose of the test (e.g., “A one-way ANOVA was conducted to compare the mean blood pressure for individuals with high fat intake versus low fat intake.”).
    • Report the F-statistic, degrees of freedom, and p-value (e.g., “The results showed a significant difference between the two groups, F([df between], [df within]) = [insert F-value], p = [insert p-value]”).
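    • Discuss the direction of the difference by comparing the two group means from the Descriptives table (e.g., “Mean blood pressure was higher for the [insert group] group (M = [insert value], SD = [insert value]) than for the [insert group] group…”). Post-hoc comparisons are not needed here because the factor has only two levels.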
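
The F-statistic described above can be reproduced by hand as a check on the SPSS table. The following is a minimal sketch in Python, not part of the assignment. The function name one_way_anova is an illustrative choice, and the data are typed in from Table 2. It splits the total variation into between-group and within-group sums of squares and forms F as the ratio of the two mean squares.

```python
# Table 2: blood pressure and fat intake code (1 = high, 0 = low) for 20 individuals
blood_pressure = [135, 130, 135, 128, 121, 133, 145, 137, 148, 134,
                  150, 121, 117, 128, 121, 124, 132, 121, 120, 124]
fat_intake = [1, 1, 1, 0, 0, 0, 1, 1, 1, 0,
              0, 0, 1, 1, 0, 1, 0, 0, 0, 0]


def one_way_anova(values, labels):
    """Return (F, df_between, df_within) for a one-way layout (illustrative helper)."""
    groups = {}
    for value, label in zip(values, labels):
        groups.setdefault(label, []).append(value)

    n = len(values)
    grand_mean = sum(values) / n
    means = {label: sum(g) / len(g) for label, g in groups.items()}

    # Between-group variation: group means around the grand mean
    ss_between = sum(len(g) * (means[label] - grand_mean) ** 2
                     for label, g in groups.items())
    # Within-group variation: observations around their own group mean
    ss_within = sum((v - means[label]) ** 2
                    for label, g in groups.items() for v in g)

    df_between = len(groups) - 1
    df_within = n - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


f_stat, df_between, df_within = one_way_anova(blood_pressure, fat_intake)
print(f"F({df_between}, {df_within}) = {f_stat:.2f}")
```

If the table values are entered as shown, this should give F(1, 18) of roughly 1.68, which is below the 5% critical value of about 4.41, so the data would not support rejecting the null hypothesis of equal means. With only two groups, F equals the square of the independent-samples t statistic, so a t-test would lead to the same conclusion.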

Part 3: Least Squares Regression Analysis

Step-by-Step SPSS Guide:

  1. Enter the data: Open a new SPSS data file and enter the data from Table 3 into two columns: “Doctors per 100,000” and “Early births per 100,000.”
  2. Run the regression: Go to Analyze > Regression > Linear. Move “Early births per 100,000” to the Dependent box and “Doctors per 100,000” to the Independent(s) box.
  3. Perform the tests: The default output will include the F-test, t-test, and R-squared values. Click Statistics and ensure Estimates and Model fit are checked. Click Continue and then OK.

Interpretation:

  • F-test and t-test:
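    • The F-test for the overall regression model assesses whether the independent variable (doctors per 100,000) significantly predicts the dependent variable (early births per 100,000). A significant p-value indicates that the model explains more of the variation in early births than a model with no predictor; it does not by itself show how well the line fits.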
    • The t-test for the coefficient of the independent variable tests whether the slope of the regression line is significantly different from zero, which is another way of assessing the significance of the relationship.
  • R-squared (R²): This value, also known as the coefficient of determination, indicates the proportion of the variance in the dependent variable that can be predicted from the independent variable. For example, an R² of 0.75 means that 75% of the variability in early births can be explained by the number of doctors per 100,000 inhabitants. A higher value indicates a better-fitting model.
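
The quantities in this interpretation can be reproduced by hand as a check on the SPSS output. The following is a minimal sketch in Python, not part of the assignment. The function name least_squares and the dictionary keys are illustrative choices, and the data are typed in from Table 3. It fits the line by the closed-form least squares formulas, then derives R², the overall F-statistic, and the t statistic for the slope from the residual sum of squares.

```python
from math import sqrt

# Table 3: doctors and early births (both per 100,000) for the 16 hypothetical countries
doctors = [3, 5, 5, 6, 7, 7, 7, 8, 8, 10, 12, 12, 15, 18, 19, 23]
early_births = [92, 88, 85, 86, 89, 75, 70, 68, 69, 50, 45, 41, 38, 35, 30, 6]


def least_squares(x, y):
    """Simple linear regression of y on x (illustrative helper)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))

    slope = sxy / sxx
    intercept = mean_y - slope * mean_x

    ss_regression = slope * sxy          # variation explained by the line
    ss_residual = syy - ss_regression    # variation left around the line
    mse = ss_residual / (n - 2)          # residual mean square on n - 2 df

    return {
        "slope": slope,
        "intercept": intercept,
        "r_squared": ss_regression / syy,
        "f_stat": ss_regression / mse,       # F on (1, n - 2) df
        "t_slope": slope / sqrt(mse / sxx),  # t on n - 2 df for H0: slope = 0
        "df_residual": n - 2,
    }


fit = least_squares(doctors, early_births)
print(f"early births = {fit['intercept']:.2f} + ({fit['slope']:.2f}) * doctors")
print(f"R^2 = {fit['r_squared']:.3f}, "
      f"F(1, {fit['df_residual']}) = {fit['f_stat']:.1f}, "
      f"t = {fit['t_slope']:.2f}")
```

If the table values are entered as shown, the slope should be roughly -4.32 and the intercept roughly 105, with R² of roughly 0.92. In simple regression with one predictor, the F-statistic equals the square of the slope's t statistic, so the two tests always agree.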
