Multiple Linear Regression: Solutions
1 Relationship Between Eighth Grade IQ, Eighth Grade Abstract Reasoning and Ninth Grade Math Score
For a statistics class project, students examined the relationship between x_{1} = 8th grade IQ, x_{2} = 8th grade Abstract Reasoning, and y = 9th grade math score for 20 students. The data are displayed below.
Student  Math Score   IQ  Abstract Reas
      1          33   95             28
      2          31  100             24
      3          35  100             29
      4          38  102             30
      5          41  103             33
      6          37  105             32
      7          37  106             34
      8          39  106             36
      9          43  106             38
     10          40  109             39
     11          41  110             40
     12          44  110             43
     13          40  111             41
     14          45  112             42
     15          48  112             46
     16          45  114             44
     17          31  114             41
     18          47  115             47
     19          43  117             42
     20          48  118             49
Use Minitab on the dataset Finals found in the Datasets folder in ANGEL. Go to Stat > Regression > Regression, enter the variable math score in the Response window, and enter IQ and Abstract_Reas in the Predictors window. Click ‘Storage’ and then select ‘Residuals’ and ‘Fits’. These will be stored in columns C4 and C5 and named RESI1 and FITS1. Your output should look as follows:
Regression Analysis: Math Score versus IQ, Abstract_Reas
The regression equation is
Math Score = 54.1 - 0.484 IQ + 1.02 Abstract_Reas
Predictor         Coef  SE Coef      T      P
Constant         54.05    22.99   2.35  0.031
IQ             -0.4836   0.2955  -1.64  0.120
Abstract_Reas   1.0185   0.2656   3.84  0.001
S = 3.00271   R-Sq = 70.5%   R-Sq(adj) = 67.1%
Analysis of Variance
Source          DF      SS      MS      F      P
Regression       2  366.92  183.46  20.35  0.000
Residual Error  17  153.28    9.02
Total           19  520.20
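For readers working outside Minitab, the coefficient estimates can be cross-checked with a short Python sketch. This is not part of the original exercise: it assumes the 20 rows as transcribed from the data table, and the helper name fit_two_predictors is made up for illustration. It solves the least-squares normal equations for two predictors using centered sums of squares and cross-products.

```python
# (math score, IQ, abstract reasoning) for students 1-20, transcribed from the table.
DATA = [
    (33, 95, 28), (31, 100, 24), (35, 100, 29), (38, 102, 30), (41, 103, 33),
    (37, 105, 32), (37, 106, 34), (39, 106, 36), (43, 106, 38), (40, 109, 39),
    (41, 110, 40), (44, 110, 43), (40, 111, 41), (45, 112, 42), (48, 112, 46),
    (45, 114, 44), (31, 114, 41), (47, 115, 47), (43, 117, 42), (48, 118, 49),
]


def fit_two_predictors(rows):
    """Least-squares fit of y = b0 + b1*x1 + b2*x2 (hypothetical helper)."""
    n = len(rows)
    y_bar = sum(y for y, _, _ in rows) / n
    x1_bar = sum(x1 for _, x1, _ in rows) / n
    x2_bar = sum(x2 for _, _, x2 in rows) / n

    # Centered sums of squares and cross-products.
    s11 = sum((x1 - x1_bar) ** 2 for _, x1, _ in rows)
    s22 = sum((x2 - x2_bar) ** 2 for _, _, x2 in rows)
    s12 = sum((x1 - x1_bar) * (x2 - x2_bar) for _, x1, x2 in rows)
    s1y = sum((x1 - x1_bar) * (y - y_bar) for y, x1, _ in rows)
    s2y = sum((x2 - x2_bar) * (y - y_bar) for y, _, x2 in rows)

    # Solve the 2x2 normal equations by Cramer's rule.
    det = s11 * s22 - s12 ** 2
    b1 = (s1y * s22 - s12 * s2y) / det
    b2 = (s11 * s2y - s12 * s1y) / det
    b0 = y_bar - b1 * x1_bar - b2 * x2_bar
    return b0, b1, b2


b0, b1, b2 = fit_two_predictors(DATA)
print(f"Constant = {b0:.2f}, IQ = {b1:.4f}, Abstract_Reas = {b2:.4f}")
```

The printed values should agree with the Coef column of the Minitab output to the displayed precision, including the negative sign on the IQ coefficient.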
a. What is the regression equation? Provide an interpretation of each slope in terms of the change in Y per unit change in X.
Math Score = 54.1 - 0.484 IQ + 1.02 Abstract_Reas
In multiple linear regression, the slope indicates “for a unit change in X_{i} while holding the other predictors constant (i.e. not changing), Y will change by the amount and direction of the slope for X_{i}”. So here, when holding abstract reasoning constant, for a 1 unit increase in IQ the predicted math score will decrease by 0.484 points; when holding IQ constant, for a 1 unit increase in Abstract Reasoning the predicted math score will increase by 1.02 points.
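The "holding the other predictor constant" reading can be illustrated with a small Python sketch. It uses the rounded regression equation from the output; the function name and the starting values (IQ 105, Abstract Reasoning 35) are arbitrary choices for illustration, not values from the exercise.

```python
def predicted_math_score(iq, abstract_reas):
    """Rounded regression equation from the Minitab output."""
    return 54.1 - 0.484 * iq + 1.02 * abstract_reas


# Arbitrary starting point inside the range of the data (an assumption).
base = predicted_math_score(iq=105, abstract_reas=35)

# Raise one predictor by 1 unit while leaving the other unchanged.
change_iq = predicted_math_score(iq=106, abstract_reas=35) - base
change_ar = predicted_math_score(iq=105, abstract_reas=36) - base

print(round(change_iq, 3), round(change_ar, 3))
```

Whatever starting point is chosen, the two differences should recover the two slopes, because the model is linear in each predictor.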
b. Create two scatter plots of the measurements with Graph > Scatter Plot > Simple. Select IQ as the predictor (x-variable) and math score as the response (y-variable); then enter math score again as a y-variable with Abstract Reas as the x-variable. Select Multiple Graphs and click the radio button for “In separate panels of the same graph”. Describe the relationship between math score, abstract reasoning and IQ.
There is a positive relationship between the response variable (math score) and each of the explanatory variables (IQ and abstract reasoning). However, the slope coefficient for IQ in the regression model is negative! This results from how the coefficients are now calculated. In simple linear regression, the slope estimate is tied directly to the correlation between X and Y. In multiple linear regression, this simple correlation loses its relevance. Instead, a partial correlation comes into play: the relationship between IQ and math score after accounting for abstract reasoning.
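The contrast between simple and partial correlation can be checked numerically. The Python sketch below is an illustration under stated assumptions: it uses the 20 rows transcribed from the data table, a hand-written Pearson correlation helper (corr is a made-up name), and the standard first-order partial correlation formula.

```python
import math

# (math score, IQ, abstract reasoning) for students 1-20, transcribed from the table.
DATA = [
    (33, 95, 28), (31, 100, 24), (35, 100, 29), (38, 102, 30), (41, 103, 33),
    (37, 105, 32), (37, 106, 34), (39, 106, 36), (43, 106, 38), (40, 109, 39),
    (41, 110, 40), (44, 110, 43), (40, 111, 41), (45, 112, 42), (48, 112, 46),
    (45, 114, 44), (31, 114, 41), (47, 115, 47), (43, 117, 42), (48, 118, 49),
]


def corr(a, b):
    """Pearson correlation of two equal-length lists (hypothetical helper)."""
    n = len(a)
    a_bar, b_bar = sum(a) / n, sum(b) / n
    s_ab = sum((x - a_bar) * (y - b_bar) for x, y in zip(a, b))
    s_aa = sum((x - a_bar) ** 2 for x in a)
    s_bb = sum((y - b_bar) ** 2 for y in b)
    return s_ab / math.sqrt(s_aa * s_bb)


math_score = [row[0] for row in DATA]
iq = [row[1] for row in DATA]
abstract = [row[2] for row in DATA]

r_iq_y = corr(iq, math_score)        # simple correlation, IQ vs math score
r_ar_y = corr(abstract, math_score)  # simple correlation, abstract reasoning vs math score
r_iq_ar = corr(iq, abstract)         # correlation between the two predictors

# Partial correlation of IQ and math score, controlling for abstract reasoning.
partial_iq_y = (r_iq_y - r_iq_ar * r_ar_y) / math.sqrt(
    (1 - r_iq_ar ** 2) * (1 - r_ar_y ** 2)
)

print(f"r(IQ, y) = {r_iq_y:.3f}, partial r(IQ, y | AR) = {partial_iq_y:.3f}")
```

On these data the simple correlation between IQ and math score should come out positive while the partial correlation comes out negative, matching the sign of the IQ slope in the fitted model. The two predictors are also strongly correlated with each other, which is what allows the sign to flip.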
c. Based on the output, what are the tests of the slopes for this regression equation? That is, for each slope provide the null and alternative hypotheses, the test statistic, and the p-value of the test, and state your decision and conclusion.
Ho: B_{1} = 0, Ha: B_{1} ≠ 0. The test statistic is -1.64 with a p-value of 0.120. Since this p-value is greater than 0.05, we would NOT reject Ho. This means that when Abstract Reasoning is already in the model, IQ is not a statistically significant linear predictor of ninth grade math scores.
Ho: B_{2} = 0, Ha: B_{2} ≠ 0. The test statistic is 3.84 with a p-value of 0.001. Since this p-value is less than 0.05, we would REJECT Ho. This means that when IQ is already in the model, Abstract Reasoning is a statistically significant linear predictor of ninth grade math scores.
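Each test statistic above is the coefficient divided by its standard error, compared against a t distribution with the residual degrees of freedom (17). The Python sketch below reproduces that arithmetic from the printed Coef and SE Coef values, with the IQ coefficient entered as negative to match the interpretation in part a. The p-value helper is a hand-rolled numerical integration written for this illustration, not Minitab's method; in practice a statistics library routine would normally be used.

```python
import math

DF_ERROR = 17  # residual degrees of freedom from the ANOVA table


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    const = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return const * (1 + x * x / df) ** (-(df + 1) / 2)


def two_sided_p(t_stat, df, steps=2000):
    """Two-sided p-value via Simpson's rule on [0, |t|]; steps must be even."""
    upper = abs(t_stat)
    h = upper / steps
    total = t_pdf(0.0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    central_area = 2 * (total * h / 3)  # P(-|t| < T < |t|), using symmetry
    return 1 - central_area


# Coef / SE Coef as printed in the Minitab output.
t_iq = -0.4836 / 0.2955
t_ar = 1.0185 / 0.2656
p_iq = two_sided_p(t_iq, DF_ERROR)
p_ar = two_sided_p(t_ar, DF_ERROR)

print(f"IQ:            t = {t_iq:.2f}, p = {p_iq:.3f}")
print(f"Abstract_Reas: t = {t_ar:.2f}, p = {p_ar:.3f}")
```

Small differences from the Minitab values are possible because the inputs here are already rounded.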
d. From the output, what is the meaning of the ANOVA F-test? Provide the two hypothesis statements, the decision and the conclusion.
Ho: B_{1} = B_{2} = 0 and Ha: at least one of these slopes does not equal zero.
With a p-value of 0.000 and a test statistic of 20.35, we reject Ho and conclude that at least one of the slopes does not equal zero. NOTE: this rejection does not tell us which slope(s) is/are significant, only that at least one is.
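The F statistic and the summary line of the output (S, R-Sq, R-Sq(adj)) all follow from the ANOVA sums of squares and degrees of freedom. A minimal Python sketch of that arithmetic, using only the numbers printed in the ANOVA table (the variable names are illustrative):

```python
import math

# Sums of squares and degrees of freedom from the ANOVA table.
ss_regression, df_regression = 366.92, 2
ss_error, df_error = 153.28, 17
ss_total, df_total = 520.20, 19

ms_regression = ss_regression / df_regression  # mean square for regression
ms_error = ss_error / df_error                 # mean square error

f_stat = ms_regression / ms_error              # ANOVA F statistic
s = math.sqrt(ms_error)                        # residual standard deviation S
r_sq = ss_regression / ss_total                # R-Sq as a proportion
r_sq_adj = 1 - ms_error / (ss_total / df_total)  # R-Sq(adj) as a proportion

print(f"F = {f_stat:.2f}, S = {s:.4f}")
print(f"R-Sq = {100 * r_sq:.1f}%, R-Sq(adj) = {100 * r_sq_adj:.1f}%")
```

The results should match the Minitab output up to rounding of the printed sums of squares.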
e. Check assumptions of constant variance and normality by creating a Scatterplot under Graphs of the residuals versus each of the predictor variables. For the normality plot, see Graphs > Probability Plot > Single and graph the residuals. What are your conclusions based on these graphs?
Both scatterplots provide an indication of an outlier (bottom right of each figure). The probability plot tests the null hypothesis that the data come from a normal distribution, and that hypothesis is rejected (p-value less than 0.005). Together these give evidence that the data do not satisfy the assumptions of normality and constant variance. Handling possible outlier(s) in multiple linear regression is analogous to the methods used in simple linear regression.
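The graphs themselves are not reproduced here, but the residuals behind them can be approximated from the fitted equation. The Python sketch below is an illustration only: it assumes the 20 rows transcribed from the data table, uses the rounded coefficients and S from the Minitab output (with the IQ coefficient negative, as in part a), and scales each residual by S, which is a rough standardization that ignores leverage.

```python
# (math score, IQ, abstract reasoning) for students 1-20, transcribed from the table.
DATA = [
    (33, 95, 28), (31, 100, 24), (35, 100, 29), (38, 102, 30), (41, 103, 33),
    (37, 105, 32), (37, 106, 34), (39, 106, 36), (43, 106, 38), (40, 109, 39),
    (41, 110, 40), (44, 110, 43), (40, 111, 41), (45, 112, 42), (48, 112, 46),
    (45, 114, 44), (31, 114, 41), (47, 115, 47), (43, 117, 42), (48, 118, 49),
]

# Rounded coefficients and S as printed in the Minitab output.
B0, B1, B2 = 54.05, -0.4836, 1.0185
S = 3.00271

# Residual = observed math score minus fitted value.
residuals = [y - (B0 + B1 * iq + B2 * ar) for y, iq, ar in DATA]
scaled = [r / S for r in residuals]  # rough scaling; ignores leverage

# Locate the observation with the largest absolute residual.
worst = max(range(len(DATA)), key=lambda i: abs(residuals[i]))
student = worst + 1

print(f"Student {student}: residual = {residuals[worst]:.2f}, "
      f"residual / S = {scaled[worst]:.2f}")
```

In this sketch the largest residual belongs to student 17 (math score 31, IQ 114, Abstract Reasoning 41), a large negative residual at high predictor values, which would be consistent with the point described at the bottom right of each scatterplot.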
