next video: • video 13.3. prediction equations
prior video: • video 13.1. simple regression
closed captioning text:
Now we are going to talk about the very last parametric statistical test for this whole class. It is called "multiple regression." It is like simple regression, except that with multiple regression, you have more than one predictor variable predicting one outcome variable.
So you would use this when you have multiple predictors of one outcome variable. There is always exactly one outcome variable in a multiple regression, and this dependent variable should be scale (interval or ratio). The predictors don't need to be scale, though; they can be grouping variables or anything like that.
So, an example of when you have multiple predictors would be predicting exam grades from hours studying, like we did for simple regression, but now we could also include attendance: how many days you attended lecture. Now we have two variables predicting exam grades. We will probably be able to do a better job predicting exam grades because we have more information. We have this additional attendance information, which will probably predict exam grades in ways that hours studying didn't, because you hopefully learn things in lecture as well as from studying at home.
So here, there would be several null hypotheses. There would be one for the intercept, saying a = 0, another null hypothesis for hours studying, saying the slope for studying is 0, and a null hypothesis for attendance, saying the slope for that is 0. The null hypothesis for studying just means that there is no relationship (the slope is zero) between hours studying and exam grades, in this case, controlling for attendance. We will get into more of what "controlling for a variable" means in a little bit. The one for attendance means that the slope is zero for attendance and exam grades when you are controlling for studying. So this is one context where you would use multiple regression: when you have multiple predictors and you are trying to predict one outcome variable. Statisticians draw this relationship, where you have multiple predictors predicting one outcome variable, like this: one variable predicting the outcome variable, and another variable predicting the outcome variable. So, in our case, we are predicting grades from studying and attendance. You have studying predicting grades, and attendance predicting grades. Here, you have multiple predictors predicting the one outcome variable.
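[Not from the video: a minimal Python sketch, using statsmodels with made-up numbers, of the two-predictor model described above. The variable names and data are invented for illustration only.]

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Made-up data: exam grades driven by both hours studied and attendance.
rng = np.random.default_rng(0)
n = 40
hours = rng.uniform(0, 20, n)             # hours spent studying
attendance = rng.integers(0, 15, n)       # lectures attended
grade = 50 + 1.5 * hours + 1.0 * attendance + rng.normal(0, 5, n)

df = pd.DataFrame({"grade": grade, "hours": hours, "attendance": attendance})

# Fit grade ~ hours + attendance. For each coefficient, the summary reports
# a t-test of the null hypothesis that it equals zero: the slope for hours
# is 0 controlling for attendance, and the slope for attendance is 0
# controlling for hours (plus a test of the intercept).
model = smf.ols("grade ~ hours + attendance", data=df).fit()
print(model.summary())
```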
Another use that statisticians have for multiple regression is to see if one variable mediates the relationship between two other variables. I will say that this use is for testing mediation. Let's go through an example; that might make it easier. Let us say we already know that studying predicts grades, because we did a simple regression and there was a statistically significant relationship between the two. Now what we are wondering is whether the reason studying predicts grades is that the extra studying gets students excited about the topic, so they attend class more. When they are in class, they see the lectures and learn from them, and it is actually the attendance that causes the increase in grades. This is how you would draw a mediation relationship: studying increases attendance, and attendance increases grades. Attendance would be called the "mediator"; it is the variable that is causally between the other two. A more plausible example might be that drinking caffeine increases alertness, and increased alertness causes better detection of, I don't know, hidden pictures in an image. Statisticians use multiple regression to test for mediation, to see if one variable goes between two other variables. We are not going to get in depth about how that is done in this class, but it is a common use for multiple regression.
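[Not from the video: a rough sketch of the classic regression-based mediation check (the Baron and Kenny steps), again with invented data built to have a mediation structure. The class does not go into the formal test, so this is only meant to show the mechanics.]

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Hypothetical data: studying drives attendance, and attendance drives grades.
rng = np.random.default_rng(1)
n = 60
hours = rng.uniform(0, 20, n)
attendance = np.clip(0.5 * hours + rng.normal(0, 2, n), 0, 15)   # a path
grade = 55 + 2.0 * attendance + rng.normal(0, 5, n)              # b path

df = pd.DataFrame({"grade": grade, "hours": hours, "attendance": attendance})

# Step 1 (total effect): does studying predict grades on its own?
total = smf.ols("grade ~ hours", data=df).fit()
# Step 2 (a path): does studying predict the proposed mediator, attendance?
a_path = smf.ols("attendance ~ hours", data=df).fit()
# Step 3: regress grades on both; if the hours slope shrinks toward zero
# while attendance stays significant, attendance mediates the relationship.
both = smf.ols("grade ~ hours + attendance", data=df).fit()

print("total effect of hours:    ", round(total.params["hours"], 3))
print("hours -> attendance slope:", round(a_path.params["hours"], 3))
print("direct effect of hours:   ", round(both.params["hours"], 3))
print("attendance slope:         ", round(both.params["attendance"], 3))
```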
[closed captioning text continued in the comments]