1)You are designing a regression equation to predict the final grade in a class on a 0-100% scale. What variables do you include and why? (minimum of four variables)
2)What bias have you introduced to #1 above?
3)What do you expect the sign of the coefficients to be from #1?
4)You are presented with a variable that is in percent format (e.g., percent passing a course). You are asked to create a regression using that variable. Should you transform the variable? Why or why not?
5)How does small number bias influence the decision to include or exclude a variable?
6)What is survivorship bias?
7)You design a survey to estimate the amount of groceries that people buy. You stand outside the Whole Foods in Cambridge, MA and ask people to complete the survey. You are then asked to predict the amount of food production for someone living within ¼ mile of Northeastern University. Can you use your survey data for this purpose?
8)How can you use Cook's distance in variable error analysis?
9)Describe the difference between a tool and a weapon when it comes to data.
10)Stereotypes can often be a problem of correlation vs causation coupled with external validity. Explain why.
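
Reference sketch for question 8: the code below is one way Cook's distance could be computed by hand for a simple one-predictor regression. It is an illustration under stated assumptions, not a prescribed method. The data (`hours`, `grades`) are made up, the function name `cooks_distance` is hypothetical, and the 4/n cutoff is only a common rule of thumb.

```python
def cooks_distance(x, y):
    """Cook's distance for each point in a simple linear regression y = a + b*x."""
    n = len(x)
    p = 2  # number of fitted parameters: intercept and slope
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))

    # Ordinary least squares fit
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar

    residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
    mse = sum(e ** 2 for e in residuals) / (n - p)

    # Leverage of each point (diagonal of the hat matrix for one predictor)
    leverage = [1 / n + (xi - x_bar) ** 2 / sxx for xi in x]

    # D_i = e_i^2 / (p * MSE) * h_i / (1 - h_i)^2
    return [
        (e ** 2) / (p * mse) * h / (1 - h) ** 2
        for e, h in zip(residuals, leverage)
    ]


# Made-up data: hours studied vs. final grade, with one unusual last point
hours = [1, 2, 3, 4, 5, 6, 7, 20]
grades = [52, 55, 61, 64, 70, 73, 78, 40]

distances = cooks_distance(hours, grades)

# Rule-of-thumb cutoff (an assumption, not a strict test): flag D_i > 4/n
threshold = 4 / len(hours)
flagged = [i for i, d in enumerate(distances) if d > threshold]

for i, d in enumerate(distances):
    print(i, round(d, 3), "influential" if i in flagged else "")
```

A flagged point is a candidate for closer inspection (data entry error, different population, genuine extreme case), not an automatic deletion.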