Sunday, 20 November 2011

Assumptions of the Multiple Linear Regression Model

The model is based on six assumptions. When these assumptions hold, the regression estimators are unbiased, efficient and consistent.

  • Unbiased means that the expected value of the estimator is equal to the true value of the parameter.
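  • Efficient means that the estimator has a smaller variance than any other unbiased estimator of the same parameter.
  • Consistent means that the estimate gets closer to the true parameter value as the sample size gets larger; it is enough that both the bias and the variance of the estimator approach zero.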
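
The three properties above can be illustrated with a small simulation. The following is only a sketch under assumed settings (a single regressor, a true slope of 2.0, standard normal errors, and the hypothetical helper name `ols_slope_estimates`); none of these values come from the post itself.

```python
import numpy as np

def ols_slope_estimates(n, reps, beta=2.0, seed=0):
    """Return `reps` OLS slope estimates, each from a simulated sample of size n."""
    rng = np.random.default_rng(seed)
    estimates = np.empty(reps)
    for r in range(reps):
        x = rng.normal(size=n)
        e = rng.normal(size=n)            # errors: mean 0, constant variance, independent
        y = 1.0 + beta * x + e            # assumed true model: intercept 1.0, slope beta
        X = np.column_stack([np.ones(n), x])
        coef, *_ = np.linalg.lstsq(X, y, rcond=None)
        estimates[r] = coef[1]            # keep the slope estimate only
    return estimates

small = ols_slope_estimates(n=20, reps=2000)
large = ols_slope_estimates(n=500, reps=2000)

# Unbiased: the average estimate is close to the true slope for both sample sizes.
print("mean of estimates:", small.mean(), large.mean())
# Consistent: the spread of the estimates shrinks as the sample size grows.
print("variance of estimates:", small.var(), large.var())
```

In runs of this kind the average estimate should sit near 2.0 for both sample sizes, while the variance for n = 500 should be much smaller than for n = 20.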

1. The relationship between the dependent variable, Y, and the independent variables, X1, X2... Xk, is linear.

2. The independent variables are not random, and no exact linear relation (perfect multicollinearity) exists between two or more of the independent variables.

3. The expected value of the error term, conditional on the independent variables, is 0.

4. The variance of the error term is constant for all observations, i.e. the errors are homoskedastic.

5. The error term is uncorrelated across observations (i.e. no serial correlation).

6. The error term is normally distributed.
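
For reference, the model and assumptions 3 to 6 can be written compactly as below. The symbols b_0, ..., b_k for the coefficients, ε for the error term, σ² for its variance and n for the number of observations are notation assumed here; the post itself does not name them.

```latex
\begin{align*}
Y_i &= b_0 + b_1 X_{1i} + b_2 X_{2i} + \dots + b_k X_{ki} + \varepsilon_i,
      \qquad i = 1, \dots, n \\
E(\varepsilon_i \mid X_{1i}, \dots, X_{ki}) &= 0
      && \text{(assumption 3)} \\
\operatorname{Var}(\varepsilon_i) &= \sigma^2 \quad \text{for all } i
      && \text{(assumption 4)} \\
\operatorname{Cov}(\varepsilon_i, \varepsilon_j) &= 0 \quad \text{for } i \neq j
      && \text{(assumption 5)} \\
\varepsilon_i &\sim N(0, \sigma^2)
      && \text{(assumption 6)}
\end{align*}
```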

It is important to note that a linear regression cannot be estimated when an exact linear relationship exists between two or more independent variables. When two or more independent variables are highly correlated but not exactly related, the regression can still be estimated, but it suffers from the multicollinearity problem, which makes the individual coefficient estimates imprecise. Even if an independent variable is random, the regression results remain reliable as long as it is uncorrelated with the error term.
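
The difference between an exact linear relationship and a merely high correlation can be seen numerically. This is a rough sketch on made-up data; the variable names, the sample size of 200 and the noise scale of 0.01 are all assumptions chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200
x1 = rng.normal(size=n)
x2 = rng.normal(size=n)
const = np.ones(n)

# Exact linear relation: the third regressor is a linear combination of x1 and x2.
X_exact = np.column_stack([const, x1, x2, 2.0 * x1 - x2])
rank_exact = np.linalg.matrix_rank(X_exact)   # fewer than 4, so X'X cannot be inverted

# High but not exact correlation: the third regressor is x1 plus a little noise.
X_near = np.column_stack([const, x1, x2, x1 + 0.01 * rng.normal(size=n)])
rank_near = np.linalg.matrix_rank(X_near)     # full rank, so estimation is possible
cond_near = np.linalg.cond(X_near)            # but the design matrix is ill-conditioned

# Baseline: a third regressor generated independently of x1 and x2.
X_ok = np.column_stack([const, x1, x2, rng.normal(size=n)])
cond_ok = np.linalg.cond(X_ok)

print("rank with exact relation:", rank_exact, "of", X_exact.shape[1], "columns")
print("rank with high correlation:", rank_near)
print("condition numbers (near-collinear vs. baseline):", cond_near, cond_ok)
```

A rank below the number of columns means the least squares coefficients are not uniquely determined. A very large condition number signals that the coefficients can be computed but are sensitive to small changes in the data.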

1 comment:

  1. I believe that the last assumption you indicate (about normally distributed errors) is only true, strictly speaking, for the optimality of regressions which minimize the squared error.

    Regardless, one may apply least squares linear regression whether or not several of these assumptions are violated. Whether a regression is useful is not the same thing as whether it is theoretically optimal.