If, in addition to those four conditions, the error has the same variance for any values of the regressors (i.e., the error is homoscedastic), then the OLS estimators are the best (i.e., they have the narrowest sampling distribution) among all linear unbiased estimators. In other words, when all five conditions hold, you cannot find an estimator that is linear and unbiased AND has a narrower sampling distribution than OLS. These five conditions, typically referred to as the Gauss-Markov conditions, place no restriction on the distribution of the errors. If we further restrict that distribution to be normal, it can be shown that the OLS estimators are the best among all unbiased estimators (no longer limited to linear ones).
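
The "best among linear unbiased estimators" claim can be illustrated with a small simulation. The sketch below is only an illustration under stated assumptions: the true coefficients, the sample size, the uniform error distribution, and the competing "endpoint" estimator (the slope of the line through the first and last observations) are all hypothetical choices, not part of the original writeup. Both estimators are linear in y and unbiased, so their averages should sit near the true slope, while OLS should show the smaller variance.

```python
import random
import statistics

random.seed(0)

n, reps = 30, 5000
beta0, beta1 = 1.0, 2.0                      # hypothetical true coefficients
x = [10.0 * i / (n - 1) for i in range(n)]   # regressors held fixed across replications
x_bar = statistics.fmean(x)
sxx = sum((xi - x_bar) ** 2 for xi in x)

ols_slopes, endpoint_slopes = [], []
for _ in range(reps):
    # Homoscedastic but non-normal errors: uniform on [-3, 3], same variance for every x
    y = [beta0 + beta1 * xi + random.uniform(-3.0, 3.0) for xi in x]

    # OLS slope: sum((x - x_bar) * y) / sum((x - x_bar)^2)
    ols_slopes.append(sum((xi - x_bar) * yi for xi, yi in zip(x, y)) / sxx)

    # A competing linear unbiased estimator: slope through the first and last points
    endpoint_slopes.append((y[-1] - y[0]) / (x[-1] - x[0]))

print("mean of OLS slopes:     ", statistics.fmean(ols_slopes))
print("mean of endpoint slopes:", statistics.fmean(endpoint_slopes))
print("variance of OLS slopes:     ", statistics.pvariance(ols_slopes))
print("variance of endpoint slopes:", statistics.pvariance(endpoint_slopes))
```

The errors here are deliberately non-normal, since the Gauss-Markov conditions do not require normality. A single competing estimator does not prove the theorem, of course; it only shows one instance of what the theorem guarantees.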

Assuming normally distributed errors has another positive consequence: it makes the sampling distribution of the OLS estimators exactly normal even when the sample size is small. (Note that, because OLS estimators are built from sample averages, the CLT implies that with a large enough sample their sampling distribution is approximately normal regardless of the distribution of the errors.) When the OLS estimators are normally distributed, the t and F statistics follow, respectively, t and F distributions. Knowing the estimators' sampling distributions allows us to make statistical inferences (e.g., construct prediction intervals and confidence intervals).
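
A rough way to see why the exact t distribution matters in small samples is to check how often a 95% confidence interval for the slope covers the true value. The sketch below assumes normal errors, a hypothetical sample of only 10 observations, and made-up true coefficients. The critical value 2.306 is the standard table value for the 97.5th percentile of the t distribution with 8 degrees of freedom, hard-coded here to keep the example dependency-free.

```python
import math
import random

random.seed(1)

n, reps = 10, 10000
beta0, beta1, sigma = 1.0, 2.0, 1.5          # hypothetical true values
x = [float(i) for i in range(n)]             # regressors held fixed
x_bar = sum(x) / n
sxx = sum((xi - x_bar) ** 2 for xi in x)

T_CRIT = 2.306   # 97.5th percentile of the t distribution with n - 2 = 8 df (table value)
Z_CRIT = 1.960   # 97.5th percentile of the standard normal

covered_t = covered_z = 0
for _ in range(reps):
    # Normal, homoscedastic errors
    y = [beta0 + beta1 * xi + random.gauss(0.0, sigma) for xi in x]
    y_bar = sum(y) / n

    b1 = sum((xi - x_bar) * yi for xi, yi in zip(x, y)) / sxx
    b0 = y_bar - b1 * x_bar

    # Residual variance estimate with n - 2 degrees of freedom
    ssr = sum((yi - b0 - b1 * xi) ** 2 for xi, yi in zip(x, y))
    se_b1 = math.sqrt(ssr / (n - 2) / sxx)

    # The interval b1 +/- crit * se_b1 covers beta1 exactly when |t_stat| <= crit
    t_stat = (b1 - beta1) / se_b1
    covered_t += abs(t_stat) <= T_CRIT
    covered_z += abs(t_stat) <= Z_CRIT

coverage_t = covered_t / reps
coverage_z = covered_z / reps
print("95% interval coverage, t critical value:     ", coverage_t)
print("95% interval coverage, normal critical value:", coverage_z)
```

With normal errors, the interval built from the t critical value should cover the true slope close to 95% of the time, while the interval built from the normal critical value should fall noticeably short at this sample size.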

Covers: theory of Gauss-Markov Assumptions
Estimated time needed to finish: 9 minutes
Questions this item addresses:
  • Why do we need homoscedasticity and error normality assumptions?

Linear regression

Total time needed: ~37 minutes
You will learn the assumptions behind linear regression and, more importantly, what happens when those assumptions are violated.
Potential Use Cases
refreshing knowledge on linear regression, preparing for job interview questions
Who Is This For?
BEGINNER: people entering the field of data science and data scientists who think statistics is their weak point
WRITEUP 1. What is linear regression
  • What is linear regression?
10 minutes
WRITEUP 2. Ordinary Least Squares
  • What is Ordinary Least Squares?
10 minutes
WRITEUP 3. Homoscedasticity and error normality
  • Why do we need homoscedasticity and error normality assumptions?
9 minutes
ARTICLE 4. Potential business consequences of violating the GM assumptions
  • What are some of the consequences of violating the GM assumptions?
8 minutes
