Studentized residual

PrepNuggets

LEVEL II

To identify outliers, the preferred method is to use studentized residuals. For each data point i, the point is deleted and the regression model is re-estimated on the remaining n - 1 data points. The residual of the deleted point is its actual Y-value minus the Y-value predicted by this re-estimated regression. Repeating this for every data point gives a residual for each observation, together with an estimate of the standard deviation of that residual.
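The delete-and-refit procedure can be sketched in a few lines of Python. This is a minimal illustration using NumPy, not a prescribed implementation; the function name `deleted_residuals` and the sample data are hypothetical.

```python
import numpy as np

def deleted_residuals(X, y):
    """For each point i: refit the regression without i, then
    return actual y[i] minus the prediction for point i."""
    n = len(y)
    # Design matrix with an intercept column prepended.
    A = np.column_stack([np.ones(n), X])
    out = np.empty(n)
    for i in range(n):
        keep = np.arange(n) != i  # mask that drops point i
        beta, *_ = np.linalg.lstsq(A[keep], y[keep], rcond=None)
        out[i] = y[i] - A[i] @ beta
    return out

# Hypothetical data: the first five points lie exactly on Y = 2X,
# while the last point sits well above that line.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
y = np.array([2.0, 4.0, 6.0, 8.0, 10.0, 20.0])
d = deleted_residuals(x, y)
```

In this made-up data set, the regression fitted without the last point predicts 12 at X = 6, so that point's residual is about 20 - 12 = 8.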

For each data point, the studentized residual is this residual divided by its standard deviation. It reflects the number of standard deviations by which the data point lies away from the regression line. To test whether a particular point is an outlier, compare its studentized residual with the critical value of the t-distribution with n - k - 2 degrees of freedom, where n is the number of observations and k is the number of independent variables. Points whose studentized residuals fall in the rejection region are termed outliers, and they are potentially influential.
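The outlier test can be sketched as follows, assuming NumPy and SciPy are available. Rather than refitting n times, this sketch uses the standard closed-form shortcut based on the full-sample residuals and leverage values, which gives the same studentized residuals as deleting each point in turn. The function names, the two-tailed 5% significance level, and the data are illustrative assumptions.

```python
import numpy as np
from scipy import stats

def studentized_deleted_residuals(X, y):
    """Studentized (deleted) residuals from a single full-sample fit."""
    n = len(y)
    A = np.column_stack([np.ones(n), X])  # intercept + regressors
    p = A.shape[1]                        # p = k + 1 parameters
    h = np.diag(A @ np.linalg.pinv(A))    # leverage of each point
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    e = y - A @ beta                      # ordinary residuals
    sse = e @ e
    # Closed form; note n - p - 1 equals n - k - 2.
    return e * np.sqrt((n - p - 1) / (sse * (1 - h) - e**2))

def flag_outliers(X, y, alpha=0.05):
    """Compare |studentized residual| with the two-tailed t critical value."""
    n = len(y)
    k = 1 if np.ndim(X) == 1 else np.shape(X)[1]
    t = studentized_deleted_residuals(X, y)
    t_crit = stats.t.ppf(1 - alpha / 2, df=n - k - 2)
    return t, t_crit, np.abs(t) > t_crit

# Hypothetical data: roughly Y = 2X, with the last point far off the line.
x = np.arange(1.0, 11.0)
y = np.array([2.1, 3.9, 6.2, 7.8, 10.1, 11.9, 14.2, 15.8, 18.1, 30.0])
t, t_crit, flagged = flag_outliers(x, y)
```

With n = 10 and k = 1, the test uses 7 degrees of freedom, and only the last point in this made-up sample falls in the rejection region.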

See also: Influence analysis, High leverage, Cook’s Distance, Influence plot