Influential data point



A data point is considered influential if its exclusion causes substantial changes in the estimated regression.

Outliers and high-leverage points are not necessarily influential; they are only potentially influential. Cook’s distance is a well-known metric for identifying influential data points.
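As a minimal sketch of how Cook’s distance can be computed for an ordinary least squares fit, the following uses NumPy only. The function name `cooks_distance` and the six data values are hypothetical, chosen so that the last observation has both high leverage and a large residual.

```python
import numpy as np

def cooks_distance(X, y):
    """Cook's distance of each observation for an OLS fit with intercept."""
    X = np.column_stack([np.ones(len(y)), X])     # prepend intercept column
    n, p = X.shape                                # p counts the intercept too
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)  # least squares coefficients
    resid = y - X @ beta
    # Leverages are the diagonal of the hat matrix H = X (X'X)^-1 X'
    h = np.diag(X @ np.linalg.pinv(X.T @ X) @ X.T)
    s2 = resid @ resid / (n - p)                  # residual variance estimate
    return resid**2 / (p * s2) * h / (1.0 - h)**2

# Hypothetical data: the last point is far from the others in x and off the trend
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 10.0])
y = np.array([1.1, 1.9, 3.2, 3.9, 5.1, 2.0])

d = cooks_distance(x, y)
most_influential = int(np.argmax(d))  # index of the largest Cook's distance
print(most_influential, d.round(3))
```

In this made-up data set the last observation has by far the largest Cook’s distance. Cutoffs for calling a point influential are rules of thumb and vary between sources, so the values are usually compared against each other rather than against a fixed threshold.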

If we remove an influential data point from the regression analysis and re-estimate the model, the fit may improve and the p-values of the coefficients may be lower, indicating greater confidence in the estimated coefficients.
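The effect of removing a point and re-estimating can be illustrated with a small sketch. The data and the helper `fit_line` are hypothetical, and the example reports only the slope and R² (p-values would need a statistics library and are omitted here).

```python
import numpy as np

def fit_line(x, y):
    """Fit y = slope * x + intercept by least squares; return slope, intercept, R^2."""
    slope, intercept = np.polyfit(x, y, deg=1)
    resid = y - (slope * x + intercept)
    r2 = 1.0 - (resid @ resid) / np.sum((y - y.mean())**2)
    return slope, intercept, r2

# Hypothetical data: the point at index 5 is assumed to be the influential one
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 10.0])
y = np.array([1.1, 1.9, 3.2, 3.9, 5.1, 2.0])

full = fit_line(x, y)                  # fit using every observation
keep = np.arange(len(x)) != 5          # mask that drops the suspect point
reduced = fit_line(x[keep], y[keep])   # fit re-estimated without it

print("all points:    slope=%.3f  R^2=%.3f" % (full[0], full[2]))
print("point removed: slope=%.3f  R^2=%.3f" % (reduced[0], reduced[2]))
```

With these values the slope and R² change substantially once the point is dropped, which is what marks it as influential. A large change by itself does not show that the point is erroneous or that it should be discarded.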

See also: Influence analysis, Influence plot

Influential point