We can see how outliers negatively influence the fit of the regression line in the second plot. To identify influential points in the second dataset, we can fit the linear regression model to the dataset with outliers and then find Cook's distance for each observation in the dataset.
A large value of Cook's distance for a data point indicates that the point strongly influences the fitted values.
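What "strongly influences the fitted values" means can be made concrete by refitting the model without each point in turn and measuring how far all the fitted values move. The following is a minimal sketch in Python, not a reference implementation. It assumes a simple linear regression with one predictor (so p = 2 coefficients), and the data and the function names fit_line and cooks_distance_loo are hypothetical, chosen only for illustration.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = a + b*x; returns (a, b)."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = y_bar - b * x_bar
    return a, b


def cooks_distance_loo(xs, ys):
    """Cook's distance by deletion: refit without point i and sum the
    squared shifts in all fitted values, scaled by p * MSE."""
    n = len(xs)
    p = 2  # intercept and slope (assumption: one predictor)
    a, b = fit_line(xs, ys)
    fitted = [a + b * x for x in xs]
    mse = sum((y - f) ** 2 for y, f in zip(ys, fitted)) / (n - p)

    distances = []
    for i in range(n):
        # Refit the line with observation i left out.
        a_i, b_i = fit_line(xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:])
        # Compare fitted values at ALL original x positions.
        shift = sum((f - (a_i + b_i * x)) ** 2 for f, x in zip(fitted, xs))
        distances.append(shift / (p * mse))
    return distances


# Hypothetical data: the last observation is an outlier.
xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
ys = [1.1, 1.9, 3.2, 3.9, 5.1, 6.0, 7.1, 20.0]

d = cooks_distance_loo(xs, ys)
most_influential = max(range(len(d)), key=d.__getitem__)
print("Cook's distances:", [round(v, 3) for v in d])
print("Most influential observation index:", most_influential)
```

Refitting n times is wasteful for large datasets, but it makes the link between a large distance and a large change in the fitted values explicit.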
Cook's distance is a measure computed with respect to a given regression model and therefore is impacted … Cook's distance and leverage are used to detect highly influential data points. A Cook's distance plot shows whether there are "influential" data points (that is, data points without which the result would be significantly different).
Functions that plot a Cook's distance graph document arguments such as: the levels of Cook's distance at which to draw contours; a logical indicating if a smoother should be added; a logical variable to indicate whether to print the graph in a new window; and an integer indicating the number of top Cook's distances to be labelled in the plot.
In a practical ordinary least squares analysis, Cook's distance can be used in several ways: to indicate influential data points that are particularly worth checking for validity; or to indicate regions of the design space where it would be good to be able to obtain more data points.

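The formula for Cook's distance is:

$$D_i = \frac{r_i^2}{p \cdot \mathrm{MSE}} \cdot \frac{h_{ii}}{(1 - h_{ii})^2}$$

where $r_i$ is the $i$-th residual, $p$ is the number of coefficients in the regression model, $\mathrm{MSE}$ is the mean squared error, and $h_{ii}$ is the leverage of the $i$-th observation (the $i$-th diagonal element of the hat matrix).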
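The formula above can be evaluated directly, without refitting the model for each observation. The sketch below is a minimal Python illustration under stated assumptions: a simple linear regression with one predictor (so p = 2), for which the leverage h_ii has the closed form 1/n + (x_i - mean(x))^2 / Sxx, and MSE taken as the residual sum of squares divided by n - p. The function name cooks_distance_formula and the data are hypothetical.

```python
def cooks_distance_formula(xs, ys):
    """Cook's distance D_i = (r_i^2 / (p * MSE)) * (h_ii / (1 - h_ii)^2)
    for the simple linear regression y = a + b*x."""
    n = len(xs)
    p = 2  # intercept and slope (assumption: one predictor)

    # Ordinary least squares estimates.
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = y_bar - b * x_bar

    # Residuals r_i and mean squared error.
    residuals = [y - (a + b * x) for x, y in zip(xs, ys)]
    mse = sum(r ** 2 for r in residuals) / (n - p)

    # Leverage h_ii: diagonal of the hat matrix for one predictor.
    leverages = [1.0 / n + (x - x_bar) ** 2 / sxx for x in xs]

    return [
        (r ** 2 / (p * mse)) * (h / (1.0 - h) ** 2)
        for r, h in zip(residuals, leverages)
    ]


# Hypothetical data: the last observation is an outlier.
xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
ys = [1.1, 1.9, 3.2, 3.9, 5.1, 6.0, 7.1, 20.0]

distances = cooks_distance_formula(xs, ys)
for i, value in enumerate(distances):
    print(f"observation {i}: D = {value:.3f}")
```

The two factors separate the roles of the residual and the leverage: a point needs both a sizeable residual and non-trivial leverage to produce a large distance.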

In statistics, Cook's distance or Cook's D is a commonly used estimate of the influence of a data point when performing a least-squares regression analysis. It is named after the American statistician R. Dennis Cook, who introduced the concept.