TECHNOLOGY
What is an outlier? - A data point that's very different from the rest
What is a point outlier? - A single value that is far from the rest of the data
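A minimal Python sketch of one common way to flag point outliers, the 1.5 x IQR rule (Tukey's fences). The rule itself is a swapped-in illustration, not something these notes prescribe; the function name `iqr_outliers`, the multiplier `k`, and the sample data are all hypothetical.

```python
import statistics

def iqr_outliers(values, k=1.5):
    """Return the values lying outside [Q1 - k*IQR, Q3 + k*IQR]."""
    # statistics.quantiles with n=4 returns the three quartile cut points.
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < low or v > high]

# Made-up sample: seven ordinary readings and one extreme one.
data = [10, 12, 11, 13, 12, 11, 10, 95]
print(iqr_outliers(data))  # [95]
```

The multiplier 1.5 is only a convention; a larger `k` flags fewer points.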
What is a contextual outlier? - A value that isn't far from the rest overall, but is far from points
nearby in time (think: untimely spike in time series data)
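A rough Python sketch of the contextual idea, assuming it is enough to compare each point with the median of its neighbors in time. The function name `contextual_outliers`, the parameters `window` and `max_jump`, and the series are hypothetical and chosen only for illustration.

```python
import statistics

def contextual_outliers(series, window=2, max_jump=5.0):
    """Return indices whose value differs from the median of nearby
    points (up to `window` on each side) by more than `max_jump`."""
    flagged = []
    for i, value in enumerate(series):
        lo = max(0, i - window)
        hi = min(len(series), i + window + 1)
        # Neighbors in time, excluding the point being checked.
        neighbors = series[lo:i] + series[i + 1:hi]
        if abs(value - statistics.median(neighbors)) > max_jump:
            flagged.append(i)
    return flagged

# Made-up upward trend. The 4 at index 7 is well inside the overall
# range (2 to 24), but its neighbors in time are around 16.
series = [2, 4, 6, 8, 10, 12, 14, 4, 18, 20, 22, 24]
print(contextual_outliers(series))  # [7]
```

A purely global check would miss this point, because 4 is an ordinary value early in the series.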
What is a collective outlier? - A range of points that is abnormal as a group, even though no single
point stands out (think: something is missing in a range of points, but we can't tell exactly where)
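One way a collective outlier can show up is a stretch where the usual pattern goes missing while every individual value still looks ordinary. The Python sketch below assumes a signal that normally oscillates and flags windows whose spread is unusually small. The function name `flat_windows`, the parameters `width` and `min_spread`, and the signal are hypothetical.

```python
import statistics

def flat_windows(series, width=4, min_spread=0.5):
    """Return (start, end) index pairs of windows whose population
    standard deviation falls below `min_spread`."""
    flagged = []
    for start in range(len(series) - width + 1):
        chunk = series[start:start + width]
        if statistics.pstdev(chunk) < min_spread:
            flagged.append((start, start + width))
    return flagged

# Made-up signal that alternates 1, 5 and then goes flat at 3.
# Each 3 lies inside the normal range; only the run as a whole is odd.
signal = [1, 5, 1, 5, 1, 5, 3, 3, 3, 3, 1, 5, 1, 5]
print(flat_windows(signal))  # [(6, 10)]
```

The sketch points at a range rather than at a single point, which matches the group nature of this kind of outlier.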
What should we do with an outlier if it is caused by bad data? - Either remove it or replace it with
an imputed value
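Both options for bad data can be sketched in a few lines of Python. This assumes the bad points have already been identified by index and uses the median of the remaining values as the imputed value, which is only one of several reasonable choices. The helper names `drop_points` and `impute_with_median` and the readings are hypothetical.

```python
import statistics

def drop_points(values, bad_indices):
    """Option 1: remove the bad points entirely."""
    bad = set(bad_indices)
    return [v for i, v in enumerate(values) if i not in bad]

def impute_with_median(values, bad_indices):
    """Option 2: replace each bad point with the median of the good ones."""
    bad = set(bad_indices)
    fill = statistics.median([v for i, v in enumerate(values) if i not in bad])
    return [fill if i in bad else v for i, v in enumerate(values)]

# Made-up readings; -999 stands in for a sensor error code at index 3.
readings = [10, 12, 11, -999, 13, 12]
print(drop_points(readings, [3]))         # [10, 12, 11, 13, 12]
print(impute_with_median(readings, [3]))  # [10, 12, 11, 12, 13, 12]
```

Dropping shortens the data, which matters for time series with fixed spacing; imputing keeps the length but adds a value that was never observed.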
What should we do with an outlier if it is real/correct data? - Carefully consider whether to keep
it in the model or leave it out. You can also fit a logistic regression model to estimate the
probability of an outlier occurring under given conditions.
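A bare-bones Python sketch of the logistic idea, assuming a single numeric condition and a 0/1 label marking whether an outlier was observed. The model is fit with plain gradient descent to stay dependency-free; in practice a statistics library would normally do the fitting. The names `fit_logistic` and `predict_proba`, the learning rate, the epoch count, and the data are all hypothetical.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def fit_logistic(xs, ys, lr=0.1, epochs=5000):
    """Fit p(outlier | x) = sigmoid(w*x + b) by gradient descent
    on the average log loss."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        grad_w = grad_b = 0.0
        for x, y in zip(xs, ys):
            error = sigmoid(w * x + b) - y
            grad_w += error * x
            grad_b += error
        w -= lr * grad_w / n
        b -= lr * grad_b / n
    return w, b

def predict_proba(x, w, b):
    """Estimated probability of an outlier at condition value x."""
    return sigmoid(w * x + b)

# Made-up data: x is some condition (say, a severity score from 0 to 7),
# y is 1 if an outlier was observed under that condition.
xs = [0, 1, 2, 3, 4, 5, 6, 7]
ys = [0, 0, 0, 1, 0, 1, 1, 1]

w, b = fit_logistic(xs, ys)
print(predict_proba(1, w, b), predict_proba(6, w, b))
```

With this toy data the estimated probability rises with the condition value, so the model describes when outliers are likely instead of discarding them.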