Discussion about this post

Duncan Black

Good article. I'll add a couple of things. First, mathematical statistics can't fix bad data (this should be obvious); it can only look for patterns in the data. The linear regression example is actually a much worse problem with high-dimensional data. In 2D it's very easy to plot the data and see whether things make sense; with 100-dimensional data I can't do that easily. In fact, in high dimensions some very bad things can happen. There are tools to deal with this, though. Linear regression has a concept called leverage: that outlier point has massive leverage and would call the regression into question instantly. Also keep in mind that the linear regression equation is pretty meaningless unless you make some assumptions about the error terms. If those assumptions are wrong, you can be in trouble. How much trouble kinda depends
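To make the leverage point concrete, here is a minimal sketch, not anything from the article itself. It assumes an ordinary least-squares fit with an intercept and computes each point's leverage as the diagonal of the hat matrix H = X (XᵀX)⁻¹ Xᵀ. The function name `leverage`, the synthetic data, and the 2p/n cutoff (a common rule of thumb, not a hard rule) are all illustrative choices.

```python
import numpy as np

def leverage(x):
    """Leverage of each observation: the diagonal of the hat matrix
    H = X (X^T X)^{-1} X^T, where X is the design matrix with an intercept.
    `x` may be 1-D (one predictor) or 2-D (n rows, several predictors)."""
    x = np.asarray(x, dtype=float)
    X = np.column_stack([np.ones(len(x)), x])  # prepend intercept column
    # With a reduced QR factorization X = QR, H = Q Q^T, so h_ii is the
    # squared norm of row i of Q. This avoids inverting X^T X directly.
    Q, _ = np.linalg.qr(X)
    return np.sum(Q**2, axis=1)

# Synthetic illustration (assumed data): 20 typical points plus one
# point far away from the rest in x.
rng = np.random.default_rng(0)
x = np.append(rng.normal(0.0, 1.0, 20), 15.0)

h = leverage(x)
n, p = len(x), 2                        # p = number of fitted coefficients
flagged = np.where(h > 2 * p / n)[0]    # rule-of-thumb cutoff, an assumption

print("index of largest leverage:", int(np.argmax(h)))
print("flagged indices:", flagged.tolist())
```

The same function works unchanged on 100-dimensional predictors, which is where it earns its keep: you can't eyeball a 100D scatter plot, but you can still sort the leverages and look at the largest ones.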
