On Regression To The Mean
05 January, 2022 - 6 min read
Periods of above-average results tend to be followed by below-average ones.
If you throw twenty darts at a target and manage to hit the bull’s-eye eighteen times, the next time you throw twenty darts, it probably won’t go as well. There is no hidden cause behind this; it is simply how randomness behaves.
What goes up must come down and what goes down must come up.
An anomaly in statistics (an extreme value) will tend to be closer to the average the next time it is measured. Likewise, if a value turns out to be extreme on its second measurement, it was probably closer to the average on the first.
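A small simulation can make this concrete. The sketch below is only an illustration under stated assumptions: each measurement is modeled as a stable "skill" plus independent random noise, both normally distributed with the same spread, and the function name and parameters are made up for this example. It picks the top 10% of first measurements and compares that group's average on the first and second measurement.

```python
import random
import statistics


def simulate_retest(n=10_000, skill_sd=1.0, noise_sd=1.0, top_fraction=0.1, seed=0):
    """Measure n people twice and follow the top scorers of the first round.

    Assumed model (not from the article): score = stable skill + fresh noise.
    """
    rng = random.Random(seed)
    skills = [rng.gauss(0.0, skill_sd) for _ in range(n)]
    # Each measurement adds new, independent noise to the same skill.
    first = [s + rng.gauss(0.0, noise_sd) for s in skills]
    second = [s + rng.gauss(0.0, noise_sd) for s in skills]

    # Keep only the people whose first score falls in the top fraction.
    cutoff = sorted(first)[int(n * (1 - top_fraction))]
    selected = [i for i in range(n) if first[i] >= cutoff]

    first_mean = statistics.fmean(first[i] for i in selected)
    second_mean = statistics.fmean(second[i] for i in selected)
    return first_mean, second_mean


first_mean, second_mean = simulate_retest()
print(f"top group, first measurement:  {first_mean:.2f}")
print(f"top group, second measurement: {second_mean:.2f}")
```

The selected group still scores above the overall average of zero the second time, because skill is real, but noticeably lower than on the first measurement, because part of what put them in the top group was luck that does not repeat.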
“Regression” was discovered and named in the late nineteenth century by Sir Francis Galton, a half-cousin of Charles Darwin. Galton compared the heights of children with those of their parents. He found that when the parents’ average height was above the population mean, the children tended to be shorter than their parents. Likewise, when the parents’ average height was below the population mean, the children tended to be taller than their parents. Galton called this phenomenon regression toward mediocrity.
Regression to the mean is a statistical phenomenon. It can lead us to wrongly conclude that an effect is due to some variable when it is really due to chance. Ignoring it can lead to errors in everyday decision making.
Mean is just another word for average. Regression to the mean explains why extreme events are usually followed by something more typical, closer to the expected mean. For example, an amateur swimmer who sets a personal record cannot be expected to keep breaking it time after time. When chance plays a large part, a repeat of a rare result is as rare as its first occurrence, so it shouldn’t be expected the next time. We should never draw firm conclusions from a small set of observations. A small sample tells us little beyond the fact that what happened was within the range of possible outcomes. While first impressions can be accurate, we should treat them with skepticism. More data can help us distinguish what is likely from what is an anomaly.
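To put a rough number on how rare a repeat can be, here is a minimal sketch using the dart example from earlier. It assumes, purely for illustration, that every throw is independent and that the thrower hits the bull’s-eye 60% of the time; that figure and the helper name `prob_at_least` are assumptions, not something measured.

```python
from math import comb


def prob_at_least(k, n, p):
    """Probability of at least k successes in n independent tries (binomial tail)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


p_hit = 0.6  # assumed long-run hit rate, chosen only for illustration
chance = prob_at_least(18, 20, p_hit)
print(f"chance of 18 or more hits in 20 throws: {chance:.4%}")
```

Under these assumptions, a round of eighteen hits is well under a one-in-a-hundred event, and having just thrown one does not make the next one any more likely.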
Business & investing
Regression to the mean is a powerful force in business and investing. Periods of above-average performance tend to be followed by below-average ones, and vice versa. As Charlie Munger said in Poor Charlie’s Almanack:
Mimicking the herd invites regression to the mean. — Charlie Munger
His investing partner Warren Buffett puts it another way:
Most people get interested in stocks when everyone else is. The time to get interested is when no one else is. You can’t buy what is popular and do well. — Warren Buffett
Jason Zweig, a financial journalist at The Wall Street Journal, writes:
My role, therefore, is to bet on regression to the mean even as most investors, and financial journalists, are betting against it. I try to talk readers out of chasing whatever is hot and, instead, to think about investing in what is not hot. Instead of pandering to investors’ own worst tendencies, I try to push back. My role is also to remind them constantly that knowing what not to do is much more important than what to do. Approximately 99% of the time, the single most important thing investors should do is absolutely nothing. — Jason Zweig
Peter Bevelin, another well-respected investor, shares his thoughts:
Regression to the mean is not a natural law. Merely a statistical tendency. And it may take a long time before it happens. — Peter Bevelin
Michael Mauboussin, a researcher in the investing world, makes a similar point:
Understanding and using the phenomenon of reversion to the mean is essential in making sound predictions [decisions]… Reversion to the mean is most pronounced at the extremes, so the first lesson is to recognize that when you see extremely good or bad results, they are unlikely to continue that way. This doesn’t mean that good results will necessarily be followed by bad results, or vice versa, but rather that the next thing that happens will probably be closer to the average of all things that happen. — Michael Mauboussin
Regression to the mean can frequently be observed in sports. Take Michael Jordan, one of the best basketball players of all time. His two sons were both talented players, but nowhere near Jordan's level, and neither made it to the NBA.
The same phenomenon shows up in individual performances, where chance plays a big role. A professional sprinter running into a strong headwind will post a mediocre time, and an average athlete running with a strong tailwind may post a spectacular one. But when there is no wind, things even out over time. Nassim Taleb observes in his book Fooled by Randomness:
The ‘hot hand in basketball’ is another example of misperception of random sequences: It is very likely in a large sample of players for one of them to have an inordinately lengthy lucky streak. As a matter of fact it is very unlikely that an unspecified player somewhere doesn’t have an inordinately lengthy lucky streak. This is a manifestation of the mechanism called regression to the mean….in real life, the larger the deviation from the norm, the larger the probability of it coming from luck rather than skills…This can be easily verified in stories of very prominent people in trading rapidly reverting to obscurity, like the heroes I used to watch in trading rooms. — Nassim Taleb
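Taleb's point about large samples can be checked with a quick simulation. The sketch below is an illustration only: it assumes 1,000 hypothetical players who each take 100 shots and make every shot with probability 0.5, so there is no skill difference and no "hot hand" at all. The numbers and function names are invented for this example.

```python
import random


def longest_streak(shots):
    """Length of the longest run of consecutive made shots."""
    best = current = 0
    for made in shots:
        current = current + 1 if made else 0
        best = max(best, current)
    return best


def simulate_league(players=1000, shots=100, p_make=0.5, seed=1):
    """Longest streak for each player when every shot is a pure coin flip."""
    rng = random.Random(seed)
    return [
        longest_streak(rng.random() < p_make for _ in range(shots))
        for _ in range(players)
    ]


streaks = simulate_league()
median_streak = sorted(streaks)[len(streaks) // 2]
print("typical (median) longest streak:", median_streak)
print("longest streak in the whole league:", max(streaks))
```

Even though every player here is identical, somebody in the league ends up with a streak far longer than the typical one. Seen in isolation, that player looks exceptional; seen as the luckiest of a thousand coin-flippers, they are exactly what chance predicts.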
Correlation & Regression
Regression to the mean occurs whenever a nonrandom sample is selected from a population and two imperfectly correlated variables are measured. The less correlated the two variables, the larger the effect of regression to the mean. The more extreme the value from the population mean, the more room there is to regress to the mean.
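The link between correlation and the size of the effect can be sketched numerically. The example below assumes two standardized scores (mean 0, standard deviation 1) that follow a bivariate normal distribution with correlation rho; under that assumption the expected second score is rho times the first. The threshold of 1.5 and the function name are arbitrary choices for illustration.

```python
import math
import random
import statistics


def mean_followup(rho, n=50_000, threshold=1.5, seed=2):
    """Average first and second score among cases whose first score is extreme.

    Assumed model: x and y are standard normal with correlation rho.
    """
    rng = random.Random(seed)
    xs, ys = [], []
    for _ in range(n):
        x = rng.gauss(0.0, 1.0)
        # Build y so that it has unit variance and correlation rho with x.
        y = rho * x + math.sqrt(1 - rho**2) * rng.gauss(0.0, 1.0)
        if x > threshold:  # nonrandom selection: keep only extreme first scores
            xs.append(x)
            ys.append(y)
    return statistics.fmean(xs), statistics.fmean(ys)


results = {}
for rho in (0.2, 0.5, 0.9):
    x_mean, y_mean = mean_followup(rho)
    results[rho] = (x_mean, y_mean)
    print(f"rho={rho}: first score {x_mean:.2f}, second score {y_mean:.2f}")
```

With a weak correlation the second score falls almost all the way back to the population mean, while with a strong correlation it stays close to the first. With a perfect correlation there would be no regression at all.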
Galton figured out that correlation and regression are not two separate concepts but different perspectives on the same one. The general rule is that whenever the correlation between two scores is imperfect, there will be regression to the mean. Wikipedia explains further:
Galton showed that the height of children from very short or very tall parents would move towards the average. In fact, in any situation where two variables are less than perfectly correlated, an exceptional score on one variable may not be matched by an equally exceptional score on the other variable. The imperfect correlation between parents and children (height is not entirely heritable) means that the distribution of heights of their children will be centered somewhere between the average of the parents and the average of the population as whole. Thus, any single child can be more extreme than the parents, but the odds are against it.