The other day I was working with one of our developers to check the accuracy of his new correlation calculation. I gave him a list of stocks, and he gave me each stock’s correlation against the S&P 500 as calculated by his code. I downloaded data from Yahoo Finance, calculated the correlations in Excel, and got meaningfully different results for some stocks but not others. We quickly figured out that his code was using one year of weekly data and mine was using three years, but I still found it odd that some matched and some didn’t. He changed his code to three years to match my calculation, and the results were much closer, but some were still fairly different. After some digging, we found the reason: he was using Mondays and I was using Wednesdays. How does such a minor change as the day of the week change the correlation? More importantly, what does that say about the stability, and thus the value, of the correlation calculation?*
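To make the day-of-week effect concrete, here is a minimal Python sketch. It is not the developer's code or my Excel sheet. The prices are synthetic (seeded random numbers), and the helper names `pearson` and `weekly_returns` are made up for illustration. It builds weekly returns from the same daily series five times, once per weekday anchor, and computes the correlation for each.

```python
import random
from datetime import date, timedelta
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(vx * vy)

def weekly_returns(dates, prices, weekday):
    """Returns between consecutive prices observed on `weekday` (0 = Monday)."""
    picks = [p for d, p in zip(dates, prices) if d.weekday() == weekday]
    return [later / earlier - 1 for earlier, later in zip(picks, picks[1:])]

# Synthetic daily prices for a "market" and a "stock" (assumption: not real data).
rng = random.Random(7)
start = date(2010, 1, 4)
dates = [start + timedelta(days=i) for i in range(3 * 365)]
dates = [d for d in dates if d.weekday() < 5]  # keep weekdays only

market, stock = [100.0], [50.0]
for _ in dates[1:]:
    m = rng.gauss(0, 0.01)                 # daily market return
    s = 0.6 * m + rng.gauss(0, 0.015)      # stock return: part market, part noise
    market.append(market[-1] * (1 + m))
    stock.append(stock[-1] * (1 + s))

# Same three years of data, five different weekly anchors.
corr_by_day = {
    name: pearson(weekly_returns(dates, stock, wd), weekly_returns(dates, market, wd))
    for wd, name in enumerate(["Mon", "Tue", "Wed", "Thu", "Fri"])
}

for name, corr in corr_by_day.items():
    print(f"{name}: {corr:.3f}")
```

Each anchor day slices the same price history into a different set of roughly 156 weekly returns, so each one is a different sample, and the sample correlation moves with it.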
Correlation can be a powerful portfolio management tool. But it, like other statistical measures, can be misleading when taken at face value. The problem with correlation is twofold:
1. Causality: Correlation can be non-causal. A highly publicized example is the predictive power of the Super Bowl champ in determining the party that wins the presidential election (60% correlation from ’80-’08). There is clearly no causal link between the two, yet the correlation is significant.
2. Stability: Correlation can be unstable between time periods. The correlation between two stocks may not be the same for 2008 as it was for 2012. If you use correlation to guide your portfolio management, which correlation you choose can have a large bearing on your decision.
My hypothesis was that a more meaningful correlation (and thus a better Beta) could be built by measuring the stability of correlation. The idea was to measure correlation over various time periods and determine the standard deviation like this:
1. Build multiple correlation matrices using different time periods (3 Month, 1 Year, 3 Year, 5 Year, 10 Year with random start dates for each)
2. Calculate the standard deviation of those correlations to measure how deterministic the correlation is between each pair of assets in the correlation matrix
3. If the standard deviation is low, the correlation measure is meaningful (causal); if the standard deviation is high, the correlation is not dependable (non-causal)
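The three steps above can be sketched in a few lines of Python. This is a rough illustration for a single pair of assets, not production code. The function name `correlation_stability`, the window lengths (expressed in weekly observations), the number of samples per window, and the synthetic return series are all assumptions made for the example.

```python
import random
from math import sqrt
from statistics import mean, pstdev

def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(vx * vy)

def correlation_stability(a, b, window_lengths, samples_per_length=20, seed=0):
    """Mean and standard deviation of correlations over randomly placed windows."""
    rng = random.Random(seed)
    corrs = []
    for length in window_lengths:
        if length > len(a):
            continue  # not enough history for this window
        for _ in range(samples_per_length):
            start = rng.randint(0, len(a) - length)  # random start date
            corrs.append(pearson(a[start:start + length], b[start:start + length]))
    return mean(corrs), pstdev(corrs)

# Synthetic weekly returns, ten years' worth (assumption: not real data).
rng = random.Random(1)
market = [rng.gauss(0, 1) for _ in range(520)]
# A pair whose relationship to the market never changes...
stable = [0.8 * m + rng.gauss(0, 0.6) for m in market]
# ...and a pair whose relationship flips sign halfway through.
flipping = [(0.8 if i < 260 else -0.8) * m + rng.gauss(0, 0.6)
            for i, m in enumerate(market)]

windows = [13, 52, 156, 260, 520]  # roughly 3 months, 1, 3, 5 and 10 years of weeks
stable_mean, stable_sd = correlation_stability(market, stable, windows)
flip_mean, flip_sd = correlation_stability(market, flipping, windows)

print(f"stable pair:   mean={stable_mean:.2f}  sd={stable_sd:.2f}")
print(f"flipping pair: mean={flip_mean:.2f}  sd={flip_sd:.2f}")
```

For a full correlation matrix, the same function would be run for every pair of assets. Note that short windows produce noisier correlations simply because they contain fewer observations, so some spread is expected even for a perfectly stable relationship.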
After I showed this to Benn Dunn, Alpha Theory’s uber-intelligent Head of Risk Consulting, he said without hesitating, “Oh, you should look at Dickey-Fuller and Cointegration.” Eureka! The stability of correlation was a statistical science in itself. There were various methods to measure the stability of a relationship between variables, but the goal of each was to determine whether the relationship in one period would hold in other periods. If so, like oil prices and energy stocks, we’ve got a relationship we can model. If not, like presidential races and Super Bowl champs, the correlation should be given less authority.
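For readers who want to see the mechanics, here is a stripped-down Python sketch of the two-step Engle-Granger idea: regress one price series on the other, then run a Dickey-Fuller style regression on the leftover spread. It is a teaching sketch under loud assumptions: no lag terms, no p-values, synthetic data, and function names invented for the example. In practice a statistics library (for example `adfuller` and `coint` in statsmodels) does this properly, including the correct critical values, which are not the ones from an ordinary t-table.

```python
import random
from math import sqrt

def ols_slope_intercept(x, y):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, my - slope * mx

def dickey_fuller_t(series):
    """t-statistic of gamma in: change_t = gamma * level_(t-1) + error (no lags)."""
    lagged = series[:-1]
    diffs = [b - a for a, b in zip(series, series[1:])]
    sxx = sum(v * v for v in lagged)
    gamma = sum(l * d for l, d in zip(lagged, diffs)) / sxx
    resid = [d - gamma * l for l, d in zip(lagged, diffs)]
    sigma2 = sum(r * r for r in resid) / (len(diffs) - 1)
    return gamma / sqrt(sigma2 / sxx)

def engle_granger_t(x, y):
    """Step 1: regress y on x. Step 2: Dickey-Fuller statistic on the spread."""
    slope, intercept = ols_slope_intercept(x, y)
    spread = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
    return dickey_fuller_t(spread)

# Synthetic price levels (assumption: not real data).
rng = random.Random(3)
x = [0.0]
for _ in range(499):
    x.append(x[-1] + rng.gauss(0, 1))              # a random walk
tied = [2 * xi + rng.gauss(0, 1) for xi in x]      # wanders with x, never drifts far
unrelated = [0.0]
for _ in range(499):
    unrelated.append(unrelated[-1] + rng.gauss(0, 1))  # an independent random walk

tied_t = engle_granger_t(x, tied)
unrelated_t = engle_granger_t(x, unrelated)
print(f"tied pair:      t = {tied_t:.1f}")
print(f"unrelated pair: t = {unrelated_t:.1f}")
```

The more negative the statistic, the stronger the evidence that the spread keeps snapping back, which is the sign of a relationship that holds from one period to the next. A statistic near zero suggests the two series simply drift apart.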
So where does this leave us? I believe there are two major takeaways. One, if the period chosen to calculate betas and correlations has a large bearing on the results, then the betas and correlations used in the model probably can’t be relied upon for predictive purposes. Portfolio management is garbage in, garbage out. If you can determine something is garbage, don’t put it in, or you’ll guarantee garbage out. Two, a portfolio’s dominant input should be the risk-reward (expected return) derived from a thoughtful, critically evaluated research process. Constraints on known causal relationships, like no more than 20% exposure to a sector, can be factored in to reduce correlation. Business risk constraints like maximum position size or liquidity-adjusted position sizes are also important. To me, the lack of correlation stability is just another nail in the coffin of mean-variance optimization. Focus on the fundamentals and use a portfolio management philosophy, like Alpha Theory, that allows you to build the portfolio around the research of the firm and reasonable constraints.
*Correlation is used to calculate Beta, so if the correlation is meaningless then so is the Beta. See our blog post: Building a Better Beta
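As a quick illustration of the footnote, the textbook definition of beta is the covariance of the stock with the market divided by the variance of the market, which is the same thing as correlation scaled by the ratio of the two volatilities. The short Python sketch below checks that identity. The return figures are made-up numbers for illustration only, and the function name `beta_two_ways` is hypothetical.

```python
from math import sqrt

def beta_two_ways(stock, market):
    """Return (correlation, beta via correlation, beta via covariance/variance)."""
    n = len(stock)
    ms, mm = sum(stock) / n, sum(market) / n
    cov = sum((s - ms) * (m - mm) for s, m in zip(stock, market)) / (n - 1)
    var_s = sum((s - ms) ** 2 for s in stock) / (n - 1)
    var_m = sum((m - mm) ** 2 for m in market) / (n - 1)
    corr = cov / sqrt(var_s * var_m)
    beta_from_corr = corr * sqrt(var_s) / sqrt(var_m)  # correlation x volatility ratio
    beta_from_cov = cov / var_m                        # textbook definition
    return corr, beta_from_corr, beta_from_cov

# Illustrative weekly returns (assumption: invented numbers, not market data).
market_returns = [0.010, -0.020, 0.015, 0.005, -0.010, 0.020, -0.005, 0.000]
stock_returns = [0.018, -0.025, 0.010, 0.012, -0.020, 0.030, 0.002, -0.004]

corr, beta_a, beta_b = beta_two_ways(stock_returns, market_returns)
print(f"correlation = {corr:.3f}, beta = {beta_a:.3f}")
```

Because correlation sits directly inside the beta formula, any instability in the correlation estimate carries straight through to the beta.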