Everything is Data Mining
... and so are you.
Fama and French opened the floodgates for factor research. Academics rushed to discover and publish increasingly esoteric and often overlapping factors. At last count, there were over 400 factors published in various academic journals. However, most of them don’t add much value. As an investor, how do you navigate through this mess?
One thing you can do is take all the published factors, put them through a significance test, and see what passes. For example, the paper Taming the Factor Zoo (aqr) uses a double-selection LASSO to filter out factors.
You could also go about iteratively adding factors to CAPM until all remaining alphas are rendered insignificant. Factor Zoo (.zip) did this and found that about 15 factors are enough to span the entire factor zoo (SSRN).
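The iterative approach can be sketched in a few lines of Python. This is only an illustration under loud assumptions: the return series are simulated, the factor names (`value`, `value_clone`, `levered_mkt`) are made up, and the rule "add the candidate with the largest absolute alpha t-statistic, stop below a threshold of 3" is a simplification chosen here, not necessarily the selection criterion the paper uses.

```python
import math
import random

def solve(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [x - f * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]

def alpha_tstat(y, factors):
    """t-statistic of the intercept (alpha) from an OLS regression of y on factors."""
    t_len = len(y)
    cols = [[1.0] * t_len] + factors  # intercept column first
    k = len(cols)
    xtx = [[sum(p * q for p, q in zip(ci, cj)) for cj in cols] for ci in cols]
    xty = [sum(p * q for p, q in zip(ci, y)) for ci in cols]
    beta = solve(xtx, xty)
    resid = [y[t] - sum(beta[j] * cols[j][t] for j in range(k)) for t in range(t_len)]
    s2 = sum(e * e for e in resid) / (t_len - k)  # residual variance
    # Var(alpha) = s2 * [(X'X)^-1]_00; solving against e_0 gives that first column.
    var_alpha = s2 * solve(xtx, [1.0] + [0.0] * (k - 1))[0]
    return beta[0] / math.sqrt(var_alpha)

def select_factors(market, candidates, threshold=3.0):
    """Start from CAPM and greedily add the candidate with the largest |alpha t-stat|."""
    chosen, model = ["MKT"], [market]
    remaining = dict(candidates)
    while remaining:
        tstats = {name: alpha_tstat(series, model) for name, series in remaining.items()}
        best = max(tstats, key=lambda name: abs(tstats[name]))
        if abs(tstats[best]) < threshold:
            break  # every remaining alpha is insignificant: the model spans the rest
        chosen.append(best)
        model.append(remaining.pop(best))
    return chosen

# Simulated monthly returns (hypothetical data, for illustration only).
random.seed(0)
T = 240
market = [random.gauss(0.5, 4.0) for _ in range(T)]
value = [random.gauss(1.0, 2.0) for _ in range(T)]
candidates = {
    "value": value,
    "value_clone": [v + random.gauss(0.0, 2.0) for v in value],
    "levered_mkt": [1.5 * m + random.gauss(0.0, 1.0) for m in market],
}
chosen = select_factors(market, candidates)
print(chosen)
```

The point of the toy data is that a factor which merely repackages the market, or which is a noisy copy of a factor already in the model, should tend to have its alpha explained away and so never get added.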
However, the larger question is: is data mining for factors bad? For an investor, it is only bad if a factor’s post-sample performance decays precipitously. Otherwise, who cares what a factor’s origin story is? Some claim that it is bad and suggest that only factors backed by “theory” are the “real” ones.
the best hope for finding pricing factors that are robust out of sample... ...is to try to understand the fundamental macroeconomic sources of risk.
Cochrane (2009). Asset pricing: Revised edition.
An interesting paper came out this week — Does peer-reviewed theory help predict the cross-section of stock returns? Andrew Y. Chen, Alejandro Lopez-Lira, and Tom Zimmermann (pdf) — that questions whether theory protects against data mining bias and post-sample decay at all.
Our main finding is that the post-sample return depends little on the origins of predictability. Regardless of whether a predictor has a publishable risk-based explanation, a mispricing explanation, or is publishable without a clear explanation, the post-sample return is about 50% smaller than the in-sample return. We strongly reject the hypothesis that risk-based theory prevents post-sample decay.
It does not even matter if the predictor is purely data mined.
It’s as if the finance academics are just mining accounting data for return predictability, and then decorating the results with stories about risk and psychology.
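To make the "about 50% smaller" figure concrete, here is a minimal sketch of how post-sample decay could be measured for a single predictor. The function name, the monthly return numbers and the sample split are all invented for illustration; the paper's actual methodology is more involved.

```python
from statistics import mean

def post_sample_decay(returns, sample_end):
    """Fractional drop in mean return after the original sample period ends.

    returns    : long-short returns of one predictor, in time order
    sample_end : index of the first post-sample observation
    """
    in_sample = mean(returns[:sample_end])
    post_sample = mean(returns[sample_end:])
    return 1 - post_sample / in_sample

# Hypothetical monthly returns in percent: four in-sample months, three post-sample.
returns = [0.8, 1.2, 1.0, 1.0, 0.6, 0.4, 0.5]
decay = post_sample_decay(returns, sample_end=4)
print(f"decay: {decay:.0%}")
```

A decay of 0 means the predictor held up completely after publication; a decay of 1 means the return vanished.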
If a factor’s origin story doesn’t matter, then its post-sample performance would depend on how widely it is embraced by other investors. If it is counterintuitive or deemed too silly or risky, then chances are that it will retain most of its original alpha.
So, go ahead, mine that data!
Markets this Week
Long bonds and Gold on a tear…
Common Risk Factors in the Cross-Section of Corporate Bond Returns
Overlapping Momentum Portfolios
We show that stocks in the intersection of the 6 and 12-month momentum portfolios—"overlapping'' momentum stocks—display enhanced medium-term return momentum. Focusing on overlapping momentum stocks improves the returns of several momentum-based strategies proposed in the literature. The results are in line with the concurrence of trades by heterogeneous momentum investors exacerbating return continuation.
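The idea of an "overlapping" momentum stock can be illustrated with a set intersection. This is a rough sketch, not the paper's procedure: the tickers and returns are invented, only the winner leg is shown, and the cut-off (top 30% here, to keep the toy universe small) is an assumption.

```python
def top_fraction(scores, fraction):
    """Names of the best-ranked stocks by trailing return."""
    n = max(1, round(len(scores) * fraction))
    ranked = sorted(scores, key=scores.get, reverse=True)
    return set(ranked[:n])

def overlapping_winners(ret_6m, ret_12m, fraction=0.3):
    """Stocks that are winners on both the 6-month and the 12-month look-back."""
    return top_fraction(ret_6m, fraction) & top_fraction(ret_12m, fraction)

# Hypothetical trailing returns for a ten-stock universe.
ret_6m = {"A": 0.30, "B": 0.25, "C": 0.20, "D": 0.05, "E": 0.02,
          "F": -0.01, "G": -0.03, "H": -0.05, "I": -0.10, "J": -0.15}
ret_12m = {"A": 0.10, "B": 0.50, "C": 0.45, "D": 0.40, "E": 0.08,
           "F": 0.03, "G": 0.01, "H": -0.02, "I": -0.20, "J": -0.30}

overlap = overlapping_winners(ret_6m, ret_12m)
print(sorted(overlap))  # ['B', 'C']
```

Stock A is a 6-month winner only and D is a 12-month winner only, so neither makes the overlapping set.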
Investing & Economy
SEBI plans to lower capital and disclosure requirements for fund houses that run passive investment schemes. (reuters)
Large shareholders (“promoters”) in Indian firms have sold $12.36 billion in shares so far this year, double the amount sold in 2022, as an ongoing rally in the domestic equity markets helps them offload shares at the fastest pace on record. (reuters)
Foreign inflows into Indian government bonds hit the highest level in six years. (reuters)
India’s Gross Domestic Product, the measure of economic output, grew by 7.6% in Q2. (indianexpress)
Attrition rates are hitting 35-50% in India’s BFSI sector. (reuters)
Inside Foxconn’s struggle to make iPhones in India (restofworld)
Market pricing has grown more aggressive on Fed policy easing, with fed funds futures now pointing to five quarter-percentage-point rate cuts next year (cnbc). Traders in federal funds futures markets now see about a two-thirds chance of the Fed reducing rates as early as March 2024, up from about 20% a week ago (ft).
Growth in the US economy continues to surprise to the upside while inflation declines. (yahoo)
Emerging market equities could rally in 2024 if Wall Street's new narrative of lower U.S. interest rates comes true. (ibtimes)
The world will be back to ZIRP in no time.
Global R* fell by more than 3% from its peak in the mid-1970s, driven by falling productivity growth and increased longevity. Without a reversal in these trends, or new forces emerging to offset them, long-run Global R* appears likely to remain low. (BOE)
Remote collaboration fuses fewer breakthrough ideas (nature). Meanwhile, Return to the Office is dead.
Many retail investors, who try their luck with market timing or security selection, end up spending countless hours doing research only to underperform the market anyways. Their mistake is believing that the primary problem in investing is what to buy or when to buy, rather than how to stay invested for the long-term. (ofdollarsanddata)
There’s new evidence that market timing doesn’t work. Your odds of success are better if you just hang on and aim for average returns. (nytimes)
Want to keep your house? Support your kids? Stay alive? Never stop working. (thewalrus)
Xi Jinping’s grip on Chinese enterprise gets uncomfortably tight (economist)
Silicon Valley’s worldview is not just an ideology; it’s a personality disorder. (crookedtimber)
Richest 1% account for more carbon emissions than poorest 66% (theguardian)
More than 20 nations including the United States called for a tripling of nuclear energy to drive down emissions at the COP28 conference. (phys)
EVs are less reliable than conventional vehicles (msn)
Odds & Ends
A Google AI has discovered 2.2m materials unknown to science (economist)
The "AI debate" is pretty stupid, proceeding as it does from the foregone conclusion that adding compute power and data to the next-word-predictor program will eventually create a conscious being, which will then inevitably become a superbeing. This is a proposition akin to the idea that if we keep breeding faster and faster horses, we'll get a locomotive.