From AQR's Cliff's Perspectives blog:
It’s Not Data Mining — Not Even Close
Data mining — “discovering” historical patterns that are driven by random, not real, relationships and assuming they’ll repeat — is a huge concern in many fields. My focus is, of course, on the field of investing, where those concerns are particularly present. That is true in academic and quantitative studies when great statistical power is brought to the effort, but it’s also a concern in the non-quant world (how many would want to imitate Warren Buffett if he had not been so successful and do we give too much weight to that ex post result?). Some critics of the basic findings in quantitative finance — here I refer to the success of the small-cap, value and momentum factors — focus on this problem of data mining. They vary from the sober, helpful and important, to the less so.
One early critic of these results, based on fears of data mining, was Fischer Black.[1] I disagreed with him at the time (in fact, you can find me listed in the thank-yous in his paper[2]), but his worry about these specific factors was inherently more reasonable in 1990, when many of the results were “in sample.” This will be a very short post as all I’m going to do is look at the out-of-sample results since Fischer’s worry (also roughly the out-of-sample results I’ve experienced since my dissertation studying value and momentum — it’s fun to have been around long enough to have a personal out-of-sample period!).[3]
Our most potent weapon in addressing data mining is the out-of-sample test.[4],[5] If a researcher discovered an empirical result only because she tortured the data until it confessed, one would not expect it to work outside the torture zone. Since the initial papers of Fama and French (1992, 1993), the results for value, momentum and size[6] have been tested out-of-sample in other places besides U.S. equities, where they were initially uncovered. Back then and more recently we found strong empirical evidence for these concepts — particularly value and momentum — in other contexts, geographies and asset classes, providing strong support for the basic factors’ efficacy. Subsequent research (for example here and here) extended some of the basics further back in time, which serves as another out-of-sample test if you hadn’t already looked at that earlier data. But there is probably no substitute for simply looking at how the actual original factors for U.S. equities, constructed very simply and in a fashion highly similar to how they were built back then, have performed out-of-sample since their initial publication.
I look at just three factors: SMB (Fama-French’s construct measuring the return spread of small versus big stocks), HML[7] (Fama-French’s construct measuring the return spread of low versus high price-to-book stocks, or as others might put it, the spread between cheap and expensive stocks), and UMD (Fama-French’s version of the momentum factor measuring the return spread of past winner versus loser stocks), over what I label the “in-sample” periods (both July 1963 to December 1991 and January 1927 to December 1991) and the “out-of-sample” period (January 1992 to March 2015).[8]
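For readers who want to eyeball a comparison along these lines themselves, here is a minimal sketch — not code from the post, and not Asness's actual methodology — that pulls the monthly Fama-French factors from Ken French's data library via pandas_datareader and reports annualized average returns for SMB, HML and momentum over the three windows named above. The dataset names, the "famafrench" reader, and the column labels are assumptions about that library's current interface.

```python
# Minimal sketch: compare average long-short factor returns in-sample vs. out-of-sample,
# assuming pandas_datareader's "famafrench" reader and the Ken French data library
# dataset names ("F-F_Research_Data_Factors", "F-F_Momentum_Factor").
import pandas_datareader.data as web

START = "1926-07-01"  # earliest month in the standard monthly Fama-French file

# Each famafrench dataset comes back as a dict of tables; [0] is the monthly table,
# reported in percent per month.
ff = web.DataReader("F-F_Research_Data_Factors", "famafrench", start=START)[0]
mom = web.DataReader("F-F_Momentum_Factor", "famafrench", start=START)[0]
ff.columns = ff.columns.str.strip()
mom.columns = mom.columns.str.strip()  # the momentum column often ships with padded spaces

factors = ff[["SMB", "HML"]].join(mom["Mom"].rename("UMD"))

periods = {
    "in-sample 1963-07 to 1991-12": ("1963-07", "1991-12"),
    "in-sample 1927-01 to 1991-12": ("1927-01", "1991-12"),
    "out-of-sample 1992-01 to 2015-03": ("1992-01", "2015-03"),
}

for label, (lo, hi) in periods.items():
    annualized = factors.loc[lo:hi].mean() * 12  # simple annualized mean, in % per year
    print(label, annualized.round(2).to_dict())
```

The comparison uses simple arithmetic means of the long-short monthly returns, which is roughly the spirit of looking at raw factor performance across periods; risk adjustment, transaction costs and the precise factor construction details are deliberately left out.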
HT: Levine @ Bloomberg