Saturday, January 17, 2026

"Our Obsession With Statistical Significance Is Ruining Science"

Personally, I think any statistical report should show error bars, meaning "This is not The Truth, just our best guess," and if those error bars don't encompass a result later shown to be correct, the statistician and anyone else promulgating the erroneous figures should be flogged.

But maybe that's just me.

From Reason Magazine, December 1:

A forgotten Guinness brewer's alternative approach could have prevented 100 years of mistakes in medicine, economics, and more. 

A century ago, two oddly domestic puzzles helped set the rules for what modern science treats as "real": a Guinness brewer charged with quality control and a British lady insisting she can taste whether milk or tea was poured first.

Those stories sound quaint, but the machinery they inspired now decides which findings get published, promoted, and believed—and which get waved away as "not significant." Instead of recognizing the limitations of statistical significance, fields including economics and medicine ossified around it, with dire consequences for science. In the 21st century, an obsession with statistical significance led to overprescription of both antidepressant drugs and a headache remedy with lethal side effects. There was another path we could have taken.

Sir Ronald Fisher succeeded 100 years ago in making statistical significance central to scientific investigation. Some scientists have argued for decades that blindly following his approach has led the scientific method down the wrong path. Today, statistical significance has brought many branches of science to a crisis of false-positive findings and bias.

At the beginning of the 20th century, the young science of statistics was blooming. One of the key innovations at this time was small-sample statistics—a toolkit for working with data that contain only a small number of observations. That method was championed by the great data scientist William S. Gosset. His ideas were largely ignored in favor of Fisher's, and our ability to reach accurate and useful conclusions from data was harmed. It's time to revive Gosset's approach to experimentation and estimation.

Fisher's approach, "statistical significance," is a simple method for drawing conclusions from data. Researchers gather data to test a hypothesis. They compute the p-value under the null hypothesis: the probability of observing data at least as extreme as theirs if the effect they are testing is absent. They compare that p-value to a cutoff, usually 0.05. If the p-value is below the cutoff (in other words, the data we observe are unlikely under the null hypothesis), the researchers conclude that the effect is present.
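
To make that recipe concrete, here is a minimal sketch in Python built on the tea-tasting puzzle mentioned above. It assumes the design usually attributed to Fisher (eight cups, four poured milk-first, and the taster must pick out the four milk-first cups); that design is not spelled out in the excerpt, and the function name is only illustrative. The null hypothesis is that she is guessing, so the p-value is the chance of doing at least as well by luck alone.

```python
from math import comb

ALPHA = 0.05  # the conventional cutoff described in the article

def tea_tasting_p_value(correct, cups_per_kind=4):
    """One-sided p-value: chance of at least `correct` right picks by pure guessing.

    Assumes 2 * cups_per_kind cups, half of them milk-first, and a taster who
    must name exactly cups_per_kind cups as milk-first (hypothetical setup).
    """
    # Every way of choosing cups_per_kind cups is equally likely under the null.
    total = comb(2 * cups_per_kind, cups_per_kind)
    # Count the choices that contain k true milk-first cups, for k >= correct.
    at_least_as_good = sum(
        comb(cups_per_kind, k) * comb(cups_per_kind, cups_per_kind - k)
        for k in range(correct, cups_per_kind + 1)
    )
    return at_least_as_good / total

for correct in (4, 3):
    p = tea_tasting_p_value(correct)
    verdict = "below" if p < ALPHA else "not below"
    print(f"{correct} of 4 right: p = {p:.4f} ({verdict} the {ALPHA} cutoff)")
```

Under these assumptions, only a perfect score clears the 0.05 bar (1 chance in 70); three out of four is something a guesser would manage about a quarter of the time. Note what the number does not say: it is not the probability that she can really taste the difference, only how surprising the result would be if she could not.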

Fisher pioneered many statistical tools still in use today. But writing in the early 20th century, when science was carried out with fountain pens and slide rules, he could not have anticipated how those tools would be misused in an era of big data and limitless computing power.

Fisher was able to attend Cambridge only by virtue of winning a scholarship in mathematics. In 1919 he was offered a job as a statistician at the Rothamsted Agricultural Experiment Station, the oldest scientific research farm in England. At Rothamsted, then at University College London and Cambridge, Fisher grew into an awesomely productive polymath. He invented p-values, significance testing, maximum likelihood estimation, analysis of variance, and even linkage analysis in genetics. The Simply Statistics blog estimates that if every paper that used a Fisherian tool had cited him, he would have amassed over 6 million citations by 2012, making him the most influential scientist ever.

In 1925 Fisher published his first textbook, Statistical Methods for Research Workers, which defined the field of statistics for much of the 20th century. Before its publication, a researcher who wished to draw conclusions from data would make use of a large-sample formula such as the normal distribution. Discovered a hundred years earlier by Carl Friedrich Gauss, the normal distribution is the standard "bell curve" formula for an entire population of observations around a central mean. Fisher's textbook provided tools to analyze more limited samples of data.

How Beer Led to a Breakthrough

The most important small-sample formula in Statistical Methods was discovered by a correspondent of Fisher's, William Sealy Gosset, who studied mathematics and chemistry at Oxford. Instead of staying in academia, Gosset moved to Dublin to work for Guinness.

Guinness was expanding rapidly and shipping its product worldwide. By 1900 it was the largest beer maker in the world, producing over a million gallons of stout porter each year. At such scale, it no longer made sense to test each batch by taste and feel. Guinness set up an experimental lab within its giant brewery in St. James's Gate, Dublin, to systematically improve quality and yield, and staffed it with talented young university graduates. Gosset and his colleagues were among the first industrial data scientists.

Gosset's interest in small-sample statistics flowed from his everyday work. Beer takes three ingredients: yeast, hops, and grain. The grain in Guinness is malted barley. To prepare the barley, you steep it in water, allow it to germinate, and then dry and roast it, which malts the starch into sugar that the yeast can digest. The amount of sugar in a batch of malt affects the taste of the beer, its shelf life, and its alcohol content, and was measured at that time in "degrees saccharine."

Guinness had established that 133 degrees saccharine per barrel was the ideal level, and was willing to tolerate a margin of error of 0.5 degrees on either side. The brewer could take spoonfuls from a barrel of malt, test each spoonful, then take the average. But how accurate would that average be—should he take five spoonfuls or 10? Gosset verified that small-sample estimates are more spread out than a normal distribution, because you might draw a spoonful that is unusually high or unusually low in sugar, and in a small sample such outliers will have outsize influence....
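
The spoonful problem is easy to try for yourself. The sketch below is a rough simulation, not Gosset's own calculation: it assumes spoonfuls are normally distributed around the 133-degree target with a made-up spread of 1.0 degree (the excerpt gives no such figure), and it checks how often a large-sample "95%" interval, the sample mean plus or minus 1.96 standard errors, actually contains the true value when the standard error is estimated from only a handful of spoonfuls.

```python
import math
import random

TRUE_MEAN = 133.0  # target from the article, in degrees saccharine
SPOON_SD = 1.0     # assumed spoonful-to-spoonful spread (hypothetical value)

def normal_interval_coverage(n, trials=20000, seed=0):
    """Fraction of simulated samples whose mean +/- 1.96 standard errors
    contains TRUE_MEAN, with the standard error estimated from the sample."""
    rng = random.Random(seed)
    covered = 0
    for _ in range(trials):
        spoonfuls = [rng.gauss(TRUE_MEAN, SPOON_SD) for _ in range(n)]
        mean = sum(spoonfuls) / n
        # Sample variance with the usual n - 1 divisor.
        variance = sum((x - mean) ** 2 for x in spoonfuls) / (n - 1)
        std_err = math.sqrt(variance / n)
        if abs(mean - TRUE_MEAN) <= 1.96 * std_err:
            covered += 1
    return covered / trials

for n in (5, 10, 100):
    rate = normal_interval_coverage(n)
    print(f"n = {n:3d} spoonfuls: nominal 95% interval covers the truth {rate:.1%} of the time")
```

With five or ten spoonfuls the interval should miss the truth noticeably more often than the advertised 5%, and the shortfall shrinks as the sample grows. That gap is the extra spread the passage describes, and it is why a small sample calls for wider error bars than the normal-distribution formula suggests.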

....MUCH MORE 

Previously:

August 2018 - UPDATED—"How Beer Revolutionized Math — and Just Might Save Humanity "