Saturday, October 29, 2016

arXiv: "What Counts as Science?"

We like arXiv.
A lot.*

The arXiv preprint service is trying to answer an age-old question.
The address was cryptic, with a tantalizing whiff of government secrets, or worse.
The server itself was exactly the opposite. Government, yes—it was hosted by Los Alamos National Laboratory—but openly accessible in a way that, in those early Internet days of the 1990s, was totally new, and is still game-changing today.
The site, known as arXiv (pronounced “archive,” and long since decamped to the more wholesome address “” and to the stewardship of the Cornell University Library), is a vast repository of scientific preprints, articles that haven’t yet gone through the peer-review process or aren’t intended for publication in refereed journals. (Papers can also appear, often in revised form, after they have been published elsewhere.) As of July 2016, there were more than a million papers on arXiv, leaning heavily toward the hardest of the hard sciences: math, computer science, quantitative biology, quantitative finance, statistics, and above all, physics.
ArXiv is the kind of library that, 30 years ago, scientists could only dream of: totally searchable, accessible from anywhere, free to publish to and read from, and containing basically everything in the field that’s worth reading. At this golden moment in technological history, when you can look up the history of atomic theory on Wikipedia while waiting in line at Starbucks, this might seem trivial. But in fact it was revolutionary.

Practically, arXiv has leveraged new technologies to create a boon for its community. What is less visible, though, is that it has had to answer a difficult philosophical question, one which resonates through the rest of the scientific community: What, exactly, is worth reading? What counts as science?
Before arXiv, preprint papers were available only within small scientific circles, distributed by hand and by mail, and the journals in which they were ultimately published months later (if they were published at all) were holed up in university libraries. But arXiv has democratized the playing field, giving scientists instant access to ideas from all kinds of colleagues, all over the world, from prestigious chairs at elite universities to post-docs drudging away at off-brand institutions and scientists in developing countries with meager research support.
Paul Ginsparg set up arXiv in 1991, when he was a 35-year-old physicist at Los Alamos. He expected only about 100 papers to go out to a few hundred email subscribers in the first year. But by the summer of 1992, more than 1,200 papers had been submitted. It was a good problem to have, but still a problem. While Ginsparg had no intention of giving incoming papers the top-to-tail scrutiny of peer review, he did want to be sure that readers could find the ones they were most interested in. So he started binning the incoming papers into new categories and sub-categories and bringing on more and more moderators, who took on the work as volunteers, as a service to their scientific community.
Will unclassifiable papers get lost in the muck of the truly incoherent?
The arXiv credo is that papers should be “of interest, relevance, and value” to the scientific disciplines that arXiv serves. But as the site and its public profile grew, it began attracting papers from outside the usual research circles, and many of those papers didn’t pass the test. They weren’t necessarily bad science, says Ginsparg. Bad science can be examined, tested, and refuted. They were “non-science”—sweeping theories grandly claiming to overturn Einstein, Newton, and Hawking; to reveal hidden connections between physics and ESP and UFOs; and to do it all almost entirely without math and experiment.

The arXiv’s default stance is acceptance—papers are innocent “until proven guilty,” Ginsparg says—but the non-science papers were a waste of scholarly readers’ time. And if they were allowed to share the same virtual shelf space with legitimate science, they could create confusion among arXiv’s growing audience of journalists and policymakers. So, paper by paper, moderators had to make the call: What is and isn’t science?...MORE
It just struck me that I missed a chance at a "Questions America Wants Answered" headline with the pull-quote: "Will unclassifiable papers get lost in the muck of the truly incoherent?"

*Here is a quick search of the blog for arXiv and some posts sourced to same:

Computer Simulations Reveal Benefits of Random Investment Strategies Over Traditional Ones
In the Case of Airbnb, Uber et al: Time To Break Out The Algorithmic Regulation
So You Think Your Computer Is Fast? Update on Google, D-Wave and the Quantum Computer
Sorry Fact Checkers, The Robots Are Coming For Your Jobs
"Two centuries of trend following"
Making A Better Model of the Market: Are Financial Markets an Aspect of the Quantum World?
"Leverage effect in energy futures"
Here Is The Paper "How Unlucky Is 25 Sigma?" (25 Standard Deviation Moves Basically Don't Happen)
"The Interrupted Power Law and The Size of Shadow Banking"
"The Illusion of the Perpetual Money Machine"
"How A Private Data Market Could Ruin Facebook" (FB)

And, if I haven't yet made the point, here's 2012's "Super Physics Smackdown: Relativity v Quantum Mechanics...In Space":
Some of the more interesting finance papers of the last few years have ended up at the arXiv; we are fans.
From the Physics arXiv blog:...
Use the 'search blog' link if interested in more.