Thursday, February 21, 2013

"The backlash against Big Data has started"

From Numbers Rule Your World:
It is inevitable that all the hype around "Big Data" leads to a backlash. As someone who has been working in "data science" since before the term existed, I am happy to see widespread validation of the field but also concerned about over-promising and under-delivering. Several recent articles went overboard in criticizing data science -- while their points are sometimes valid, the tone of these pieces misses the mark. I'll discuss one of these articles in this post, and some others in the next few days.
Andrew Gelman has a beef with David Brooks over his New York Times column called "What Data Can't Do". I will get to Brooks's critique soon; my overall feeling is that he created a bunch of sound bites and could have benefited from interviewing people like Andrew and myself, who are skeptical of Big Data claims but not maniacally dismissive.

The biggest issue with Brooks's column is the incessant use of the flawed man-versus-machine dichotomy. He warns: "It's foolish to swap the amazing machine in your skull for the crude machine on your desk." The machine he has in mind is the science-fictional, self-sufficient, intelligent computer, as opposed to the algorithmic, dumb-and-dumber computer that exists today and has existed for decades. A more appropriate analogy for today's computer (and that of the foreseeable future) is a machine that the human brain creates to automate mechanical, repetitious tasks at scale. This machine cannot function without human piloting, so it's man versus man-plus-machine, not man versus machine.

I use such an analogy in Chapter 2 of Numbers Rule Your World, to compare and contrast the credit-scoring algorithmic paradigm with the manual underwriting paradigm of the past. The point is that there is more similarity than difference between the automated and the manual methods; the automated methods are faster, better able to handle multiple threads, and unfazed by individual bias.
A major blind spot in the column is that it ignores the work of Kahneman and Tversky, and other behavioral psychologists, who have shown convincingly that the human brain is subject to all kinds of biases, and uses heuristics that lead to incorrect judgements....MORE