"With Big Data, we are creating artificial intelligences that no human can understand"
From Quartz:
The basis for an algorithm's predictions may be beyond the understanding of the average human
Excerpted from BIG DATA: A Revolution That Will Transform How We Live, Work, and Think by Viktor Mayer-Schönberger and Kenneth Cukier.
Computer systems currently base their decisions on rules they have been
explicitly programmed to follow. Thus when a decision goes awry, as is
inevitable from time to time, we can go back and figure out why the
computer made it. For example, we can investigate questions like “Why
did the autopilot system pitch the plane five degrees higher when an
external sensor detected a sudden surge in humidity?” Today’s computer
code can be opened and inspected, and those who know how to interpret it
can trace and comprehend the basis for its decisions, no matter how
complex.
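To make that traceability concrete, here is a minimal sketch, in Python, of the kind of explicitly programmed rule described above. The function name, thresholds, and units are hypothetical, invented purely for illustration; the point is that the reason for the decision sits in plain view in the code.

    # A hypothetical, explicitly programmed autopilot rule. Every decision
    # traces back to a specific, human-readable condition.

    def adjust_pitch(humidity_change_pct: float, current_pitch_deg: float) -> float:
        """Return a new pitch angle based on an explicit, inspectable rule."""
        SURGE_THRESHOLD = 15.0   # hypothetical threshold for a "sudden surge"
        PITCH_INCREMENT = 5.0    # hypothetical corrective pitch, in degrees

        if humidity_change_pct > SURGE_THRESHOLD:
            # The "why" is right here: the sensor reading crossed the threshold.
            return current_pitch_deg + PITCH_INCREMENT
        return current_pitch_deg

    # An investigator can replay the inputs and see exactly which branch fired.
    print(adjust_pitch(humidity_change_pct=20.0, current_pitch_deg=2.0))  # 7.0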
With big-data analysis, however, this traceability will
become much harder. The basis of an algorithm’s predictions may often be
far too intricate for the average human to understand.
When computers were explicitly programmed to follow sets of
instructions, as with IBM's early Russian-to-English translation
program in 1954, a
human could readily grasp why the software substituted one word for
another. But Google Translate incorporates billions of pages of
translations into its judgments about whether the English word “light”
should be “lumière” or “léger” in French (that is, whether the word
refers to brightness or to weight). It’s impossible for a human to trace
the precise reasons for the program’s word choices because they are
based on massive amounts of data and vast statistical computations.
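The flavor of that statistical judgment can be sketched in a few lines of Python. The co-occurrence counts below are invented for illustration; a real system weighs evidence from billions of translated pages, which is precisely why its individual choices resist tracing.

    # A toy, count-based word-choice model in the spirit of statistical
    # translation. The counts are fictional; a real system derives them
    # from billions of translated pages.

    from collections import Counter

    # How often each context word appears near each French rendering of
    # "light" in our (invented) parallel corpus.
    cooccurrence = {
        "lumière": Counter({"bright": 40, "lamp": 35, "sun": 25, "bag": 1}),
        "léger":   Counter({"weight": 45, "bag": 30, "carry": 20, "sun": 2}),
    }

    def translate_light(context_words):
        """Pick the rendering whose corpus statistics best match the context."""
        scores = {
            french: sum(counts[w] for w in context_words)
            for french, counts in cooccurrence.items()
        }
        return max(scores, key=scores.get)

    print(translate_light(["bright", "sun"]))  # lumière (brightness)
    print(translate_light(["bag", "carry"]))   # léger (weight)

Even in this toy, the "reason" for a choice is a sum over counts rather than a rule anyone wrote; scale the table up by nine orders of magnitude and the reason becomes unreadable.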
Big data operates at a scale that transcends our ordinary understanding.
For example, the correlation Google identified between a handful of
search terms and the flu was the result of testing 450 million
mathematical models. In contrast, Cynthia Rudin initially designed 106
predictors for whether a manhole might catch fire, and she could explain
to Con Edison’s managers why her program prioritized inspection sites
as it did. “Explainability,” as it is called in artificial intelligence
circles, is important for us mortals, who tend to want to know why, not
just what. But what if instead of 106 predictors, the system
automatically generated a whopping 601 predictors, the vast majority of
which had very low weightings but which, when taken together, improved
the model’s accuracy? The basis for any prediction might be staggeringly
complex. What could she tell the managers then to convince them to
reallocate their limited budget?...
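The problem the 601-predictor scenario poses can be made numerical with a small sketch. The weights below are random stand-ins, not Rudin's actual model; the point is only that hundreds of near-zero coefficients, taken together, can matter as much as the few large ones a manager could be told about.

    # A numerical sketch of the 601-predictor scenario. The weights are
    # random stand-ins, not the actual Con Edison model.

    import numpy as np

    rng = np.random.default_rng(0)

    n_features = 601
    weights = rng.uniform(0.0, 0.01, n_features)  # mostly very low weightings
    weights[:5] = [0.9, 0.7, 0.5, 0.4, 0.3]       # five big, explainable predictors

    x = rng.random(n_features)        # one manhole's feature vector
    score = weights @ x               # the model's fire-risk score

    top5 = weights[:5] @ x[:5]        # contribution of the big five
    tail = score - top5               # combined pull of the other 596

    print(f"total score:        {score:.3f}")
    print(f"from the top 5:     {top5:.3f}")
    print(f"from the other 596: {tail:.3f}")
    # When the tail rivals the top five, no short list of intuitive
    # reasons accounts for why one site outranks another.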