Chris Anderson, editor-in-chief at Wired, is generating
quite a bit of buzz with his article, "The End of Theory: The Data
Deluge Makes the Scientific Method Obsolete." The controversial article
suggests that in the "Age of the Petabyte," scientific theory is
becoming outdated. He cites Google's search engine as the example par excellence
in support of his claim that "Correlation supersedes causation, and
science can advance even without coherent models, unified theories, or
really any mechanistic explanation at all." Anderson suggests that mining
gargantuan amounts of data will produce as many new discoveries and
insights as years of scientific research.
There is no denying the importance of new data
technology, but Anderson fails to recognize that data mining cannot
replace science. Data collection is only one step of the scientific method:
observation. Anderson points to the work of biologist J. Craig Venter, who
uses supercomputers to sequence entire ecosystems. In doing so, Venter has
discovered thousands of previously unknown species of bacteria and
other life-forms. But what is the point of knowing these species are
out there (we already knew that much) if we don't know anything about
them? As one blog post points out, with Venter's data we can only make
a few guesses about the properties of the organisms based on who their
relatives are -- an activity that requires a little scientific theory
called evolution.
Data mining can only point us toward new discoveries by showing us
relationships between data points; it can't generate those discoveries
alone. Anderson quickly throws out every theory of human behavior: "Who
knows why people do what they do? The point is they do it, and we can track and
measure it with unprecedented fidelity. With enough data the numbers
speak for themselves." Call me old-fashioned (Anderson probably
would), but to me, the "what" doesn't really matter without the "why."
Stripped of its context, a number is just a factoid, a small puzzle
piece without the larger picture.
Without science, data is no better than babel. While data can lead to new
levels of understanding, Anderson's theory misses the point of
science: to intelligently understand the natural world. Anderson may be
too busy praising the Google gods to take note of the possibility of
the semantic web, where the goal is more than crunching data -- it's
understanding information. Although data mining may
change the rules of the science game, it's definitely not the end of
theory.
As we transition to the Age of the Petabyte, I don't see new
technology leaving scientific theory in the dust. Rather, theory will
be alive and kicking, as technology and science continue to evolve
side by side. And as surely as the clock ticks on, new rules will become
old ones, and one day even data mining will be replaced by a shiny new
method of generating information.