
Quote of the month:
"A scientific truth does not triumph by convincing its opponents and making them see the
light, but rather because its opponents eventually die and a new generation grows up that
is familiar with it."
Max Planck
Further reading:
Bayes or Bust?,
by John Earman. Earman (a professor of History and Philosophy of Science at the University
of Pittsburgh) argues that Bayesianism provides the best hope for a comprehensive and
unified account of scientific inference, yet the presently available versions of
Bayesianism fail to do justice to several aspects of the testing and confirming of
scientific theories and hypotheses. By focusing on the need for a resolution to this
impasse, Earman sharpens the issues on which a resolution turns.
Web links:
A collection of Bayesian sites to find software, theory, and discussions.
A slide show providing an introduction to Bayesian statistics.
A Bayesian statistics reading list.

How does science work,
really? You can read all about it in plenty of texts in philosophy of science, but if you
have ever experienced the making of science on an everyday basis, chances are you will
feel dissatisfied with the airtight account given by philosophers. Too neat, not enough
mess. To be sure, I am not denying the existence of the
scientific method(s), as radical philosopher Paul Feyerabend is infamously known for
having done. But I know from personal experience that scientists don't spend their
time trying to falsify hypotheses, as Karl Popper wished they did. By the same token,
while occasionally particular scientific fields do undergo periods of upheaval, Thomas
Kuhn's distinction between "normal science" and "scientific
revolutions" is too simple. Was the neo-Darwinian synthesis of the 1930s and
'40s in evolutionary biology a revolution or just a significant adjustment? Was Eldredge
and Gould's theory of punctuated equilibria to explain certain features
of the fossil record a blip on the screen or, at least, a minor revolution?
But, perhaps, the least convincing feature of the scientific method
is not something theorized by philosophers, but something actually practiced by almost
every scientist, especially those involved in heavily statistical disciplines such as
organismal biology and the social sciences. Whenever we run an experiment, we analyze the
data to check whether the so-called "null hypothesis" can be rejected. If so, we open a
bottle of champagne and proceed to write up the results, placing a new small brick in the
edifice of knowledge.
Let me explain. A null hypothesis is what would happen if nothing
happened. Suppose you are testing the effect of a new drug on the remission of breast
cancer. Your null hypothesis is that the drug has no effect: within a properly controlled
experimental population, the subjects receiving the drug do not show a statistically
significant difference in their remission rate when compared to those who did not receive
the drug. If you can reject the null, this is great news: the drug is working, and you
have made a potentially important contribution toward bettering humanity's welfare.
Or have you?
The problem is that the whole idea of a null hypothesis, introduced
in statistics by none other than Sir Ronald Fisher (the father of much of modern statistical
analysis), constrains our questions to "yes" and "no" answers. Nature
is much too subtle for that. We probably had a pretty good idea, before we even started
the experiment, that the null hypothesis was going to be rejected. After all, surely we
don't embark on costly (both in terms of material resources and of human potential)
experiments just on the whim of the moment. We don't randomly test all possible
chemical substances for their role as potential anti-carcinogens. What we really want to
know is whether the new drug performed better than other, already known, ones, and by how
much. That is, every time we run an experiment we have two factors that Fisherian (also
known as "frequentist," see below) statistics does not take into account: first,
we have a priori expectations about the outcome of the experiments, i.e., we don't
enter the trial as a blank slate (contrary to what is assumed by most statistical tests);
second, we normally compare more than two hypotheses (often several), and the least
interesting of them is the null one.
An increasing number of statisticians and scientists are beginning
to realize this, and are ironically turning to a solution that was devised, and widely
used, well before Fisher. That solution was contained in an obscure paper by one
Reverend Thomas Bayes, published posthumously in 1763, and it is revolutionizing how
scientists do their work, as well as how philosophers think about science.
Bayesian statistics simply acknowledges that what we are really
after is an estimate of the probability that a certain hypothesis is true, given what we
know before running an experiment, as well as what we learn from the experiment itself.
Indeed, a simple formula known as Bayes' theorem says that the probability that a
hypothesis (among many) is correct, given the available data, is proportional to the
probability that the data would be observed if that hypothesis were true, multiplied by the
a priori probability (i.e., based on previous experience) that the hypothesis is true.
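For readers who prefer symbols, the theorem can be written as follows. The notation is my own assumption for illustration, not something fixed by the argument: H_1, ..., H_n stand for the competing hypotheses (taken here to be mutually exclusive and to cover all the possibilities), and D stands for the data.

```latex
P(H_i \mid D) \;=\; \frac{P(D \mid H_i)\, P(H_i)}{\sum_{j=1}^{n} P(D \mid H_j)\, P(H_j)}
```

The numerator is the product described in words above. The denominator merely rescales the results so that the probabilities of all the hypotheses add up to one.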
In Fisherian terms, the probability of an event is the frequency
with which that event would occur given certain circumstances (hence the term
"frequentist" to identify this classical approach). For example, the probability
of rolling a three with one (unloaded) die is 1/6, because there are six possible,
equiprobable outcomes, and on average (i.e., over long enough runs) you will get a three one
time in six.
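The frequentist reading of the die can be tried out with a small simulation. What follows is only a sketch: the function name, the seed, and the run lengths are arbitrary choices of mine, and a pseudo-random generator stands in for a real die.

```python
import random


def frequency_of_three(num_rolls, seed=0):
    """Roll a simulated fair die num_rolls times; return the fraction of threes."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    threes = sum(1 for _ in range(num_rolls) if rng.randint(1, 6) == 3)
    return threes / num_rolls


if __name__ == "__main__":
    # Longer runs should drift toward 1/6 (about 0.1667).
    for n in (60, 6_000, 60_000):
        print(n, round(frequency_of_three(n), 4))
```

Short runs can land fairly far from 1/6, and only the long ones settle near it, which is exactly what "on long enough runs" means.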
In Bayesian terms, however, a probability is really an estimate of
the degree of belief (as in confidence, not blind faith) that a researcher can put into a
particular hypothesis, given all she knows about the problem at hand. Your degree of
belief that threes come out once every six rolls of the die comes from both a priori
considerations about fair dice, and the empirical fact that you have observed this sort of
event in the past. However, should you witness the same outcome over and
over, your degree of belief in the hypothesis of a fair die would keep going down until
you strongly suspected foul play. It makes intuitive sense that the degree of confidence in
a hypothesis changes with the available evidence, and one can think of different
scientific hypotheses as competing for the highest degree of Bayesian probability. New
experiments will lower our confidence in some hypotheses, and raise it in others.
Importantly, we might never be able to settle on one final hypothesis, because the data
may be roughly equally compatible with several alternatives (a frustrating situation very
familiar to any scientist and known in philosophy as the underdetermination of hypotheses
by the data).
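The die example above can be turned into a few lines of code. This is a sketch under assumptions that are entirely mine: there are only two hypotheses, a fair die and a hypothetical loaded die that shows a three half of the time, and we start out 99% confident that the die is fair.

```python
from fractions import Fraction

# Assumed probability of rolling a three under each hypothesis.
P_THREE = {"fair": Fraction(1, 6), "loaded": Fraction(1, 2)}

# Assumed a priori degrees of belief before any roll is seen.
PRIOR = {"fair": Fraction(99, 100), "loaded": Fraction(1, 100)}


def update(belief, likelihoods):
    """Apply Bayes' theorem once: prior times likelihood, then renormalize."""
    unnormalized = {h: belief[h] * likelihoods[h] for h in belief}
    total = sum(unnormalized.values())
    return {h: p / total for h, p in unnormalized.items()}


def belief_after_threes(n, prior=PRIOR):
    """Degrees of belief after watching n threes in a row."""
    belief = dict(prior)
    for _ in range(n):
        belief = update(belief, P_THREE)
    return belief


if __name__ == "__main__":
    for n in (0, 2, 4, 6, 10):
        print(n, float(belief_after_threes(n)["fair"]))
```

With these made-up numbers each additional three triples the odds in favor of the loaded die, so confidence in a fair die falls from 99% to 55% after four threes in a row, and to well under 1% after ten.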
You can see why a Bayesian description of the scientific enterprise,
while not devoid of problems and critics, is revealing itself to be a
tantalizing tool both for scientists, in their everyday practice, and for philosophers, as
a more realistic way of thinking about science as a process.
Perhaps more importantly, Bayesian analyses are allowing researchers
to save money and human lives during clinical trials because they permit the researcher to
constantly re-evaluate the likelihood of different hypotheses during the experiment. If we
don't have to wait for a long and costly clinical trial to be over before realizing
that, say, two of the six drugs being tested are, in fact, significantly better than the
others, Reverend Bayes might turn out to be a much more important figure in science than
anybody has imagined over the last two centuries.
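To give a flavor of what re-evaluating hypotheses during a trial might look like, here is a rough sketch. Everything in it is an assumption for illustration: the six drugs and their interim counts are invented, each remission rate starts from a uniform prior, and the chance that a drug is the best one is estimated by random sampling. Real adaptive trials follow pre-specified designs that are far more careful than this.

```python
import random


def prob_each_is_best(results, draws=20_000, seed=0):
    """Estimate P(drug has the highest remission rate | data so far).

    results maps a drug name to (remissions, patients treated so far).
    With a uniform Beta(1, 1) prior on each rate, the posterior is
    Beta(1 + remissions, 1 + failures); we sample from it repeatedly
    and count how often each drug comes out on top.
    """
    rng = random.Random(seed)
    wins = {name: 0 for name in results}
    for _ in range(draws):
        sampled = {
            name: rng.betavariate(1 + r, 1 + n - r)
            for name, (r, n) in results.items()
        }
        wins[max(sampled, key=sampled.get)] += 1
    return {name: count / draws for name, count in wins.items()}


# Hypothetical interim data: (remissions, patients) for six drugs.
INTERIM = {
    "A": (6, 20), "B": (5, 20), "C": (14, 20),
    "D": (7, 20), "E": (13, 20), "F": (4, 20),
}

if __name__ == "__main__":
    for name, p in sorted(prob_each_is_best(INTERIM).items()):
        print(name, round(p, 3))
```

The same function can be called again each time new patients are evaluated, so the ranking of the drugs is revised as the evidence comes in rather than only at the end of the trial.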
Next month: "Is philosophy useless?"
© by Massimo Pigliucci, 2002
Many thanks to Melissa Brenneman and Bob Faulkner for patiently editing and commenting on
Rationally Speaking columns.