In our last post we discussed statistical modelling. Briefly, we argued that the main idea of
statistical models lies in the qualitative and quantitative description of reality. If
one assumes that the data sampling and modelling process is surrounded by
several sources of uncertainty, then summarizing these
uncertainties as probabilities turns out to be the most practical way to
quantify them.

Indeed, within statistics (which may be described as the mathematics of
uncertainty) probability theory is a keystone, because through it one can
quantitatively carry out all inference about the parameters of a tested
model. Nevertheless, adopting a particular
concept of probability has not always been so straightforward. Yes, there
are different ways to define probability, and this has led to an
endless debate about whether one is better than another.

Currently, statistical inference can be divided into two main
strands: frequentist (or classical) inference and Bayesian inference. It is
not our intention here to deepen the philosophical discussion of the two approaches.
We will, however, briefly outline the main differences between them, and also give
some reasons why we lean towards Bayesian inference in most of our
studies. So, for the sake of simplicity, let’s introduce a quick way to
contrast the two inferences through the impressive archery skills of Merida (from the Disney animation).

*Figure 1: Comics contrasting frequentist and Bayesian inference. Adapted from http://faculty.washington.edu/kenrice/BayesIntroClassEpi515.pdf*

**The frequentist way of thinking**

What can we learn from these short comics? Well, first, within frequentist inference probability is defined as the long-run relative frequency of events, and for that one has to assume a purely random and well-defined experiment. For instance, in the comics the experiment would consist of Merida randomly shooting arrows at the target point (bullseye), and of you, sitting behind the target, trying to estimate the exact position of the bullseye.

For every shot, you draw a 10 cm circle around the arrow and assume that
this circle includes the bullseye with 95% confidence. If
Merida keeps shooting arrows at random (e.g. 100 arrows) and you keep
drawing a 10 cm circle around each one, then in the end you would
notice that 95 of your circles overlap to some degree, while the remaining
5 fall completely outside the target point. What would your conclusion be?
Well, you would assume that the bullseye lies somewhere within the 95
overlapping circles, and you would be 95% confident about your estimate since
95 of the 100 shots fell very close to each other.

Note, however, that to obtain such an outcome, you and Merida would have to repeat this experiment over and over, each run being a new and independent experiment in relation to the previous one. In addition, although you would have 95% confidence in your estimate, it does not tell you the exact position of the bullseye, *i.e.*, is it truly in the 95% region, or could it also be within the remaining 5% region? So, what about Bayesian inference?
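The coverage logic above can be sketched in a short simulation. All the numbers here are hypothetical (a one-dimensional target, a made-up Gaussian spread for Merida's shots): the point is only that intervals built this way contain the true bullseye position in roughly 95% of repeated experiments.

```python
import random

random.seed(42)

BULLSEYE = 0.0      # true (unknown) bullseye position, 1-D for simplicity
SPREAD = 5.0        # hypothetical std. dev. of each arrow around the bullseye
N_ARROWS = 25       # arrows per experiment
# 95% confidence half-width for the mean of N_ARROWS Gaussian shots:
HALF_WIDTH = 1.96 * SPREAD / (N_ARROWS ** 0.5)

covered = 0
n_experiments = 1000
for _ in range(n_experiments):
    # one full experiment: Merida shoots N_ARROWS arrows
    shots = [random.gauss(BULLSEYE, SPREAD) for _ in range(N_ARROWS)]
    center = sum(shots) / len(shots)
    # does the interval around the observed center contain the true bullseye?
    if center - HALF_WIDTH <= BULLSEYE <= center + HALF_WIDTH:
        covered += 1

print(f"Coverage over {n_experiments} experiments: {covered / n_experiments:.1%}")
```

Note that the 95% statement is a property of the *procedure* across many repetitions, not of any single interval: each individual circle either contains the bullseye or it does not.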

**The Bayesian way of thinking**

Well, under the Bayesian prism, probability is defined as an
individual’s degree of belief in a particular event. Specifically, the
probability quantifies the plausibility attributed to a proposition
whose truthfulness is uncertain in the light of the available knowledge (Kinas
& Andrade, 2010).

Let’s go back to the comics example: although you are sitting behind the
target, you may have some knowledge about the bullseye’s location from
previous experience. This experience may come, for example, from the
fact that you have already run the same experiment with other archers and thus
have an overall idea of where the most probable target point could be, or
because you know Merida’s extraordinary abilities so well that you expect her first
shot to land very close to the bullseye (if not already on the
target!).

The fact is that within Bayesian inference you can use your previous experience and update your estimate as more and more arrows are shot. Thus, a new experiment (*i.e.*, a new shot) is conditioned on the results of the previous experiment, and you will be able to continuously update your estimates until you reach the most probable outcome. As Kruschke (2014) put it, *“Bayesian inference is reallocation of credibility across possibilities.”*
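This updating cycle can be sketched in a few lines of code. Purely for illustration, assume both your prior belief about the bullseye position and each shot's error are Gaussian (a convenient conjugate choice, not something the comics require); the arrow positions below are made up. Each posterior then serves directly as the prior for the next shot.

```python
# Sequential Bayesian updating of a 1-D bullseye position.
# Gaussian prior + Gaussian observation -> Gaussian posterior,
# so yesterday's posterior is literally today's prior.

def update(prior_mean, prior_var, shot, shot_var):
    """Combine the current belief with one observed shot (conjugate normal update)."""
    post_var = 1.0 / (1.0 / prior_var + 1.0 / shot_var)
    post_mean = post_var * (prior_mean / prior_var + shot / shot_var)
    return post_mean, post_var

mean, var = 0.0, 100.0              # vague prior: little idea where the bullseye is
for shot in [4.2, 5.1, 4.8, 5.3]:   # hypothetical arrow positions
    mean, var = update(mean, var, shot, shot_var=1.0)
    print(f"belief after shot at {shot}: mean={mean:.2f}, var={var:.3f}")
```

With every arrow the belief tightens (the variance shrinks) and the mean drifts towards where the arrows actually land, which is exactly the "reallocation of credibility" Kruschke describes.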

**So…why Bayesianism instead of Frequentism?**

Most of the criticism of Bayesian inference centres on its subjective definition
of probability. Such critiques intensified after the 1920s, when Karl Pearson and Ronald Fisher first developed statistics as an information science. Nevertheless,
this subjectivity is the cornerstone of Bayesian inference and represents a
consequence of the available information, not merely an arbitrary
quantification as is usually thought. Moreover, the scientific judgment about the
best choice of these probabilities is as necessary as the decision made when
choosing the most appropriate model for the data at hand (Kinas &
Andrade, 2010).

In conceptual terms, Bayesian inference is much simpler than
frequentist inference if we consider that all questions can be answered through
the analysis of the posterior distribution, which is obtained by means of Bayes’ Theorem (also known as the inverse probability theorem)[1]. In this light,
the posterior distribution denotes the most complete way to express the state
of knowledge about an investigated phenomenon.
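For reference, the theorem can be stated compactly: the posterior combines the prior with the likelihood of the observed data, and since the denominator does not depend on the parameter, the posterior is proportional to their product.

```latex
p(\theta \mid y) \;=\; \frac{p(y \mid \theta)\, p(\theta)}{p(y)}
\;\propto\; p(y \mid \theta)\, p(\theta)
```

Here \(\theta\) denotes the model parameters (the bullseye position, in the comics) and \(y\) the observed data (the arrows).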

*Figure 2: Thomas Bayes (upper left panel), Pierre S. Laplace (upper right panel), Karl Pearson (lower left panel), and Ronald Fisher (lower right panel).*
The essence of Bayesian inference lies precisely in the interactive
dynamism between previous experience (known as the *prior* in Bayesian statistical terminology) and the current experiment (known as the likelihood), which jointly reallocate the credibility expressed in the posterior distribution (from which all necessary inferences are drawn). This also shows that today’s posterior distribution can become tomorrow’s prior distribution (remember the comics), something that can never be assumed in the context of classical inference, since all experiments are independent of each other.
Furthermore, according to Jaynes (2003), it is also at this stage that
one of the major differences between the two inferential branches arises, because
probabilities change as we change our state of knowledge; frequencies, by
contrast, do not. This also explains why certain questions cannot be
answered under the frequency definition of probability. In the environmental
sciences, for example, questions such as *“what is the probability that the current water use will remain sustainable?”* or *“what is the probability that area A presents greater potential for conservation than area B?”* can only be answered under the Bayesian prism.
Thus, knowing that uncertainty is inherent to all scientific realms, its
inclusion in the decision-making process is not only desirable but
essential. Indeed, Bayesian inference has been successfully applied in a wide
range of situations, such as cracking the German Enigma code during World War II, searching
for the black box of the Air France aircraft that crashed in the middle of the
Atlantic Ocean in 2009, artificial intelligence, courtrooms, and medicine[2]. Bayesian
approaches have also proven to be a powerful tool in the environmental sciences, as
they make it possible to incorporate all types of uncertainty into conservation and
management decisions, and hence to prevent catastrophes that might be
irreversible.

As mentioned at the beginning of this post, the intention was not to
discuss exhaustively every topic in the debate between frequentist and
Bayesian inference. In fact, a wide range of interesting books and articles
offers in-depth discussion of this issue, highlighting pros and cons. If you
are interested in further details on the incorporation of Bayesian inference
into the ecological context, we recommend the articles by
Dennis (1996), Ellison (2004), Clark (2005), and Cressie *et al*. (2009). If you are interested in more theoretical issues, you may refer to Jaynes (2003), McCarthy (2007), Gelman *et al*. (2013), and Kruschke (2014).
[1] Though this theorem is named after the legacy left by the English
Reverend Thomas Bayes, its practical development and application to scientific problems was
mainly done by the French scientist Pierre S. Laplace, which is also why some claim that it should
rather be named the Bayes-Laplace Theorem!

[2] For an easy read about the historical application of Bayesian inference, the book *The theory that would not die: how Bayes’ rule cracked the enigma code, hunted down Russian submarines & emerged triumphant from two centuries of controversy*, written by Sharon B. McGrayne, is highly recommended.

**References**

Clark, J.S. 2005. Why environmental scientists are becoming Bayesians. Ecol. Let., 8: 2-14.

Cressie, N., Calder, C.A., Clark, J.S., Ver Hoef, J.M. & Wikle, C.K. 2009. Accounting for uncertainty in ecological analysis: the strengths and limitations of hierarchical statistical modeling. Ecol. Appl., 19: 553-570.

Dennis, B. 1996. Should ecologists become Bayesians? Ecol. Appl., 6: 1095-1103.

Ellison, A.M. 2004. Bayesian inference in ecology. Ecol. Let., 7: 509-520.

Gelman, A.; Carlin, J.B.; Stern, H.S.; Dunson, D. B.; Vehtari, A. & Rubin, D. B. 2013. Bayesian data analysis, Chapman & Hall/CRC Press, 675p.

Jaynes, E.T. 2003. Probability Theory – The Logic of Science. Cambridge University Press, 727p.

Kinas, P.G. & Andrade, H.A. 2010. Introdução à Análise Bayesiana (com R). maisQnada, 240p.

Kruschke, J.K. 2014. Doing Bayesian data analysis: a tutorial with R, JAGS, and Stan. Elsevier, 759p.

McCarthy, M.A. 2007. Bayesian methods for Ecology. Cambridge University Press, 306p.

*By Marie-Christine Rufener*