January 21, 2014

**[Update Oct 2014: Due to some changes to the Bayes factor calculator webpage, and as I understand BFs much better now, this post has been updated …]**

I started to familiarize myself with Bayesian statistics. In this post I’ll show some insights I had about *Bayes factors* (BF).

Bayes factors provide a numerical value that quantifies how well a hypothesis predicts the empirical data relative to a competing hypothesis. For example, a BF of 4 indicates: “These empirical data are 4 times more probable if H₁ were true than if H₀ were true.” Hence, the evidence points towards H₁. A BF of 1 means that the data are equally likely to occur under both hypotheses.

More formally, the BF can be written as:

$$BF_{10} = \frac{p(D \mid H_1)}{p(D \mid H_0)}$$

where D is the data. Hence, the BF is a ratio of probabilities and is related to the larger class of likelihood-ratio tests.
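To make the ratio concrete, here is a minimal sketch in Python. It is an illustration only: the coin-flip data (7 heads in 10 flips) and the two point hypotheses (θ = 0.7 vs. θ = 0.5) are assumptions made up for this example. Default Bayes factors average the likelihood over a prior distribution rather than using a single point value for H₁.

```python
from math import comb


def binomial_likelihood(k, n, theta):
    """Probability of observing k successes in n trials, given success rate theta."""
    return comb(n, k) * theta**k * (1 - theta) ** (n - k)


# Hypothetical data: 7 heads in 10 coin flips.
k, n = 7, 10

# Two illustrative point hypotheses (assumed values, not from the post).
p_data_h1 = binomial_likelihood(k, n, theta=0.7)  # p(D | H1)
p_data_h0 = binomial_likelihood(k, n, theta=0.5)  # p(D | H0)

# The Bayes factor "H1 over H0" is the ratio of the two probabilities.
bf10 = p_data_h1 / p_data_h0
print(f"BF10 = {bf10:.2f}")  # about 2.28
```

In this toy case the data are a bit more than twice as probable under H₁ as under H₀.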

What researchers usually are interested in is not p(Data | Hypothesis), but rather p(Hypothesis | Data). Using Bayes’ theorem, the former can be transformed into the latter by assuming prior probabilities for the hypotheses. The BF then tells one how to update one’s prior probabilities after having seen the data, using this formula (Berger, 2006):

$$\frac{p(H_1 \mid D)}{p(H_0 \mid D)} = BF_{10} \times \frac{p(H_1)}{p(H_0)}$$

Given a BF of 1, one does not have to update one’s priors. If one holds, for example, equal priors (p(H1) = p(H0) = .5), these probabilities do not change after having seen the data of the original study.
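The updating step can be sketched in a few lines of Python. The function name `posterior_prob_h1` is hypothetical, and the sketch assumes that H1 and H0 are the only two hypotheses under consideration, so that their probabilities sum to 1.

```python
def posterior_prob_h1(bf10, prior_h1=0.5):
    """Update p(H1) given a Bayes factor 'H1 over H0'.

    Assumes exactly two hypotheses, so p(H0) = 1 - p(H1).
    """
    prior_odds = prior_h1 / (1.0 - prior_h1)
    posterior_odds = bf10 * prior_odds
    # Convert odds back to a probability.
    return posterior_odds / (1.0 + posterior_odds)


# A BF of 1 leaves equal priors untouched.
print(posterior_prob_h1(1.0, prior_h1=0.5))
# A BF of 4 moves equal priors towards H1.
print(posterior_prob_h1(4.0, prior_h1=0.5))
```

With equal priors, a BF of 4 corresponds to a posterior probability of about .8 for H1. With a skeptical prior of .2, the same BF only brings H1 up to .5.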

The best detailed introduction to BFs I know of can be found in Richard Morey’s blog posts [1] [2] [3]. Also helpful is the ever-growing tutorial page for the BayesFactor package. (For other introductions to BFs, see Wikipedia, Bayesian-Inference, the classic paper by Kass and Raftery, 1995, or Berger, 2006.)

Although many authors agree about the many theoretical advantages of BFs, until recently it was complicated and unclear how to compute a BF even for simple standard designs (Rouder, Morey, Speckman, & Province, 2012). Fortunately, in recent years *default Bayes factors* for several standard designs have been developed (Rouder et al., 2012; Rouder, Speckman, Sun, Morey, & Iverson, 2009; Morey & Rouder, 2011). For example, for a two-sample *t* test, a BF can be derived simply by plugging the *t* value and the sample sizes into a formula. The BF is easy to compute with the R package BayesFactor (Morey & Rouder, 2013), or with online calculators [1][2].
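As a rough illustration of “plugging the *t* value and the sample sizes into a formula”, the following Python sketch evaluates the JZS Bayes factor as I understand it from Rouder et al. (2009), with the unit prior scale (r = 1). Several things here are assumptions: the function name `jzs_bf10` is made up, the integral is approximated with a simple midpoint rule, and the BayesFactor package uses its own numerics and a different default prior scale, so the numbers will not match the package’s default output. For real analyses, use the package or the online calculators.

```python
import math


def jzs_bf10(t, n1, n2, steps=20000):
    """Approximate JZS Bayes factor (H1 over H0) for a two-sample t test.

    Sketch of the Rouder et al. (2009) formula with unit prior scale (r = 1).
    """
    n_eff = n1 * n2 / (n1 + n2)  # effective sample size for two samples
    nu = n1 + n2 - 2             # degrees of freedom

    def integrand(g):
        a = 1.0 + n_eff * g
        return (a ** -0.5
                * (1.0 + t * t / (a * nu)) ** (-(nu + 1) / 2.0)
                * (2.0 * math.pi) ** -0.5
                * g ** -1.5
                * math.exp(-1.0 / (2.0 * g)))

    # Midpoint rule on (0, 1) after substituting g = u / (1 - u),
    # which maps the infinite range of g onto a finite interval.
    h = 1.0 / steps
    term_h1 = 0.0
    for i in range(steps):
        u = (i + 0.5) * h
        g = u / (1.0 - u)
        term_h1 += integrand(g) / (1.0 - u) ** 2 * h

    # Corresponding term under the null hypothesis (effect size fixed at 0).
    term_h0 = (1.0 + t * t / nu) ** (-(nu + 1) / 2.0)
    return term_h1 / term_h0


print(jzs_bf10(0.0, 20, 20))  # below 1: evidence leans towards H0
print(jzs_bf10(5.0, 50, 50))  # far above 1: strong evidence for H1
```

The only inputs are the *t* value and the two group sizes, which is what makes default Bayes factors convenient for re-analyzing published results.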

When I started to familiarize myself with BFs, I was sometimes confused, as the same number seemed to mean different things in different publications. And indeed, **four types of Bayes factors** can be distinguished. “Under the hood”, all four types are identical, but you have to be aware of which type has been employed in the specific case.

The first distinction is whether the BF indicates “H₁ over H₀” (= BF₁₀) or “H₀ over H₁” (= BF₀₁). A BF₁₀ of 2 means “Data are 2 times more likely to occur under H₁ than under H₀”, while the same situation would be a BF₀₁ of 0.5 (i.e., the reciprocal value 1/2). Intuitively, I prefer larger values to be “better”, and as I usually would like to have evidence for H₁ instead of H₀, I usually prefer the BF₁₀. However, if your goal is to show evidence *for* the H0, then BF₀₁ is easier to communicate (compare: “Data were 0.1 times as likely under the alternative” vs. “Data show 10 times more evidence for the null than for the alternative”).

The second distinction is whether one reports the raw BF or the natural logarithm of the BF (the log(BF) has also been called the “*weight of evidence*”; Good, 1985). The logarithm has the advantage that the raw scale above 1 (evidence for H₁) and the raw scale below 1 (evidence for H₀) become symmetric around 0. In the previous example, a BF₁₀ of 2 is equivalent to a BF₀₁ of 0.5. Taking the log of both values leads to log(BF₁₀) = 0.69 and log(BF₀₁) = -0.69: same value, reversed sign. This makes the log(BF) ideal for visualization, as the scale is linear in both directions. The following graphic shows the relationship between raw and log BFs:

As you can see in the table in Figure 1, different authors use different flavors. This often makes sense, as we sometimes want to communicate evidence for the H1, and sometimes for the H0. However, for the uninitiated it can sometimes be confusing.

Usually, tables in publications report the raw BF (raw- or raw+). Plots, in contrast, typically use the log scale, for example:

Figure 2 shows conversion paths of the different BF flavors:

The user interface functions of the BayesFactor package always print the raw BF₁₀. Internally, however, the BF is stored as log(BF₁₀).

Hence, you have to be careful when you directly use the backend utility functions, such as `ttest.tstat`. These functions return the log(BF₁₀). As the conversion table shows, you have to `exp()` that number to get the raw BF. Check the documentation of the functions if you are unsure which flavor is returned.
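The conversion paths between the four flavors boil down to two operations: taking the reciprocal (or flipping the sign on the log scale) switches the direction, and `log`/`exp` switches between the raw and log scale. A minimal Python sketch, where the function name `convert_bf` and the flavor labels are hypothetical choices for this illustration and not part of the BayesFactor package:

```python
import math


def convert_bf(value, source, target):
    """Convert a Bayes factor between the four flavors.

    Flavor labels (illustrative): "BF10", "BF01", "logBF10", "logBF01".
    Logs are natural logarithms.
    """
    # Step 1: express the input as log(BF10).
    if source == "BF10":
        log_bf10 = math.log(value)
    elif source == "BF01":
        log_bf10 = -math.log(value)
    elif source == "logBF10":
        log_bf10 = value
    elif source == "logBF01":
        log_bf10 = -value
    else:
        raise ValueError(f"unknown flavor: {source}")

    # Step 2: express log(BF10) in the requested flavor.
    if target == "BF10":
        return math.exp(log_bf10)
    if target == "BF01":
        return math.exp(-log_bf10)
    if target == "logBF10":
        return log_bf10
    if target == "logBF01":
        return -log_bf10
    raise ValueError(f"unknown flavor: {target}")


print(convert_bf(2.0, "BF10", "BF01"))          # reciprocal
print(convert_bf(2.0, "BF10", "logBF10"))       # natural log
print(convert_bf(0.6931, "logBF10", "BF10"))    # exp() recovers the raw BF
```

The last call mirrors the `exp()` step needed when a function hands back a log BF instead of a raw one.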

Related posts: Exploring the robustness of Bayes Factors: A convenient plotting function

Berger, J. O. (2006). Bayes factors. In S. Kotz, N. Balakrishnan, C. Read, B. Vidakovic, & N. L. Johnson (Eds.), *Encyclopedia of Statistical Sciences, vol. 1 (2nd ed.)* (pp. 378–386). Hoboken, NJ: Wiley.

Good, I. J. (1985). Weight of evidence: A brief survey. In J. M. Bernardo, M. H. DeGroot, D. V. Lindley, & A. F. M. Smith (Eds.), *Bayesian Statistics 2* (pp. 249–270). Elsevier.

Morey, R. D. & Rouder, J. N. (2011). Bayes factor approaches for testing interval null hypotheses. Psychological Methods, 16(4), 406–419. PMID: 21787084. doi:10.1037/a0024377

Morey, R. D. & Rouder, J. N. (2013). BayesFactor: Computation of Bayes factors for common designs. R package version 0.9.4. Retrieved from http://CRAN.R-project.org/package=BayesFactor

Rouder, J. N., Morey, R. D., Speckman, P. L., & Province, J. M. (2012). Default Bayes factors for ANOVA designs. Journal of Mathematical Psychology, 56(5), 356–374. doi:10.1016/j.jmp.2012.08.001

Rouder, J. N., Speckman, P. L., Sun, D., Morey, R. D., & Iverson, G. (2009). Bayesian t tests for accepting and rejecting the null hypothesis. Psychonomic Bulletin & Review, 16(2), 225–237.
