by Angelika Stefan & Felix Schönbrodt
When reading about Bayesian statistics, you regularly come across terms like “objective priors”, “prior odds”, “prior distribution”, and “normal prior”. However, it may not be intuitively clear that the meaning of “prior” differs in these terms. In fact, there are two meanings of “prior” in the context of Bayesian statistics: (a) prior plausibilities of models, and (b) the quantification of uncertainty about model parameters. As this often leads to confusion for novices in Bayesian statistics, we want to explain these two meanings of priors in the next two blog posts*. The current blog post covers the first meaning of priors (link to part II).
In this context, the term “prior” incorporates the personal assumptions of a researcher on the probability of a hypothesis (p(H1)) relative to a competing hypothesis, which has the probability p(H2). Hence, the meaning of this prior is “how plausible do you deem a model relative to another model before looking at your data”. The ratio of the two priors of the models, that is “how probable do you consider H1 compared to H2”, is called “prior odds”: p(H1) / p(H2).
The first meaning of priors is used in the context of Bayes factor analysis, where you compare two different hypotheses. In Bayes factor analysis, prior odds are updated by the likelihood ratio of the two hypotheses, which contains the information from the data, and result in the posterior odds (“what you believe after looking at your data”):
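Written out in the notation introduced above, with D standing for the observed data (a symbol added here for illustration), this updating rule can be sketched as:

```latex
\[
\underbrace{\frac{p(H_1 \mid D)}{p(H_2 \mid D)}}_{\text{posterior odds}}
=
\underbrace{\frac{p(D \mid H_1)}{p(D \mid H_2)}}_{\text{likelihood ratio}}
\times
\underbrace{\frac{p(H_1)}{p(H_2)}}_{\text{prior odds}}
\]
```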
The prior belief is called “subjective”, but this label does not imply that it is “arbitrary”, “unprincipled”, or “irrational”. On the contrary, the prior belief can (and preferably should) be informed by previous data or experiences. For example, it can be a belief that started with an equipoise (50/50) position, but has been repeatedly updated by data. Within the bounds of rationality and consistency, however, people can still come to considerably different prior beliefs, and might have good arguments for their choice – that’s why it is called “subjective”. Initially differing degrees of belief will nevertheless converge as more and more evidence comes in. We will observe this in the following example.
The classical experiment of tasting tea has already been described in the context of Bayesian hypothesis testing by Lindley (1993). We will present a simpler form here. Dr. Muriel Bristol, a scientist working in the field of algal biology who was acquainted with the statistician R. A. Fisher, claimed that she could discriminate whether milk is put in a cup before or after the tea infusion during the process of preparing tea with milk. However, Mr. Fisher considered this very unlikely.
So they decided to run an experiment: Muriel Bristol tasted several cups of tea in a row, making a guess on the preparation procedure for each cup. Unlike in the original story, where inferential statistics were consulted to solve the disagreement, we will employ Bayesian statistics to track how prior convictions in this example change. If Muriel Bristol makes her guesses only based on chance as Mr. Fisher supposes, she has a probability of success of 50% in each trial. Before observing her performance, Mr. Fisher should therefore consider it very likely that she is right about the procedure in about 50% of all trials. We can therefore assume a point hypothesis: HFisher: Success rate (SR) = 0.5. Muriel Bristol, on the other hand, is very confident in her divination skills. She claims to get 80% of trials correct, which can likewise be captured in a point hypothesis: HMuriel: SR = 0.8.
To introduce prior beliefs about hypotheses and show how they change with upcoming evidence, we want to introduce two additional persons. The first one is a slightly skeptical observer who tends to favor HFisher, but does not completely rule out that Mrs. Bristol could be right with her hypothesis. More formally, we could describe this position as: P(HFisher) = 0.6 and P(HMuriel) = 0.4. This means that his prior odds are P(HFisher)/P(HMuriel) = 3:2. Fisher’s hypothesis is 1.5 times more likely to him than Muriel Bristol’s hypothesis.
The second additional person we would like to introduce is William, Muriel Bristol’s loving husband who fervently advocates her position. He knows his wife would never (or at least very rarely) make wrong claims, concerning tea preparation and all other issues of their marriage. He therefore assigns a much higher subjective probability to her hypothesis (P(HMuriel) = 0.9) than to the one of Mr. Fisher (P(HFisher) = 0.1). His prior odds are therefore P(HFisher)/P(HMuriel) = 1:9. Please note that the content of the hypotheses (the proposed success rates 0.5 and 0.8, which are the parameters of the model) is logically independent of the probability of the hypotheses (priors) that our two observers have.
During the process of hypothesis testing, these two priors are updated with the existing evidence. It is reported that Muriel Bristol’s performance at the experiment was extraordinarily good. We therefore assume that out of the first 6 trials of the experiment she got 5 correct.
With this information, we can now compute the likelihood of the data under each of the hypotheses (for more information on the computation of likelihoods, see Alexander Etz’s blog):
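As a minimal sketch, the two likelihoods can be computed in Python, assuming that the guesses are independent so that the number of correct guesses follows a binomial distribution. The function name `binomial_likelihood` and the variable names are chosen here for illustration only.

```python
from math import comb


def binomial_likelihood(successes: int, trials: int, success_rate: float) -> float:
    """Probability of exactly `successes` correct guesses in `trials` independent trials."""
    failures = trials - successes
    return comb(trials, successes) * success_rate**successes * (1 - success_rate)**failures


# Data from the example: 5 correct guesses out of 6 trials
like_fisher = binomial_likelihood(5, 6, 0.5)  # H_Fisher: SR = 0.5
like_muriel = binomial_likelihood(5, 6, 0.8)  # H_Muriel: SR = 0.8

print(f"P(data | H_Fisher) = {like_fisher:.3f}")  # about 0.094
print(f"P(data | H_Muriel) = {like_muriel:.3f}")  # about 0.393
```

Note that the prior model probabilities of the observers appear nowhere in this computation.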
The computation of the likelihoods does not involve the prior model probabilities of our observers. What can be seen is that the data are more likely under Muriel Bristol’s hypothesis than under Mr. Fisher’s. This should not come as a surprise as Muriel Bristol claimed that she could make a very high percentage of right guesses and the data show a very high percentage of right guesses whereas Mr. Fisher assumed a much lower percentage of right guesses. To emphasize this difference in likelihoods and to assign it a numerical value, we can compute the likelihood ratio (Bayes factor):
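With the binomial likelihoods for 5 correct guesses out of 6 trials (recomputed here from the numbers given above, with D denoting the data), the ratio works out as:

```latex
\[
BF_{\text{Muriel},\,\text{Fisher}}
= \frac{P(D \mid H_{\text{Muriel}})}{P(D \mid H_{\text{Fisher}})}
= \frac{\binom{6}{5}\, 0.8^{5} \cdot 0.2^{1}}{\binom{6}{5}\, 0.5^{5} \cdot 0.5^{1}}
\approx \frac{0.3932}{0.0938}
\approx 4.19
\]
```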
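This ratio means that the data are 4.19 times as likely under Mrs. Bristol’s hypothesis as under Mr. Fisher’s hypothesis. It does not matter how you order the likelihoods in the fraction; the meaning remains the same (see this blog post). The inverted ratio, 1/4.19 ≈ 0.24, simply expresses the same evidence from the perspective of Mr. Fisher’s hypothesis.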
How does this likelihood ratio change the prior odds of both our slightly skeptical observer and William Bristol? Bayes’ theorem shows that prior odds can be updated by multiplying them with the likelihood ratio (Bayes factor):
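For the two hypotheses in this example, and keeping the odds in the order Fisher over Muriel used for the prior odds above, the relation can be written as follows (D denotes the data). Note that the Bayes factor then has to be taken in the same direction, Fisher over Muriel, that is 1/4.19:

```latex
\[
\frac{P(H_{\text{Fisher}} \mid D)}{P(H_{\text{Muriel}} \mid D)}
=
\frac{P(D \mid H_{\text{Fisher}})}{P(D \mid H_{\text{Muriel}})}
\times
\frac{P(H_{\text{Fisher}})}{P(H_{\text{Muriel}})}
\]
```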
First, we will focus on the posterior odds of the slightly skeptical observer. To recap, the slightly skeptical observer had assigned a probability of 0.6 to Mr. Fisher’s hypothesis and a probability of 0.4 to Muriel Bristol’s hypothesis before seeing the data, which resulted in prior odds of 3:2 for Mr. Fisher’s hypothesis. How do these convictions change now that the experiment has been conducted? To examine this, we simply have to insert all known values in the equation:
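Plugging in the values stated above (Bayes factor of 4.19 in favor of HMuriel, hence 1/4.19 in the direction Fisher over Muriel, and prior odds of 3:2), a worked version of this step looks like:

```latex
\[
\frac{P(H_{\text{Fisher}} \mid D)}{P(H_{\text{Muriel}} \mid D)}
= \frac{1}{4.19} \times \frac{3}{2}
\approx 0.358
\approx \frac{1}{2.8}
\]
```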
This shows that the prior odds of the slightly skeptical observer changed from 3:2 to posterior odds of 1:2.8. This means that whereas before the experiment the slightly skeptical observer had deemed Mr. Fisher’s hypothesis more plausible than Mrs. Bristol’s hypothesis, he changed his opinion after seeing the data, now preferring Mrs. Bristol’s hypothesis over Mr. Fisher’s.
The same equation can be applied to William Bristol’s prior odds:
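With William Bristol's prior odds of 1:9 and the same Bayes factor, the calculation (recomputed here from the values given in the text) can be sketched as:

```latex
\[
\frac{P(H_{\text{Fisher}} \mid D)}{P(H_{\text{Muriel}} \mid D)}
= \frac{1}{4.19} \times \frac{1}{9}
\approx 0.0265
\approx \frac{1}{37.7}
\]
```

In other words, after seeing the data William considers his wife's hypothesis roughly 38 times as plausible as Mr. Fisher's.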
What we can notice is that after taking the data into consideration both posterior odds display a higher amount of agreement with Muriel Bristol and reduced confidence in Mr. Fisher’s hypothesis. Whereas the convictions of the slightly skeptical observer were changed in favor of Muriel Bristol’s hypothesis after the experiment, William Bristol’s prior convictions were strengthened.
Something else you can notice is that compared to William Bristol the slightly skeptical observer still assigns a higher plausibility to Mr. Fisher’s hypothesis. This rank order between the two priors will remain no matter what the data look like. Even if Muriel Bristol made, say, 100/100 correct guesses, the slightly skeptical observer would trust less in her hypothesis than her husband. However, with increasing evidence the absolute difference between both observers will decrease more and more.
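The convergence of the two observers can be illustrated with a short Python sketch. It rests on a purely hypothetical assumption that is not part of the original story: Muriel Bristol keeps guessing 5 out of every 6 cups correctly as the experiment is extended. The function name `posterior_prob_muriel` is likewise chosen for illustration.

```python
def posterior_prob_muriel(prior_muriel, correct, trials, sr_muriel=0.8, sr_fisher=0.5):
    """Posterior probability of H_Muriel after `correct` hits in `trials` guesses."""
    wrong = trials - correct
    # Bayes factor Muriel over Fisher; the binomial coefficient cancels in the ratio
    bayes_factor = (sr_muriel**correct * (1 - sr_muriel)**wrong) / (
        sr_fisher**correct * (1 - sr_fisher)**wrong
    )
    prior_odds = prior_muriel / (1 - prior_muriel)
    posterior_odds = bayes_factor * prior_odds
    return posterior_odds / (1 + posterior_odds)


gaps = []
for trials in (0, 6, 12, 24, 48):
    correct = trials * 5 // 6  # hypothetical: hit rate stays at 5 out of 6
    p_skeptic = posterior_prob_muriel(0.4, correct, trials)  # slightly skeptical observer
    p_william = posterior_prob_muriel(0.9, correct, trials)  # William Bristol
    gaps.append(p_william - p_skeptic)
    print(f"{trials:2d} trials: skeptic {p_skeptic:.4f}, William {p_william:.4f}, "
          f"gap {gaps[-1]:.4f}")
```

Under this assumption, William's posterior probability for his wife's hypothesis stays above that of the skeptical observer at every step, while the gap between the two shrinks with each additional batch of data.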
This blog post explained the first meaning of “prior” in the context of Bayesian statistics. It can be defined as the subjective plausibility a researcher assigns to a hypothesis compared to another hypothesis before seeing the data. As illustrated in the tea-tasting example, these prior beliefs are updated with upcoming evidence in the research process. In the next blog post, we will explain a second meaning of “priors”: The quantification of uncertainty about model parameters.
We want to thank Eric-Jan Wagenmakers for helpful comments on a previous version of the post.
*As a note: Both meanings in fact can be unified, but for didactic purposes we think it makes sense to keep them distinct as a start.
Dienes, Z. (2011). Bayesian versus orthodox statistics: Which side are you on? Perspectives on Psychological Science, 6(3), 274-290. http://doi.org/10.1177/1745691611406920
Dienes, Z. (2016). How Bayes factors change scientific practice. Journal of Mathematical Psychology, 72, 78-89. http://doi.org/10.1016/j.jmp.2015.10.003
Lindley, D. V. (1993). The analysis of experimental data: The appreciation of tea and wine. Teaching Statistics, 15(1), 22-25. http://dx.doi.org/10.1111/j.1467-9639.1993.tb00252.x
Rouder, J. N., Morey, R. D., Verhagen, J., Province, J. M., & Wagenmakers, E. J. (2016a). Is there a free lunch in inference? Topics in Cognitive Science, 8, 520–547. http://doi.org/10.1111/tops.12214
Rouder, J. N., Morey, R. D., & Wagenmakers, E. J. (2016b). The Interplay between Subjectivity, Statistical Practice, and Psychological Science. Collabra, 2(1), 6–12. http://doi.org/10.1525/collabra.28
I’m honoured that the Berkeley Initiative for Transparency in the Social Sciences (BITSS) chose me for one of the 2016 Leamer-Rosenthal Prizes for Open Social Science! This award comes with a prize of $10,000 and “recognizes important contributions by individuals in the open science movement”. For my contributions to Open Science, see this website. For more details about the prize, which is generously donated by the John Templeton Foundation, see here.
BITSS has become one of the central hubs for the global open science movement and does a great job by providing open educational resources (e.g., “Tools and Resources for Data Curation”, or “How to Write a Pre-Analysis Plan”), grants, running the open science catalysts program, and hosting their annual meeting in San Francisco.
I am optimistically looking into a future of credible, reproducible, and transparent research. Stay tuned for some news from our work here at the department’s Open Science Committee at LMU Munich!
From 17th to 22nd September, the 50th anniversary congress of the German psychological association takes place in Leipzig. At previous conferences in Germany in the last two or three years, the topic of the credibility crisis and research transparency has sometimes been covered, sometimes completely ignored.
Therefore I am quite happy that this topic now has a really prominent place at the current conference. Here’s a list of some talks and events focusing on Open Science, research transparency, and what a future science could look like – see you there!
From the abstract:
In this symposium we discuss issues related to reproducibility and trust in psychological science. In the first talk, Jelte Wicherts will present some empirical results from meta-science that perhaps lower the trust in psychological science. Next, Coosje Veldkamp will discuss results bearing on actual public trust in psychological science and in psychologists from an international perspective. After that, Felix Schönbrodt and Chris Chambers will present innovations that could strengthen reproducibility in psychology. Felix Schönbrodt will present Sequential Bayes Factors as a novel method to collect and analyze psychological data and Chris Chambers will discuss Registered Reports as a means to prevent p-hacking and publication bias. We end with a general discussion.
For details on the talks, see here.
The currency of science is publishing. Producing novel, positive, and clean results maximizes the likelihood of publishing success because those are the best kind of results. There are multiple ways to produce such results: (1) be a genius, (2) be lucky, (3) be patient, or (4) employ flexible analytic and selective reporting practices to manufacture beauty. In a competitive marketplace with minimal accountability, it is hard to resist (4). But, there is a way. With results, beauty is contingent on what is known about their origin. With methodology, if it looks beautiful, it is beautiful. The only way to be rewarded for something other than the results is to make transparent how they were obtained. With openness, I won’t stop aiming for beautiful papers, but when I get them, it will be clear that I earned them.
Moderation: Manfred Schmitt
Discussants: Manfred Schmitt, Andrea Abele-Brehm, Klaus Fiedler, Kai Jonas, Brian Nosek, Felix Schönbrodt, Rolf Ulrich, Jelte Wicherts
When replicated, many findings seem to either diminish in magnitude or to disappear altogether, as, for instance, recently shown in the Reproducibility Project: Psychology. Several reasons for false-positive results in psychology have been identified (e.g., p-hacking, selective reporting, underpowered studies) and call for reforms across the whole range of academic practices. These range from (1) new journal policies promoting an open research culture to (2) hiring, tenure and funding criteria that reward credibility and replicability rather than sexiness and quantity to (3) actions for increasing transparent and open research practices within and across individual labs. Following Brian Nosek’s (Center for Open Science) keynote, titled “Addressing the Reproducibility of Psychological Science”, this panel discussion aims to explore the various ways in which our field may take advantage of the current debate. That is, the focus of the discussion will be on effective ways of improving the quality of psychological research in the future. Seven invited discussants provide insights into different current activities aimed at improving scientific practice and will discuss their potential. The audience will be invited to contribute to the discussion.
I will represent the German association’s new guidelines for data management. They will be published soon, but here’s the gist: Open by default (raw data are an essential part of a publication); exceptions should be justified. Furthermore, we define norms for data reuse. Stay tuned on this blog for more details!
[…] In any case, a reanalysis of the data must result in similar or identical results.
[…] In this talk, we present a method that is well-suited for writing reproducible academic papers. This method is a combination of Latex, R, Knitr, Git, and Pandoc. These software tools are robust, well established and not more than reasonable complex. Additional approaches, such as using word processors (MS Word), Markdown, or online collaborative writing tools (Authorea) are presented briefly. The presentation is based on a practical software demonstration. A Github repository for easy reproducibility is provided.
These are not all sessions on the topic – go to http://www.dgpskongress.de/frontend/index.php# and search for “ASSURING THE QUALITY OF PSYCHOLOGICAL RESEARCH” to see all sessions associated with this topic. Furthermore, the CMS of the congress does not allow direct linking to the sessions, so you have to search for the sessions yourself.
Want to meet me at the conference? Write me an email, or send me a PM on Twitter.