Many APSA 2016 panels and discussions in the Section on Qualitative and Multimethod Research and the Political Methodology Section centered on the Data Access and Research Transparency (DART) Initiative (probably worth a blog post of its own). Even panels not explicitly dedicated to DART digressed into the topic, including a short exchange between Tasha Fairfield and me on a panel on causal inference in qualitative research.
Preregistered Bayesian process tracing?
We discussed whether there is a need for preregistration if one does Bayesian process tracing. Admittedly, this topic seems far removed from the process tracing research actually being done at the moment. However, a couple of panels at this and previous APSA meetings have shown that Bayesian process tracing is one of the emerging topics, if not the emerging topic, currently taught by different people at different venues. It therefore seems a reasonable guess that we will see it in published process tracing studies over the next few years.
Compared to Bayesian process tracing, preregistration seems an even more remote topic. Again, however, there is a “however”: preregistration is an important issue in quantitative research across different fields (sociology, psychology, economics, political science), and there are ongoing discussions about whether it is also needed in, and applicable to, qualitative research.
Needed or not?
On the “whether” side, Fairfield and I disagreed in principle on whether we need preregistration for Bayesian process tracing. I say “yes” because several research design decisions could bias your study toward a positive finding. One obvious point concerns the priors of the hypotheses you want to test. If you gather mostly weak evidence in process tracing, meaning the likelihood ratio is not much different from 1 for most pieces of evidence, then you might be tempted to fix the prior of your favorite hypothesis a little higher than you initially intended, because this gives you a higher posterior.
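The arithmetic behind this temptation is easy to see in the odds form of Bayes’ rule: the posterior odds are the prior odds multiplied by the likelihood ratios, so when the ratios are all close to 1, the prior does almost all the work. A minimal sketch in Python, with invented numbers:

```python
def posterior(prior, likelihood_ratios):
    """Update a prior probability with a sequence of likelihood ratios
    (P(evidence | hypothesis) / P(evidence | rival hypothesis)),
    using the odds form of Bayes' rule."""
    odds = prior / (1 - prior)
    for lr in likelihood_ratios:
        odds *= lr
    return odds / (1 + odds)

# Five pieces of weak evidence: likelihood ratios barely above 1.
weak_evidence = [1.2, 1.1, 1.05, 1.15, 1.1]

# The posterior barely moves away from the prior...
print(posterior(0.3, weak_evidence))
# ...so nudging the prior upward is what really drives the posterior up.
print(posterior(0.5, weak_evidence))
```

With these made-up numbers, raising the prior from 0.3 to 0.5 raises the posterior by more than the five pieces of weak evidence do, which is exactly the kind of move a preregistered prior would take off the table.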
Another point is the decision about when to stop collecting evidence. You might plan to stay in an archive for, say, a week, but then decide to go home after three days because you have found a plethora of strong evidence in favor of your hypothesis and do not want to waste resources on four more days. This might seem like a reasonable decision, but it differs from your initial plan. For me, this is worth making transparent because your practical stopping rule becomes conditional on the quality of the evidence, and you leave behind many sources that you had planned to analyze before seeing what turned up in the first three days. (Ask yourself whether you would also go home if you mostly found strong evidence against your hypothesis during the first three days. If you say “no”, you see what the issue is here.)
Fairfield disagreed with me based on her belief in the numerous advantages of Bayesian inference (some of which I share, some of which I do not). I am afraid I may not be able to reproduce her arguments exactly with a time lag of about four weeks, but her arguments against preregistering priors seemed twofold. First, priors “wash out” as you accumulate evidence, which I do not like as an argument because it downplays the importance of the priors (and is not necessarily true, I would say). Second, Bayesian inference is a kind of continuously running evidence-processing machine; it breaks down the distinction between induction and deduction and renders it unnecessary to fix priors at a specific level at some point in your research. On the question of the use of sources and the practical stopping rule, Fairfield’s view is that you simply process the evidence that you have, and the posteriors then are what they are.
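Whether priors really “wash out” depends on how strong and how plentiful the evidence is. A small sketch with invented numbers makes the point: with strong evidence, researchers starting from very different priors converge quickly, but with the kind of weak evidence process tracers often have, the gap between their posteriors persists.

```python
def posterior(prior, likelihood_ratio, n_pieces):
    """Posterior probability after n_pieces of evidence that each
    carry the same likelihood ratio (odds form of Bayes' rule)."""
    odds = prior / (1 - prior) * likelihood_ratio ** n_pieces
    return odds / (1 + odds)

# Strong evidence (likelihood ratio 5): a skeptic (prior 0.1) and a
# believer (prior 0.6) end up close together after five observations.
print(posterior(0.1, 5, 5), posterior(0.6, 5, 5))

# Weak evidence (likelihood ratio 1.1): after the same five
# observations, the initial disagreement is still clearly visible.
print(posterior(0.1, 1.1, 5), posterior(0.6, 1.1, 5))
```

In the strong-evidence case, both posteriors exceed 0.99; in the weak-evidence case they remain roughly 0.15 and 0.71. So the “washing out” argument carries weight only when the evidence is decisive.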
Practice vs principles
It became clear to me only after the panel that we were arguing on two different levels. I was arguing for preregistration because researchers are fallible; Fairfield argued against preregistration based on her belief in the inherent benefits of Bayesian inference and process tracing. Regardless of how valid my two specific points are, it holds that, in principle, no method requires preregistration, because the misuse of a method is not inherent to it. p-hacking is not a problem of frequentist statistics but of its misuse by researchers (you might believe that p-values do not answer meaningful research questions, but this is unrelated to preregistration).
The same holds true for Bayesian process tracing. In principle, there are no problems and the method works flawlessly. In practice, it is done by researchers seeking a PhD or tenure, wanting to get published, or not understanding why the decisions they make bias their results in one direction or the other. This includes the formulation of priors, the processing of sources, and the interpretation of evidence.
Odysseus and the sirens
Imagine you are Odysseus, on your way to do proper research but subject to the sirens’ call of publishing. You are tempted to give in, and it does not matter what boat you are on, because any boat can be steered toward the sirens. You can avoid this by tying yourself to the mast and preregistering your research design before you set sail.
To some degree, the Odysseus analogy fails because preregistration does not mean that you truly restrain yourself. You simply write down your research design and plan before you do the empirical analysis; there might be good reasons to deviate from the preregistered decisions, and you can do so, but it then becomes transparent what you intended to do, what you actually did, and why. In terms of Odysseus, this means you could still steer your boat toward the sirens, but at least you cannot claim that you wanted to go there all along.