Law as Data pp. 73–79
DOI: 10.37911/9781947864085.04
4. Prediction Before Inference
Author: Allen Riddell, Indiana University Bloomington
Excerpt
Competing probabilistic models of past events can always be evaluated in terms of how well they predict (“retrodict”) events using a measure of out-of-sample predictive accuracy. Are there settings where it is worthwhile to devote time and energy to developing models designed specifically to identify causal effects from observational data? Liters of ink have been spilled debating this question. In the common case of searching for credible probabilistic narratives of patterns in observational data—what Gelman (2011) labels “reverse causal inference”—insisting on formal models of causal inference is unhelpful. (Those designing and conducting field experiments, by contrast, should concern themselves with causal inference.) Models which aim to describe associations or make predictions without aiming explicitly at causal inference are often useful. Even when they are not practically useful, evaluating competing models of past events in terms of predictive performance is frequently valuable work and, in the final accounting, essential. The spectre of human and machine error, as well as questionable research practices (e.g., p-hacking, HARKing, and publication bias), requires that all analyses, including those which claim to have made causal inferences, be evaluated using measures of usefulness and reliability which look beyond the immediate formal properties of models.
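To make the notion of out-of-sample predictive accuracy concrete, the sketch below compares two hypothetical probabilistic models of the same observed outcomes by the average log density they assign to held-out observations under k-fold cross-validation, in the spirit of Vehtari, Gelman, and Gabry (2017). The simulated data, the normal and Student-t models, and the helper heldout_log_score are illustrative assumptions, not material from the chapter.

```python
# Minimal sketch: comparing two competing probabilistic models of past events
# by out-of-sample predictive accuracy (mean held-out log predictive density).
# The data, models, and helper below are illustrative assumptions.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
y = rng.standard_t(df=3, size=200)  # hypothetical observed outcomes

def heldout_log_score(y, fit, logpdf, k=10):
    """Mean log density assigned to held-out observations under k-fold cross-validation."""
    folds = np.array_split(np.arange(len(y)), k)  # same folds for every model
    scores = []
    for test_idx in folds:
        train_idx = np.setdiff1d(np.arange(len(y)), test_idx)
        params = fit(y[train_idx])                         # fit on training folds only
        scores.append(logpdf(y[test_idx], params).mean())  # score the held-out fold
    return float(np.mean(scores))

# Model A: normal distribution, maximum-likelihood fit
score_a = heldout_log_score(
    y,
    fit=lambda tr: (tr.mean(), tr.std()),
    logpdf=lambda te, p: stats.norm.logpdf(te, loc=p[0], scale=p[1]),
)

# Model B: Student-t distribution, maximum-likelihood fit
score_b = heldout_log_score(
    y,
    fit=lambda tr: stats.t.fit(tr),
    logpdf=lambda te, p: stats.t.logpdf(te, *p),
)

print(f"normal    mean held-out log density: {score_a:.3f}")
print(f"Student-t mean held-out log density: {score_b:.3f}")
# The model assigning higher probability to held-out events "retrodicts" better.
```

On heavy-tailed data like these, the Student-t model would typically score higher; the point is only that the comparison requires no claims about causal structure.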
The comments presented here arise from an interest in learning from observations of past events that can be described quantitatively and used in probabilistic models. Because credible narratives of past events often complement each other—someone else’s findings may help members of another intellectual community—everyone benefits when researchers in the social sciences and humanities use data-intensive methods that yield inferences that stand up to scrutiny. Narratives which prove, in retrospect, to be unreliable risk diverting time and resources away from more productive pursuits. Recent experience has shown that it is not a foregone conclusion that one learns much from an arbitrary piece of published research (Ioannidis 2005; Angrist and Pischke 2010; Camerer et al. 2016; Munafò et al. 2017; Ioannidis, Stanley, and Doucouliagos 2017). In fact, the false discovery rate in social science research featured in prestige venues such as Nature and Science appears to be higher than 30% (Camerer et al. 2018).
In this context the question of what sorts of methods should be allowed, encouraged, or discouraged is one of general interest. So, too, is the question of whether or not the use of certain classes of models yields reliable characterizations of past events.
Bibliography
Angrist, J. D., and J.-S. Pischke. 2010. “The Credibility Revolution in Empirical Economics: How Better Research Design is Taking the Con out of Econometrics.” Journal of Economic Perspectives 24 (2): 3–30.
Buntine, W. 2009. “Estimating Likelihoods for Topic Models.” In Advances in Machine Learning: First Asian Conference on Machine Learning, 51–64. Berlin, Germany: Springer.
Camerer, C. F., A. Dreber, E. Forsell, T.-H. Ho, J. Huber, M. Johannesson, M. Kirchler, J. Almenberg, A. Altmejd, and T. Chan. 2016. “Evaluating Replicability of Laboratory Experiments in Economics.” Science 351 (6280): 1433–1436.
Camerer, C. F., A. Dreber, F. Holzmeister, T.-H. Ho, J. Huber, M. Johannesson, M. Kirchler, et al. 2018. “Evaluating the Replicability of Social Science Experiments in Nature and Science between 2010 and 2015.” Nature Human Behaviour 2 (9): 637–644.
Gelman, A. 2011. “Causality and Statistical Learning.” American Journal of Sociology 117 (3): 955–966.
Gelman, A., and G. Imbens. 2013. “Why Ask Why? Forward Causal Inference and Reverse Causal Questions.” NBER Working Paper No. 19614, National Bureau of Economic Research, Cambridge, MA. https://www.nber.org/papers/w19614.
Ioannidis, J. P. A. 2005. “Why Most Published Research Findings Are False.” PLoS Medicine 2 (8): e124.
Ioannidis, J., T. D. Stanley, and H. Doucouliagos. 2017. “The Power of Bias in Economics Research.” Economic Journal 127 (605): F236–F265.
Munafò, M. R., B. A. Nosek, D. V. M. Bishop, K. S. Button, C. D. Chambers, N. P. du Sert, U. Simonsohn, E.-J. Wagenmakers, J. J. Ware, and J. P. A. Ioannidis. 2017. “A Manifesto for Reproducible Science.” Nature Human Behaviour 1 (1): 0021.
Press, S. J. 2009. Subjective and Objective Bayesian Statistics: Principles, Models, and Applications. Hoboken, NJ: John Wiley & Sons.
Rubin, D. B. 1974. “Estimating Causal Effects of Treatments in Randomized and Nonrandomized Studies.” Journal of Educational Psychology 66 (5): 688–701.
Silberzahn, R., E. L. Uhlmann, D. P. Martin, P. Anselmi, F. Aust, E. Awtrey, S. Bahník, et al. 2018. “Many Analysts, One Data Set: Making Transparent How Variations in Analytic Choices Affect Results.” Advances in Methods and Practices in Psychological Science 1 (3): 337–356.
Splawa-Neyman, J. 1990. “On the Application of Probability Theory to Agricultural Experiments. Essay on Principles. Section 9.” Translated by D. M. Dabrowska and T. P. Speed. Statistical Science 5 (4): 465–472.
Vehtari, A., A. Gelman, and J. Gabry. 2017. “Practical Bayesian Model Evaluation Using Leave-One-out Cross-Validation and WAIC.” Statistics and Computing 27 (5): 1413–1432.
Young, A. 2017. “Consistency without Inference: Instrumental Variables in Practical Application.” Working paper, London School of Economics, London, UK. https://personal.lse.ac.uk/YoungA/ConsistencywithoutInference.pdf.