Abstract
Predictive modeling iteratively tries combinations and transformations of a set of variables to generate a decision rule that predicts outcomes for new observations. Although accounting researchers have shown interest in predictive modeling, applied guidance on the topic for accounting settings is lacking. This gap has become more salient with the increasing availability of machine learning models, which use unfamiliar terminology, are estimated using algorithms, and produce different outputs than models used for causal inference. To address this gap, we provide an overview of how to predict discrete outcomes with logistic regression and with the machine learning models used in recent studies. We also provide guidance and a comprehensive example (predicting investigations by the U.S. Securities and Exchange Commission) that illustrates the elements of the prediction process, highlighting the importance of out-of-sample accuracy and the distinctive aspects of presenting a prediction model’s results.