Data integrity is a necessary condition for the validity of findings in archival research. When data are collected by a large commercial provider, such as Standard & Poor's, data integrity is often taken for granted. Yet any accounting researcher who has worked with Compustat and other large databases understands that there are gaps in the data, despite the provider's most judicious attempts to deliver a complete, comprehensive, and error-free database.
Casey, Gao, Kirschenheiter, Li, and Pandit (2016; hereafter Casey et al.) provide a framework intended to improve data integrity when using Compustat data. The paper is appealing because it shows how an understanding of the structural properties of the source data (i.e., the accounting system) can be used to fill in idiosyncratic gaps that occur in large-scale databases. The contribution of Casey et al. is twofold. First, they provide...