ABSTRACT
In this study, we assess ChatGPT-4’s accuracy, appropriateness of support, and consistency by applying it to a sizable number of case questions from the Deloitte Trueblood Case Study series. We contribute to the literature in three ways. First, we evaluate ChatGPT-4’s ability to provide open-ended responses to realistic case study questions. Second, we ask ChatGPT-4 not only to answer the case questions but also to provide appropriate support from the relevant FASB standard. Finally, we run the case questions through ChatGPT-4 three times, thereby assessing the variation in ChatGPT-4’s responses. Our results show that ChatGPT-4’s ability to answer and support the case questions accurately and consistently falls short of the levels accounting professionals would expect. This performance indicates that ChatGPT-4 may not be ready for more advanced accounting applications, or even to be relied upon for supplementary support by an accountant.