ABSTRACT
We present a data cleaning project that uses real vendor master data from a large public university in the United States. Our main objective in developing this case was to identify the areas where students need guidance in order to apply a problem-solving approach to the project. This approach includes an initial analysis of the data and the task at hand, planning the cleaning and testing activities, executing the plan, and communicating the results in a written report. We provide a dataset with 29K records of vendor master data and an 800-record subset of the same data. The assignment has two parts, planning and the actual cleaning, each with its own deliverable. The case can be used in many different courses and completed with almost any data analytics software. We provide suggested solutions and detailed solution notes for Excel and for Alteryx Designer.