ABSTRACT
This paper presents employee ratings and review data from Glassdoor, together with the R code used to collect, clean, and organize the data. We collect three types of information for each Glassdoor review: review metrics, review content, and reviewer information. We also calculate several commonly used textual metrics, such as sentiment, readability, and the number of uncertainty words. The datasets include the identifiers needed to link them to other financial data sources. All variables and metrics are provided at the review level, which enables researchers to aggregate the data at different levels and from different angles. As a demonstrative example, we use a simple word list to measure how often employees mention COVID-19 in review comments. The R code, provided as an RStudio project, is self-contained and can be modified and applied to other data sources of interest.