Purpose
– The purpose of this paper is to assess the reliability of numerical ratings of hotels calculated by three sentiment analysis algorithms.
Design/methodology/approach
– More than one million reviews and numerical ratings of hotels in seven cities in four countries were extracted from the TripAdvisor website. Reviews were classified as positive or negative using three sentiment analysis tools. The percentage of positive reviews was used to predict numerical ratings, which were then compared with actual ratings.
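The prediction step described above can be illustrated with a minimal sketch. It assumes a simple linear mapping from the percentage of positive reviews onto a 1-5 scale and uses Pearson correlation for the comparison. The abstract does not state the mapping or the agreement measure the authors used, so both are assumptions, and the hotel names and data are hypothetical.

```python
from math import sqrt


def percent_positive(labels):
    """Share of reviews labelled 'positive', as a percentage (0-100)."""
    if not labels:
        raise ValueError("need at least one review label")
    positives = sum(1 for label in labels if label == "positive")
    return 100.0 * positives / len(labels)


def predicted_rating(pct_positive, low=1.0, high=5.0):
    """Assumed linear mapping of percent positive onto a low..high rating scale."""
    return low + (high - low) * pct_positive / 100.0


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Hypothetical data: (sentiment labels from a classifier, actual numerical rating)
hotels = {
    "hotel_a": (["positive", "positive", "positive", "negative"], 4.0),
    "hotel_b": (["positive", "negative", "negative", "negative"], 2.0),
    "hotel_c": (["positive", "positive", "negative", "negative"], 3.0),
}

predicted = [predicted_rating(percent_positive(labels)) for labels, _ in hotels.values()]
actual = [rating for _, rating in hotels.values()]
agreement = pearson(predicted, actual)
```

With real data, the labels would come from whichever sentiment analysis tool is being evaluated, and `agreement` would be computed separately for each tool and city.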
Findings
– All tools classified reviews as positive or negative in a way that correlated positively with numerical ratings. More complex algorithms performed better, yet predicted ratings showed reasonable agreement with actual ratings for most cities. Predictions for hotels were less reliable when based on fewer than 50-60 percent of the available reviews.
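One way to probe how reliability depends on the share of reviews used is to draw random subsamples of a hotel's reviews and measure how far the percentage of positive reviews drifts from the full-set value. The sketch below is only an illustration under that assumption. The abstract does not describe the authors' procedure, and the function names, the number of trials, and the data are hypothetical.

```python
import random


def percent_positive(labels):
    """Share of reviews labelled 'positive', as a percentage (0-100)."""
    positives = sum(1 for label in labels if label == "positive")
    return 100.0 * positives / len(labels)


def subsample_error(labels, fraction, trials=200, seed=0):
    """Mean absolute gap in percent positive between random subsamples and the full set."""
    rng = random.Random(seed)  # fixed seed so repeated runs give the same result
    full = percent_positive(labels)
    k = max(1, round(fraction * len(labels)))  # number of reviews to draw
    total = 0.0
    for _ in range(trials):
        sample = rng.sample(labels, k)  # sampling without replacement
        total += abs(percent_positive(sample) - full)
    return total / trials


# Hypothetical hotel with 100 classified reviews, 70 of them positive
labels = ["positive"] * 70 + ["negative"] * 30

error_full = subsample_error(labels, 1.0)   # uses every review
error_small = subsample_error(labels, 0.1)  # uses 10 percent of the reviews
```

Running `subsample_error` over a range of fractions for each hotel would show where the error starts to grow, which is the kind of threshold the findings report.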
Practical implications
– These results indicate that sentiment analysis can be used to transform unstructured qualitative data on user opinion into quantitative ratings. Current tools may be useful for summarizing the opinions expressed in user reviews of products and services on websites, such as traveler forums, that do not require users to post numerical ratings. Such summaries may be valuable not only to potential users but also to service and product providers, and they offer validation and benchmarking for future improvement of opinion mining and prediction techniques.
Originality/value
– This work assesses the correlation between sentiment analysis of hotel reviews and the hotels' actual ratings. The authors also evaluate the reliability of sentiment analysis results calculated by three different algorithms.
This book is a product of the project Estrategias empresariales de resiliencia tecnológica, creatividad remota e innovación como respuesta a la contingencia sanitaria por Covid-19 (Business strategies of technological resilience, remote creativity, and innovation in response to the Covid-19 health contingency). It presents a series of studies coordinated by researchers from different regions of Mexico, who sought to identify business resilience strategies suited to that situation. Notable among these is the use of information and communication technologies as a means of mitigating the effects of the worldwide lockdown caused by the pandemic.