The immense volume of web usage data that exists on web servers contains potentially valuable information about the behavior of website visitors. This information can be exploited in various ways, such as enhancing the effectiveness of websites or developing targeted web marketing campaigns. In this paper we focus on applying association rules as a data mining technique to extract potentially useful knowledge from web usage data. We conducted a comprehensive analysis of web usage association rules mined from the website of an educational institution. Our experiments confirm that, prior to pruning, the set of generated association rules contained too many non-interesting rules, which made it very difficult for a user to find and exploit useful information. Many of these rules are a simple consequence of the high correlation between web pages due to their interconnectedness through the website link structure. We proposed and applied a set of basic pruning schemes to reduce the rule set size and to remove a significant number of non-interesting rules. This pruning method reduced our experimental rule set to less than a third of its original size, making it much simpler to browse for truly interesting rules. In the resulting experimental rule set, 41% of the rules were truly interesting, meaning that they can prompt a webmaster to take actions that potentially enhance the website and improve its browsing experience. The analysis of association rules in our case study confirmed the hypothesis that discovering interesting and potentially useful association rules in web usage data does not have to be a time-consuming task and can lead to actions that increase the website's effectiveness.
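The idea of mining association rules from sessions and then pruning rules that merely mirror the link structure can be illustrated with a minimal sketch. The abstract does not specify the pruning schemes, so the criterion used here (dropping rules between directly hyperlinked pages) is an assumption, as are the function names, thresholds, page paths, and toy sessions.

```python
from collections import Counter
from itertools import combinations


def mine_pair_rules(sessions, min_support=0.3, min_confidence=0.6):
    """Return sorted rules (lhs, rhs, support, confidence) over page pairs."""
    n = len(sessions)
    page_counts = Counter()
    pair_counts = Counter()
    for session in sessions:
        pages = set(session)  # count each page once per session
        page_counts.update(pages)
        pair_counts.update(frozenset(p) for p in combinations(sorted(pages), 2))

    rules = []
    for pair, count in pair_counts.items():
        support = count / n  # share of sessions containing both pages
        if support < min_support:
            continue
        a, b = sorted(pair)
        for lhs, rhs in ((a, b), (b, a)):
            confidence = count / page_counts[lhs]  # estimate of P(rhs | lhs)
            if confidence >= min_confidence:
                rules.append((lhs, rhs, support, confidence))
    return sorted(rules)


def prune_linked(rules, links):
    """Assumed pruning criterion: drop rules whose pages are directly linked."""
    return [
        r for r in rules
        if (r[0], r[1]) not in links and (r[1], r[0]) not in links
    ]


# Toy sessions and link structure, invented for illustration only.
SESSIONS = [
    ["/home", "/courses", "/exams"],
    ["/home", "/courses"],
    ["/home", "/courses", "/library"],
    ["/exams", "/library"],
    ["/home", "/exams", "/library"],
]
LINKS = {("/home", "/courses"), ("/home", "/exams")}

rules = mine_pair_rules(SESSIONS)
interesting = prune_linked(rules, LINKS)
for lhs, rhs, support, confidence in interesting:
    print(f"{lhs} -> {rhs}  support={support:.2f}  confidence={confidence:.2f}")
```

Restricting rules to page pairs keeps the sketch short; a full Apriori-style miner would also consider larger itemsets.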
Knowing what attracts or deters tourists, which products to offer them, and which to pay special attention to is crucial for good economic results. Such knowledge can be obtained by analyzing the online comments and reviews that tourists leave on travel websites (such as Booking, TripAdvisor, Trivago, etc.). This paper describes the value that information about the opinions and emotions hidden in online reviews has for the managers who receive it, especially knowledge of users' (dis)satisfaction with certain aspects of the tourist offer. Knowledge uncovered from online reviews provides a chance to take advantage of strong points and to correct shortcomings through timely corrective measures and actions. Contemporary approaches and methods for analyzing online reviews, and the development opportunities they provide in the tourism industry, are described through a case study conducted over a subset of 20,491 hotel reviews from TripAdvisor. We conducted sentiment analysis of the reviews with the goal of building an automated model that successfully distinguishes positive from negative reviews. The logistic regression classifier performed best, correctly classifying 90% of positive reviews and 83% of negative reviews. We also illustrate how association rules can help management uncover relationships between the concepts discussed in negative and positive reviews.
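The two accuracy figures above read as per-class rates, that is, the share of reviews of each class that the classifier labeled correctly. A minimal sketch of that calculation follows, assuming this interpretation; the function name and the toy labels are invented for illustration and are not the paper's data.

```python
def per_class_recall(y_true, y_pred):
    """Return, for each true label, the fraction of its items predicted correctly."""
    totals, hits = {}, {}
    for truth, predicted in zip(y_true, y_pred):
        totals[truth] = totals.get(truth, 0) + 1
        if truth == predicted:
            hits[truth] = hits.get(truth, 0) + 1
    return {label: hits.get(label, 0) / totals[label] for label in totals}


# Toy labels, invented for illustration only.
y_true = ["pos"] * 5 + ["neg"] * 4
y_pred = ["pos", "pos", "pos", "pos", "neg", "neg", "neg", "neg", "pos"]

recall = per_class_recall(y_true, y_pred)
print(recall)
```

Reporting the rate per class rather than a single overall accuracy makes it visible when a classifier handles one class noticeably better than the other.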