Knowing the reputations of your own and/or competitors' products is important for marketing and customer relationship management. It is, however, very costly to collect and analyze survey data manually. This paper presents a new framework for mining product reputations on the Internet. It automatically collects people's opinions about target products from Web pages, and it uses text mining techniques to obtain the reputations of those products.On the basis of human-test samples, we generate in advance syntactic and linguistic rules to determine whether any given statement is an opinion or not, as well as whether such any opinion is positive or negative in nature. We first collect statements regarding target products using a general search engine, and then, using the rules, extract opinions from among them and attach three labels to each opinion, labels indicating the positive/negative determination, the product name itself, and an numerical value expressing the degree of system confidence that the statement is, in fact, an opinion. The labeled opinions are then input into an opinion database.The mining of reputations, i.e., the finding of statistically meaningful information included in the database, is then conducted. We specify target categories using label values (such as positive opinions of product A) and perform four types of text mining: extraction of 1) characteristic words, 2) co-occurrence words, 3) typical sentences, for individual target categories, and 4) correspondence analysis among multiple target categories.Actual marketing data is used to demonstrate the validity and effectiveness of the framework, which offers a drastic reduction in the overall cost of reputation analysis over that of conventional survey approaches and supports the discovery of knowledge from the pool of opinions on the web.
Syslog monitoring technologies have recently received vast attentions in the areas of network management and network monitoring. They are used to address a wide range of important issues including network failure symptom detection and event correlation discovery. Syslogs are intrinsically dynamic in the sense that they form a time series and that their behavior may change over time. This paper proposes a new methodology of dynamic syslog mining in order to detect failure symptoms with higher confidence and to discover sequential alarm patterns among computer devices.The key ideas of dynamic syslog mining are 1) to represent syslog behavior using a mixture of Hidden Markov Models, 2) to adaptively learn the model using an on-line discounting learning algorithm in combination with dynamic selection of the optimal number of mixture components, and 3) to give anomaly scores using universal test statistics with a dynamically optimized threshold. Using real syslog data we demonstrate the validity of our methodology in the scenarios of failure symptom detection, emerging pattern identification, and correlation discovery.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.