Automated discovery and analysis of customer opinions on the web holds considerable promise for present-day market research and customer relationship management. Opinion mining seeks ways to automatically analyse subjectivity expressed in natural language text. Previous research on the topic has shown that the overall subjectivity expressed in a document, such as a customer review, can be assessed with accuracy sufficient for real-world applications. In this paper, we address the challenge of identifying customer opinions expressed towards specific features of a product, such as the service quality and location of a hotel. The paper proposes and investigates a method to recognize the relationships between subjective expressions and references to features of a product. While the method has been evaluated on customer hotel reviews, it could also be applied to many tasks where concrete statements need to be extracted from documents on heterogeneous topics, such as posts in forums, comments on blogs, or utterances in a chat room.
The identification of cognates has attracted the attention of researchers working in the area of Natural Language Processing, but the identification of false friends is still an under-researched area. This paper proposes novel methods for the automatic identification of both cognates and false friends from comparable bilingual corpora. The methods do not depend on parallel texts; they use only monolingual corpora and a bilingual dictionary, which is needed to map co-occurrence data across languages. In addition, the methods do not require that the newly discovered cognates or false friends are present in the dictionary and hence are capable of operating on out-of-vocabulary expressions. These methods are evaluated on English, French, German and Spanish corpora in order to identify English-French, English-German, English-Spanish and French-Spanish pairs of cognates or false friends. The experiments were performed in two settings: (i) assuming 'ideal' extraction of cognates and false friends from plain-text corpora, i.e. when the evaluation data contains only cognates and false friends, and (ii) a real-world extraction scenario where cognates and false friends must first be identified among words found in two comparable corpora in different languages. The evaluation results show that the developed methods identify cognates and false friends with very satisfactory recall and precision, and that methods incorporating background semantic knowledge, in addition to co-occurrence data obtained from the corpora, deliver the best results.
R. Mitkov · V. Pekar · A. Mulloni