Automated discovery and analysis of customer opinions on the web holds considerable promise for present-day practices of market research and customer relationship management. Opinion mining seeks methods for automatically analysing subjectivity expressed in natural language text. Previous research on the topic has shown that the overall subjectivity expressed in a document, such as a customer review, can be assessed with accuracy sufficient for real-world applications. In this paper, we address the challenge of identifying customer opinions expressed towards specific features of a product, such as the service quality and location of a hotel. The paper proposes and investigates a method to recognize the relationships between subjective expressions and references to features of a product. While the method has been evaluated on customer hotel reviews, it could also be applied to many tasks where concrete statements need to be extracted from documents on heterogeneous topics, such as posts in forums, comments on blogs, or utterances in a chat room.
The purpose of this study was to develop a method for automatic construction of multidocument summaries of sets of research abstracts that may be retrieved by a digital library or search engine in response to a user query. Sociology dissertation abstracts were selected as the sample domain in this study. A variable-based framework was proposed for integrating and organizing research concepts and relationships as well as research methods and contextual relations extracted from different dissertation abstracts. Based on the framework, a new summarization method was developed, which parses the discourse structure of abstracts, extracts research concepts and relationships, integrates the information across different abstracts, and organizes and presents it in a Web-based interface. The focus of this article is on the user evaluation that was performed to assess the overall quality and usefulness of the summaries. Two types of variable-based summaries generated using the summarization method, with or without the use of a taxonomy, were compared against two sentence-based summaries: one that lists only the research-objective sentences extracted from each abstract, and one generated using the MEAD system, which extracts important sentences. The evaluation results indicate that the majority of sociological researchers (70%) and general users (64%) preferred the variable-based summaries generated with the use of the taxonomy.
This paper describes a new concept-based multi-document summarization system that employs discourse parsing, information extraction and information integration. Dissertation abstracts in the field of sociology were selected as sample documents for this study. The summarization process includes four major steps: (1) parsing dissertation abstracts into five standard sections; (2) extracting research concepts (often operationalized as research variables) and their relationships, the research methods used and the contextual relations from specific sections of the text; (3) integrating similar concepts and relationships across different abstracts; and (4) combining and organizing the different kinds of information using a variable-based framework, and presenting them in an interactive web-based interface. The accuracy of each summarization step was evaluated by comparing the system-generated output against human coding. The user evaluation carried out in the study indicated that the majority of subjects (70%) preferred the concept-based summaries generated using the system to the sentence-based summaries generated using traditional sentence extraction techniques.
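The four steps listed in the preceding abstract can be illustrated with a toy pipeline. This is only a rough sketch under stated assumptions, not the authors' system: the section labels, the cue words, the "effect of X on Y" pattern, and the plural-stripping normalization are all hypothetical stand-ins for the discourse parsing, information extraction, and concept integration components the abstract mentions, and the web interface of step 4 is reduced to a plain dictionary.

```python
import re
from collections import defaultdict

# Assumed cue words per section; the abstract mentions five standard sections
# but does not name them, so these labels are illustrative only.
SECTION_CUES = {
    "objectives": ("purpose", "aim", "examine", "investigate"),
    "methods": ("survey", "interview", "sample", "regression"),
    "results": ("found", "results", "indicate", "showed"),
    "conclusions": ("conclude", "implication", "suggest"),
}

# Hypothetical pattern for a relationship between two research concepts.
RELATION = re.compile(
    r"(?:effect|influence|impact) of (.+?) on (.+?)(?:[.,;]|$)", re.I
)


def parse_sections(abstract):
    """Step 1: assign each sentence to a section by cue word (default: background)."""
    sections = defaultdict(list)
    for sentence in re.split(r"(?<=[.!?])\s+", abstract.strip()):
        lowered = sentence.lower()
        label = next(
            (name for name, cues in SECTION_CUES.items()
             if any(cue in lowered for cue in cues)),
            "background",
        )
        sections[label].append(sentence)
    return sections


def extract_relations(sections):
    """Step 2: pull (concept, related concept) pairs from objective sentences."""
    pairs = []
    for sentence in sections.get("objectives", []):
        for match in RELATION.finditer(sentence):
            pairs.append((match.group(1).strip(), match.group(2).strip()))
    return pairs


def normalize(concept):
    """Crude matching key: lowercase and drop a trailing 's' from longer words."""
    return " ".join(
        word[:-1] if word.endswith("s") and len(word) > 3 else word
        for word in concept.lower().split()
    )


def summarize(abstracts):
    """Steps 3-4: merge similar concepts across abstracts, organized by variable."""
    framework = defaultdict(lambda: {"related_to": set(), "sources": set()})
    for doc_id, text in abstracts.items():
        for left, right in extract_relations(parse_sections(text)):
            entry = framework[normalize(left)]
            entry["related_to"].add(normalize(right))
            entry["sources"].add(doc_id)
    return dict(framework)


if __name__ == "__main__":
    demo = {
        "d1": "This study examines the effect of parental income on school dropout. "
              "A survey of 200 families was used.",
        "d2": "We investigate the impact of Parental Incomes on dropout rates.",
    }
    for variable, info in summarize(demo).items():
        print(variable, sorted(info["related_to"]), sorted(info["sources"]))
```

In this sketch the two made-up abstracts are grouped under a single variable because their concept strings normalize to the same key. A real system would need far more robust parsing and concept matching than substring cues and a single regular expression.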
This chapter describes text summarization and evaluation techniques that have been proposed in the literature and discusses the application of text summarization in digital libraries. First, it introduces the history of automatic text summarization and the main types of summaries. Next, it reviews approaches that have been used for single-document and multidocument summarization. Then, it describes the major evaluation approaches for assessing generated summaries. Finally, it outlines the principal trends in automatic text summarization. This chapter aims to help the reader obtain a clear overview of the text summarization field and to facilitate the application of text summarization in digital libraries.