We present an approach for the joint extraction of entities and relations in the context of opinion recognition and analysis. We identify two types of opinion-related entities -expressions of opinions and sources of opinions -along with the linking relation that exists between them. Inspired by , we employ an integer linear programming approach to solve the joint opinion recognition task, and show that global, constraint-based inference can significantly boost the performance of both relation extraction and the extraction of opinion-related entities. Performance further improves when a semantic role labeling system is incorporated. The resulting system achieves F-measures of 79 and 69 for entity and relation extraction, respectively, improving substantially over prior results in the area.
In this paper, we take a detailed look at the performance of components of an idealized question answering system on two different tasks: the TREC Question Answering task and a set of reading comprehension exams. We carry out three types of analysis: inherent properties of the data, feature analysis, and performance bounds. Based on these analyses we explain some of the performance results of the current generation of Q/A systems and make predictions on future work. In particular, we present four findings: (1) Q/A system performance is correlated with answer repetition; (2) relative overlap scores are more effective than absolute overlap scores; (3) equivalence classes on scoring functions can be used to quantify performance bounds; and (4) perfect answer typing still leaves a great deal of ambiguity for a Q/A system because sentences often contain several items of the same type.
Creating and maintaining a platform for reliably producing and deploying machine learning models requires careful orchestration of many components-a learner for generating models based on training data, modules for analyzing and validating both data as well as models, and finally infrastructure for serving models in production. This becomes particularly challenging when data changes over time and fresh models need to be produced continuously. Unfortunately, such orchestration is often done ad hoc using glue code and custom scripts developed by individual teams for specific use cases, leading to duplicated effort and fragile systems with high technical debt.We present TensorFlow Extended (TFX), a TensorFlowbased general-purpose machine learning platform implemented at Google. By integrating the aforementioned components into one platform, we were able to standardize the components, simplify the platform configuration, and reduce the time to production from the order of months to weeks, while providing platform stability that minimizes disruptions.We present the case study of one deployment of TFX in the Google Play app store, where the machine learning models are refreshed continuously as new data arrive. Deploying TFX led to reduced custom code, faster experiment cycles, and a 2% increase in app installs resulting from improved data and model analysis.
Opinions are ubiquitous in text, and readers of online text—from consumers to sports fans to news addicts to governments—can benefit from automatic methods that synthesize useful opinion-oriented information from the sea of data. In this chapter on opinion mining and sentiment analysis, we introduce an idealized, end-to-end opinion analysis system and describe its components. We present methods for classifying documents and text passages according to their sentiment as well as methods that perform more fine-grained extraction of opinion expressions, their holders and their targets. We also address supplementary tasks of opinion lexicon construction, opinion summarization, opinion-oriented question answering, multi-lingual sentiment analysis and compositional approaches to phrase-level sentiment analysis.
This paper describes initial work on Deep Read, an automated reading comprehension system that accepts arbitrary text input (a story) and answers questions about it. We have acquired a corpus of 60 development and 60 test stories of 3 rd to 6 th grade material; each story is followed by short-answer questions (an answer key was also provided). We used these to construct and evaluate a baseline system that uses pattern matching (bag-of-words) techniques augmented with additional automated linguistic processing (stemming, name identification, semantic class identification, and pronoun resolution). This simple system retrieves the sentence containing the answer 30-40% of the time.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.