Applying natural language processing for mining and intelligent information access to tweets (a form of microblog) is a challenging, emerging research area. Unlike carefully authored news text and other longer content, tweets pose a number of new challenges, due to their short, noisy, context-dependent, and dynamic nature. Information extraction from tweets is typically performed in a pipeline, comprising consecutive stages of language identification, tokenisation, part-of-speech tagging, named entity recognition and entity disambiguation (e.g. with respect to DBpedia). In this work, we describe a new Twitter entity disambiguation dataset, and conduct an empirical analysis of named entity recognition and disambiguation, investigating how robust a number of state-of-the-art systems are on such noisy texts, what the main sources of error are, and which problems should be further investigated to improve the state of the art.
Abstract. In this paper, we discuss a variety of issues related to opinion mining from microposts and the challenges they pose for an NLP system, along with an example application we have developed to determine political leanings from a set of pre-election tweets. While a number of sentiment analysis tools are available which summarise positive, negative and neutral tweets about a given keyword or topic, these tools generally produce poor results and operate in a fairly simplistic way, using only the presence of certain positive and negative adjectives as indicators, or simple learning techniques which do not work well on short microposts. On the other hand, intelligent tools which work well on movie and customer reviews cannot be used on microposts due to their brevity and lack of context. Our methods make use of a variety of sophisticated NLP techniques in order to extract more meaningful and higher quality opinions, and incorporate extra-linguistic contextual information.
In this paper we present recent work on GATE, a widely-used framework and graphical development environment for creating and deploying Language Engineering components and resources in a robust fashion. The GATE architecture has facilitated the development of a number of successful applications for various language processing tasks (such as Information Extraction, dialogue and summarisation), the building and annotation of corpora and the quantitative evaluations of LE applications. The focus of this paper is on recent developments in response to new challenges in Language Engineering: Semantic Web, integration with Information Retrieval and data mining, and the need for machine learning support.
Abstract. Business Intelligence (BI) requires the acquisition and aggregation of key pieces of knowledge from multiple sources in order to provide valuable information to customers or feed statistical BI models and tools. The massive amount of information available to business analysts makes information extraction and other natural language processing tools key enablers for the acquisition and use of that semantic information. We describe the application of ontology-based extraction and merging in the context of a practical e-business application for the EU MUSING Project, where the goal is to gather international company intelligence and country/region information. The results of our experiments so far are very promising, and we are now in the process of building a complete end-to-end solution.
Background Libman-Sacks endocarditis, characterized by Libman-Sacks vegetations, is common in patients with systemic lupus erythematosus (SLE) and is often complicated by embolic cerebrovascular disease. Thus, accurate detection of Libman-Sacks vegetations may lead to early therapy and prevention of their associated complications. Although two-dimensional transesophageal echocardiography (2D-TEE) has high diagnostic value for detection of Libman-Sacks vegetations, three-dimensional TEE (3D-TEE) may allow improved detection, characterization, and clinical correlations of Libman-Sacks vegetations.
Methods Twenty-nine SLE patients (27 women, age 34±12 years) prospectively underwent 40 paired 3D-TEE and 2D-TEE studies and assessment of cerebrovascular disease manifested as acute clinical neurologic syndromes, neurocognitive dysfunction, or focal brain injury on MRI. Initial and repeat studies in patients were intermixed in a blinded manner with paired studies from healthy controls, de-identified, coded, and independently interpreted by experienced observers unaware of patients’ clinical and imaging data.
Results Compared with 2D-TEE, 3D-TEE studies were more often positive for mitral or aortic valve vegetations, detected more vegetations per study, and measured larger vegetation sizes (all p≤0.03). Also, 3D-TEE detected more vegetations on the anterior mitral leaflet, anterolateral and posteromedial scallops, and ventricular side or both atrial and ventricular sides of the leaflets (all p<0.05). In addition, 3D-TEE detected more vegetations on the aortic valve left and non-coronary cusps, coronary cusps’ tip and margins, and aortic side or both aortic and ventricular sides of the cusps (all p≤0.01). Furthermore, 3D-TEE more often detected associated commissural fusion of the mitral or aortic valves (p=0.002). Finally, 3D-TEE detected more vegetations in patients with cerebrovascular disease (p=0.01).
Conclusion 3D-TEE provides clinically relevant additive information that complements 2D-TEE in detecting and characterizing Libman-Sacks endocarditis and in assessing its association with cerebrovascular disease.