This work proposes a sequential tagger for named entity recognition in morphologically rich languages. Several schemes for representing the morphological analysis of a word in the context of named entity recognition are examined. Word representations are formed by concatenating word and character embeddings with the morphological embeddings based on these schemes. The impact of these representations is measured by training and evaluating a sequential tagger composed of a conditional random field layer on top of a bidirectional long short-term memory layer. Experiments with Turkish, Czech, Hungarian, Finnish and Spanish produce the state-of-the-art results for all these languages, indicating that the representation of morphological information improves performance.
Abstract. This paper presents an overview of the ImageCLEF 2014 evaluation lab. Since its first edition in 2003, ImageCLEF has become one of the key initiatives promoting the benchmark evaluation of algorithms for the annotation and retrieval of images in various domains, such as public and personal images, to data acquired by mobile robot platforms and medical archives. Over the years, by providing new data collections and challenging tasks to the community of interest, the ImageCLEF lab has achieved an unique position in the image annotation and retrieval research landscape. The 2014 edition consists of four tasks: domain adaptation, scalable concept image annotation, liver CT image annotation and robot vision. This paper describes the tasks and the 2014 competition, giving a unifying perspective of the present activities of the lab while discussing future challenges and opportunities.
Twitter is an extremely high volume platform for user generated contributions regarding any topic. The wealth of content created at real-time in massive quantities calls for automated approaches to identify the topics of the contributions. Such topics can be utilized in numerous ways, such as public opinion mining, marketing, entertainment, and disaster management. Towards this end, approaches to relate single or partial posts to knowledge base items have been proposed. However, in microblogging systems like Twitter, topics emerge from the culmination of a large number of contributions. Therefore, identifying topics based on collections of posts, where individual posts contribute to some aspect of the greater topic is necessary. Models, such as Latent Dirichlet Allocation (LDA), propose algorithms for relating collections of posts to sets of keywords that represent underlying topics. In these approaches, figuring out what the specific topic(s) the keyword sets represent remains as a separate task. Another issue in topic detection is the scope, which is often limited to specific domain, such as health. This work proposes an approach for identifying domain-independent specific topics related to sets of posts. In this approach, individual posts are processed and then aggregated to identify key tokens, which are then mapped to specific topics. Wikipedia article titles are selected to represent topics, since they are up to date, user-generated, sophisticated articles that span topics of human interest. This paper describes the proposed approach, a prototype implementation, and a case study based on data gathered during the heavily contributed periods corresponding to the four US election debates in 2012. The manually evaluated results (0.96 precision) and other observations from the study are discussed in detail.
Radiologists inspect CT scans and record their observations in reports to communicate with physicians. These reports may suffer from ambiguous language and inconsistencies resulting from subjective reporting styles, which present challenges in interpretation. Standardization efforts, such as the lexicon RadLex for radiology terms, aim to address this issue by developing standard vocabularies. While such vocabularies handle consistent annotation, they fall short in sufficiently processing reports for intelligent applications. To support such applications, the semantics of the concepts as well as their relationships must be modeled, for which, ontologies are effective. They enable the software to make inferences beyond what is present in the reports. This paper presents the open-source ontology onlira (Ontology of the Liver for Radiology), which is developed to support such intelligent applications, such as identifying and ranking similar liver patient cases. onlira is introduced in terms of its concepts, properties, and relations. Examples of real liver patient cases are provided for illustration purposes. The ontology is evaluated in terms of its ability to express real liver patient cases and address semantic queries.
Abstract. This paper presents an overview of the ImageCLEF 2016 evaluation campaign, an event that was organized as part of the CLEF (Conference and Labs of the Evaluation Forum) labs 2016. ImageCLEF is an ongoing initiative that promotes the evaluation of technologies for annotation, indexing and retrieval for providing information access to collections of images in various usage scenarios and domains. In 2016, the 14th edition of ImageCLEF, three main tasks were proposed: 1) identification, multi-label classification and separation of compound figures from biomedical literature; 2) automatic annotation of general web images; and 3) retrieval from collections of scanned handwritten documents. The handwritten retrieval task was the only completely novel task this year, although the other two tasks introduced several modifications to keep the proposed tasks challenging.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.
hi@scite.ai
10624 S. Eastern Ave., Ste. A-614
Henderson, NV 89052, USA
Copyright © 2024 scite LLC. All rights reserved.
Made with 💙 for researchers
Part of the Research Solutions Family.