Software development and maintenance require making many decisions over the lifetime of the software. The decision problems, alternative solutions, and the arguments for and against these solutions comprise the system's rationale. This information is potentially valuable as a record of the developers' and maintainers' intent. Unfortunately, this information is not explicitly captured in a structured form that can be easily analyzed. Still, while rationale is not explicitly captured, that does not mean that rationale is not captured at all: decisions are documented in many ways throughout the development process. This paper tackles the issue of extracting rationale from text by describing a mechanism for using two existing tools, GATE (General Architecture for Text Engineering) and WEKA (Waikato Environment for Knowledge Analysis), to build classification models for text mining of rationale. We used this mechanism to evaluate different combinations of text features and machine learning algorithms to extract rationale from Chrome bug reports. Our results are comparable in accuracy to those obtained by human annotators.
This paper announces the release of a new version of the English lexical resource VerbNet with substantially revised semantic representations designed to facilitate computer planning and reasoning based on human language. We use the transfer of possession and transfer of information event representations to illustrate both the general framework of the representations and the types of nuances the new representations can capture. These representations use a Generative Lexicon-inspired subevent structure to track attributes of event participants across time, highlighting oppositions and temporal and causal relations among the subevents.
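As a rough illustration of how a subevent structure can expose oppositions, the sketch below encodes a simplified transfer-of-possession event and finds predicates that hold in one subevent and are negated in another. The `Predicate` class, the `GIVE` list, and the `oppositions` helper are hypothetical names invented for this example, and the predicate inventory is a simplification rather than the actual VerbNet representation.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Predicate:
    """One semantic predicate tied to a subevent (hypothetical structure)."""
    name: str
    subevent: str
    args: Tuple[str, ...]
    negated: bool = False


# Assumed, simplified encoding of a transfer-of-possession event:
# e1 is the initial state, e2 the transfer itself, e3 the result state.
GIVE: List[Predicate] = [
    Predicate("has_possession", "e1", ("Agent", "Theme")),
    Predicate("has_possession", "e1", ("Recipient", "Theme"), negated=True),
    Predicate("transfer", "e2", ("Agent", "Theme", "Recipient")),
    Predicate("has_possession", "e3", ("Recipient", "Theme")),
    Predicate("has_possession", "e3", ("Agent", "Theme"), negated=True),
]


def oppositions(preds: List[Predicate]) -> List[Tuple[str, Tuple[str, ...], str, str]]:
    """Return (name, args, earlier_subevent, later_subevent) for each pair of
    predicates with the same name and arguments but opposite polarity."""
    found = []
    for i, p in enumerate(preds):
        for q in preds[i + 1:]:
            same_fact = p.name == q.name and p.args == q.args
            if same_fact and p.negated != q.negated and p.subevent != q.subevent:
                found.append((p.name, p.args, p.subevent, q.subevent))
    return found


if __name__ == "__main__":
    for name, args, before, after in oppositions(GIVE):
        print(f"{name}{args} flips between {before} and {after}")
```

Under these assumptions, the Agent's possession of the Theme and the Recipient's lack of it both flip between the first and last subevents, which is the kind of state change a planner or reasoner could read off directly.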
This paper describes the evolution of the PropBank approach to semantic role labeling over the last two decades. During this time the PropBank frame files have been expanded to include non-verbal predicates such as adjectives, prepositions and multi-word expressions. The number of domains, genres and languages that have been PropBanked has also expanded greatly, creating an opportunity for much more challenging and robust testing of the generalization capabilities of PropBank semantic role labeling systems. We also describe the substantial effort that has gone into ensuring the consistency and reliability of the various annotated datasets and resources, to better support the training and evaluation of such systems.
We implemented an end-to-end system for disorder identification and slot filling. For identifying spans for both disorders and their attributes, we used a linear-chain conditional random field (CRF) approach coupled with cTAKES for pre-processing. For combining disjoint disorder spans, finding relations between attributes and disorders, and attribute normalization, we used L2-regularized L2-loss linear support vector machine (SVM) classification. Disorder CUIs were identified with a back-off approach: if the target text was not found in the training data, the system fell back to YTEX lookup (CUAB1) or the NLM UTS API (CUAB2). Our best system utilized UMLS semantic type features for disorder/attribute span identification and the NLM UTS API for normalization. It was ranked 12th in Task 1 (disorder identification) and 6th in Task 2b (disorder identification and slot filling) with a weighted F-measure of 0.711.
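The back-off step for CUI normalization can be sketched as follows. This is a minimal illustration, not the authors' implementation: `normalize_disorder`, `training_cuis`, and `fallback_lookup` are hypothetical names, the CUI strings are placeholders, and the fallback is a local stub standing in for an external service such as YTEX or the NLM UTS API (no real API is called).

```python
from typing import Callable, Dict, Optional


def normalize_disorder(
    span_text: str,
    training_cuis: Dict[str, str],
    fallback_lookup: Callable[[str], Optional[str]],
) -> Optional[str]:
    """Map a disorder span to a CUI, backing off to an external lookup
    only when the span was not seen in the training data."""
    key = span_text.strip().lower()
    # First choice: a CUI observed for this exact text in the training data.
    if key in training_cuis:
        return training_cuis[key]
    # Back-off: defer to the external resource (stubbed in this sketch).
    return fallback_lookup(key)


# Placeholder data; the CUI values are made up for illustration.
TRAINING_CUIS = {"chest pain": "C0000001"}
STUB_EXTERNAL = {"shortness of breath": "C0000002"}


def stub_lookup(text: str) -> Optional[str]:
    """Hypothetical stand-in for an external terminology lookup."""
    return STUB_EXTERNAL.get(text)


if __name__ == "__main__":
    for span in ["Chest pain", "shortness of breath", "unknown finding"]:
        print(span, "->", normalize_disorder(span, TRAINING_CUIS, stub_lookup))
```

Passing the fallback as a callable keeps the two variants described above (one per external resource) interchangeable without changing the lookup logic.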