Abstract: Objective
In recognition of potential barriers that may inhibit the widespread adoption of biomedical software, the 2014 i2b2 Challenge introduced a special track, Track 3—Software Usability Assessment, to develop a better understanding of the adoption issues that might be associated with state-of-the-art clinical NLP systems. This paper reports the ease-of-adoption assessment methods we developed for this track, and the results of evaluating five clinical NLP system submissions.
Materials and Methods: …
“…The results obtained were then compared to those submitted by the teams using the same system. During this process, the analysts took notes on the various aspects of working with the systems (ease of installing and using, ease of understanding supplied instructions, success of the replication attempt), using a specific score sheet developed by the analysts, following some of the criteria evaluated by Zheng et al (2015). The score sheet comprised 10 questions addressing the experience of analysts at each stage of the experiment: system configuration, system installation, running the system, obtaining results, and overall impressions.…”
Section: Evaluation of the Replication Experience
The scientific community is facing rising concerns about the reproducibility of research in many fields. To address this issue in Natural Language Processing, the CLEF eHealth 2016 lab offered a replication track together with the Clinical Information Extraction task. Herein, we report detailed results of the replication experiments carried out with the three systems submitted to the track. While all results were ultimately replicated, we found that the systems were poorly rated by analysts on documentation aspects such as "ease of understanding system requirements" (33%) and "provision of information while system is running" (33%). Simple steps could therefore be taken by system authors to increase the ease of replicating their work, and thereby the ease of re-using their systems. Our experiments aim to raise the awareness of the community towards the challenges of replication and community sharing of NLP systems.
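The per-aspect percentages reported above (e.g. 33% for the two documentation items) suggest a simple aggregation of yes/no answers over the analysts' score sheets. A minimal sketch of that aggregation, with invented question labels and answers (the actual 10-question sheet is not reproduced here), might look like:

```python
# Hypothetical sketch: aggregate binary score-sheet answers per question
# across three analyst sheets. All question labels and scores are invented
# examples, not the actual CLEF eHealth 2016 data.
from collections import defaultdict

answers = [
    {"ease of understanding system requirements": 1,
     "provision of information while system is running": 0},
    {"ease of understanding system requirements": 0,
     "provision of information while system is running": 1},
    {"ease of understanding system requirements": 0,
     "provision of information while system is running": 0},
]

totals = defaultdict(int)
for sheet in answers:
    for question, score in sheet.items():
        totals[question] += score

for question, score in totals.items():
    print(f"{question}: {100 * score // len(answers)}%")
```

With one positive answer out of three per question, each item aggregates to 33%, matching the granularity of the percentages quoted in the abstract.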
“…2 Two of these systems were on concept extraction and understanding, two were on medication extraction, and one was on de-identification. Zheng et al [16] describe these systems and their evaluation in detail, with one major takeaway that affects all NLP systems in the clinical domain: the long pipeline of preprocessing components, from tokenizers to metathesauri, that is essential to most NLP goals reduces the adoptability and portability of systems, especially if the systems are to be used by novices. While these preprocessing components cannot be excluded from NLP systems, they can be standardized in their input and output formats to allow some degree of interchangeability, so that each new system does not come with a completely new set of preprocessing components.…”
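The interchangeability argued for above can be illustrated with a minimal sketch. The component and type names below are hypothetical, not drawn from any cited system: the point is only that if every preprocessing step consumes and produces one shared document format, a tokenizer can be swapped without rewriting downstream components.

```python
# Illustrative sketch of a standardized preprocessing interface.
# Component names (Document, Preprocessor, WhitespaceTokenizer) are
# made up for this example, not from the i2b2 systems.
from dataclasses import dataclass, field
from typing import List, Protocol


@dataclass
class Document:
    """A minimal shared format passed between pipeline components."""
    text: str
    tokens: List[str] = field(default_factory=list)


class Preprocessor(Protocol):
    def process(self, doc: Document) -> Document: ...


class WhitespaceTokenizer:
    """One interchangeable implementation of the tokenizer step."""
    def process(self, doc: Document) -> Document:
        doc.tokens = doc.text.split()
        return doc


def run_pipeline(doc: Document, steps: List[Preprocessor]) -> Document:
    for step in steps:
        doc = step.process(doc)
    return doc


doc = run_pipeline(Document("BP 120/80 on admission"), [WhitespaceTokenizer()])
print(doc.tokens)  # ['BP', '120/80', 'on', 'admission']
```

Any component honoring the same `process(Document) -> Document` contract can replace `WhitespaceTokenizer`, which is the kind of input/output standardization the excerpt proposes.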
“…The software usability track aimed to assess the usability of systems developed for any of the past i2b2 shared tasks since 2006 [16]. The novel data use track, on the other hand, built on the observation that past i2b2 corpora have often been successfully put to use for purposes outside of their original goals and opened the 2014 shared-task corpus to any research project that fit the participants’ existing goals.…”
“…2 The remaining three were dropped because “one was withdrawn before the evaluations started, a second was not an NLP system, and the third was a software library that did not have a user interface.” [16]…”
“…Much of the clinical data in electronic health records (EHRs) are represented as free text. Although progress is being made in the conversion of free text into structured data by natural language processing (NLP), these methods are not in general use [6][7][8][9][10]. The entry of data about neurological patients in EHRs into large databases requires a method for converting symptoms (patient complaints) and signs (examination abnormalities) into machine-readable codes.…”
Background: The use of clinical data in electronic health records for machine-learning or data analytics depends on the conversion of free text into machine-readable codes. We have examined the feasibility of capturing the neurological examination as machine-readable codes based on UMLS Metathesaurus concepts. Methods: We created a target ontology for capturing the neurological examination using 1100 concepts from the UMLS Metathesaurus. We created a dataset of 2386 test-phrases based on 419 published neurological cases. We then mapped the test-phrases to the target ontology. Results: We were able to map all of the 2386 test-phrases to 601 unique UMLS concepts. A neurological examination ontology with 1100 concepts has sufficient breadth and depth of coverage to encode all of the neurologic concepts derived from the 419 test cases. Using only pre-coordinated concepts, component ontologies of the UMLS, such as HPO, SNOMED CT, and OMIM, do not have adequate depth and breadth of coverage to encode the complexity of the neurological examination. Conclusion: An ontology based on a subset of UMLS has sufficient breadth and depth of coverage to convert deficits from the neurological examination into machine-readable codes using pre-coordinated concepts. The use of a small subset of UMLS concepts for a neurological examination ontology offers the advantage of improved manageability as well as the opportunity to curate the hierarchy and subsumption relationships.
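The mapping step the authors describe can be sketched as a lookup from normalized examination phrases to pre-coordinated concept identifiers. The phrases and concept codes below are illustrative placeholders, not entries from the paper's actual 1100-concept ontology:

```python
# Hypothetical sketch of phrase-to-concept mapping via exact match on a
# normalized form. The table is a made-up placeholder, not the real
# UMLS-derived neurological examination ontology.
from typing import Optional

ONTOLOGY = {
    "right pronator drift": "C-PLACEHOLDER-1",
    "absent ankle jerks": "C-PLACEHOLDER-2",
}


def normalize(phrase: str) -> str:
    """Lowercase and collapse whitespace before lookup."""
    return " ".join(phrase.lower().split())


def map_phrase(phrase: str) -> Optional[str]:
    """Return the concept code for a test-phrase, or None if uncovered."""
    return ONTOLOGY.get(normalize(phrase))


print(map_phrase("Right  pronator drift"))  # C-PLACEHOLDER-1
```

A real system would need fuzzier matching than exact lookup, but the sketch shows the basic shape of the feasibility test: every test-phrase must resolve to some pre-coordinated concept, and unmapped phrases reveal gaps in the ontology's coverage.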