Clinical case reports are the ‘eyewitness reports’ of medicine and provide a valuable, unique, albeit noisy and underutilized type of evidence. Generally, a case report has a single main finding that represents the reason for writing up the report in the first place. However, no one has previously created an automatic way of identifying main finding sentences in case reports. We previously created a manual corpus of main finding sentences extracted from the abstracts and full text of clinical case reports. Here, we have utilized the corpus to create a machine learning-based model that automatically predicts which sentence(s) from abstracts state the main finding. The model has been evaluated on a separate manual corpus of clinical case reports and found to have good performance. This is a step toward setting up a retrieval system in which, given one case report, one can find other case reports that report the same or very similar main findings. The code and necessary files to run the main finding model can be downloaded from https://github.com/qi29/main_ finding_recognition, released under the Apache License, Version 2.0.
Clinical case reports are the `eyewitness reports’ of medicine and provide a valuable, unique, albeit noisy and underutilized type of evidence. Generally a case report has a single main finding that represents the reason for writing up the report in the first place. In the present study, we present the results of manual annotation carried out by two individuals on 500 randomly sampled case reports. This corpus contains main finding sentences extracted from title, abstract and full-text of the same article that can be regarded as semantically related and are often paraphrases. The final reconciled corpus of 416 articles comprises an open resource for further study. This is the first step in establishing text mining models and tools that can identify main finding sentences in an automated fashion, and in measuring quantitatively how similar any two main findings are. We envision that case reports in PubMed may be automatically indexed by main finding, so that users can carry out information queries for specific main findings (rather than general topics)—and given one case report, a user can retrieve those having the most similar main findings. The metric of main finding similarity may also potentially be relevant to the modeling of paraphrasing, summarization and entailment within the biomedical literature.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.
hi@scite.ai
10624 S. Eastern Ave., Ste. A-614
Henderson, NV 89052, USA
Copyright © 2024 scite LLC. All rights reserved.
Made with 💙 for researchers
Part of the Research Solutions Family.