Randomized controlled trials (RCTs) are the foremost source of evidence in clinical medicine. Using machines to interpret the large and growing number of RCTs has the potential to aid clinical decision-making. We propose an RCT conclusion generation task, derived from the PubMed 200k RCT sentence classification dataset, to examine how effectively sequence-to-sequence models understand RCTs. We first build a pointer-generator baseline model for conclusion generation. We then fine-tune the state-of-the-art GPT-2 language model, which is pre-trained on general-domain data, for this new medical-domain task. Both automatic and human evaluation show that our fine-tuned GPT-2 models generate conclusions of higher quality and correctness than the baseline pointer-generator model. Further inspection reveals the limitations of the current approach and suggests future directions to explore.
Machine-generated citation sentences can aid automated scientific literature review and assist article writing. Current methods for generating citation text are limited to single-citation generation, using the citing document and one cited document as input. However, in real-world situations, writers often summarize several studies in one sentence or discuss relevant information across an entire paragraph. In addition, multiple citation intents have been previously identified, implying that writers may need control over the intents of generated sentences to cover different scenarios. Therefore, this work focuses on generating multiple citations and releases a newly collected dataset, CiteMI, to drive future research. First, we build a novel generation model with the Fusion-in-Decoder approach to cope with multiple long inputs. Second, we incorporate the predicted citation intents into training for intent control. The experiments demonstrate that the proposed approaches provide much more comprehensive features for generating citation sentences.
Universal lesion detection and tagging (ULDT) in CT studies is critical for tumor burden assessment and for tracking the progression of lesion status (growth/shrinkage) over time. However, a lack of fully annotated data hinders the development of effective ULDT approaches. Prior work used the DeepLesion dataset (4,427 patients, 10,594 studies, 32,120 CT slices, 32,735 lesions, 8 body part labels) for algorithmic development, but this dataset is not completely annotated and contains class imbalances. To address these issues, we developed a self-training pipeline for ULDT. A VFNet model was trained on a limited 11.5% subset of DeepLesion (bounding boxes + tags) to detect and classify lesions in CT studies. It then identified novel lesion candidates in a larger unseen data subset, incorporated them into its training set, and was retrained over multiple self-training rounds. Multiple self-training experiments were conducted with different threshold policies to select higher-quality predicted lesions and to address the class imbalances. We found that direct self-training improved the sensitivities of over-represented lesion classes at the expense of under-represented classes. However, upsampling the lesions mined during self-training, combined with a variable threshold policy, yielded a 6.5% increase in sensitivity at 4 FP compared with self-training without class balancing (72% vs 78.5%) and an 11.7% increase compared with the same self-training policy without upsampling (66.8% vs 78.5%). Furthermore, we show that our results either improved or maintained the sensitivity at 4 FP for all 8 lesion classes.
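The self-training loop described in this abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the abstract does not define the "variable threshold policy", so it is modeled here as a hypothetical per-class confidence threshold; the helper names (`select_pseudo_labels`, `upsample_to_balance`, `self_train`), the dictionary layout of a lesion, and the `fit`/`predict` callables standing in for the detector are all invented for illustration.

```python
from collections import Counter

def select_pseudo_labels(predictions, thresholds, default_threshold=0.5):
    """Keep predictions whose score meets the threshold of their class.

    Assumption: the threshold policy is a per-class confidence cutoff.
    """
    return [
        p for p in predictions
        if p["score"] >= thresholds.get(p["label"], default_threshold)
    ]

def upsample_to_balance(samples):
    """Repeat samples of smaller classes until each class matches the largest."""
    counts = Counter(s["label"] for s in samples)
    if not counts:
        return []
    target = max(counts.values())
    balanced = []
    for label in counts:
        group = [s for s in samples if s["label"] == label]
        repeats, remainder = divmod(target, len(group))
        balanced.extend(group * repeats + group[:remainder])
    return balanced

def self_train(labeled, unlabeled, fit, predict, thresholds, rounds=3):
    """Hypothetical self-training loop.

    fit(samples) -> model and predict(model, pool) -> predictions are
    placeholders for training and running a detector.
    """
    model = fit(labeled)
    current = list(labeled)
    for _ in range(rounds):
        # Mine candidates from the unlabeled pool with the latest model.
        mined = select_pseudo_labels(predict(model, unlabeled), thresholds)
        # Retrain on the original labels plus the class-balanced mined set.
        current = list(labeled) + upsample_to_balance(mined)
        model = fit(current)
    return model, current
```

In this sketch the mined lesions are re-selected from scratch each round rather than accumulated, which keeps the example short; whether the original pipeline accumulates candidates across rounds is not stated in the abstract.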