Amino acid mutations that lower a protein's thermodynamic stability are implicated in numerous diseases, and engineered proteins with enhanced stability are important in research and medicine. Computational methods for predicting how mutations perturb protein stability are therefore of great interest. Despite recent advancements in protein design using deep learning, in silico prediction of stability changes has remained challenging, in part due to a lack of large, high-quality training datasets for model development. Here we introduce ThermoMPNN, a deep neural network trained to predict stability changes for protein point mutations given an initial structure. In doing so, we demonstrate the utility of a newly released mega-scale stability dataset for training a robust stability model. We also employ transfer learning to leverage a second, larger dataset by using learned features extracted from a deep neural network trained to predict a protein's amino acid sequence given its three-dimensional structure. We show that our method achieves competitive performance on established benchmark datasets using a lightweight model architecture that allows for rapid, scalable predictions. Finally, we make ThermoMPNN readily available as a tool for stability prediction and design.
Objectives: Automated whole brain segmentation from magnetic resonance images is of great interest for the development of clinically relevant volumetric markers for various neurological diseases.Although deep learning methods have demonstrated remarkable potential in this area, they may perform poorly in nonoptimal conditions, such as limited training data availability. Manual whole brain segmentation is an incredibly tedious process, so minimizing the data set size required for training segmentation algorithms may be of wide interest. The purpose of this study was to compare the performance of the prototypical deep learning segmentation architecture (U-Net) with a previously published atlas-free traditional machine learning method, Classification using Derivative-based Features (C-DEF) for whole brain segmentation, in the setting of limited training data.Materials and Methods: C-DEF and U-Net models were evaluated after training on manually curated data from 5, 10, and 15 participants in 2 research cohorts: (1) people living with clinically diagnosed HIV infection and (2) relapsing-remitting multiple sclerosis, each acquired at separate institutions, and between 5 and 295 participants' data using a large, publicly available, and annotated data set of glioblastoma and lower grade glioma (brain tumor segmentation). Statistics was performed on the Dice similarity coefficient using repeated-measures analysis of variance and Dunnett-Hsu pairwise comparison.Results: C-DEF produced better segmentation than U-Net in lesion (29.2%-38.9%) and cerebrospinal fluid (5.3%-11.9%) classes when trained with data from 15 or fewer participants. Unlike C-DEF, U-Net showed significant improvement when increasing the size of the training data (24%-30% higher than baseline). In the brain tumor segmentation data set, C-DEF produced equivalent or better segmentations than U-Net for enhancing tumor and peritumoral edema regions across all training data sizes explored. However, U-Net was more effective than C-DEF for segmentation of necrotic/non-enhancing tumor when trained on 10 or more participants, probably because of the inconsistent signal intensity of the tissue class.Conclusions: These results demonstrate that classical machine learning methods can produce more accurate brain segmentation than the far more complex deep learning methods when only small or moderate amounts of training data are available (n # 15). The magnitude of this advantage varies by tissue and cohort, while U-Net may be preferable for deep gray matter and necrotic/ non-enhancing tumor segmentation, particularly with larger training data sets (n $ 20). Given that segmentation models often need to be retrained for application to novel imaging protocols or pathology, the bottleneck associated with large-scale manual annotation could be avoided with classical machine learning algorithms, such as C-DEF.
Introduction: Automatic whole brain and lesion segmentation at 7T presents challenges, primarily from bias fields and susceptibility artifacts. Recent advances in segmentation methods, namely using atlas-free and multi-contrast (for example, using T1-weighted, T2-weighted, fluid attenuated inversion recovery or FLAIR images) can enhance segmentation performance, how-ever perfect registration at high fields remain a challenge primarily from distortion effects. We sought to use deep-learning algorithms (D/L) to do both skull stripping and whole brain segmentation on multiple imaging contrasts generated in a single Magnetization Prepared 2 Rapid Acquisition Gradient Echoes (MP2RAGE) acquisition on participants clinically diagnosed with multiple sclerosis (MS). The segmentation results were compared to that from 3T images acquired on the same participants, and with commonly available software packages. Finally, we explored ways to boost the performance of the D/L by using pseudo-labels generated from trainings on the 3T data (transfer learning). Methods: 3T and 7T MRI acquired within 9 months of each other, from 25 study participants clinically diagnosed with multiple sclerosis (mean age 51, SD 16 years, 18 women), were retrospectively analyzed with commonly used software packages (such as FreeSurfer), Classification using Derivative-based Features (C-DEF), nnU-net (no-new-Net version of U-Net algorithm), and a novel 3T-to-7T transfer learning method, Pseudo-Label Assisted nnU-Net (PLAn). These segmentation results were then rated visually by trained experts and quantitatively in comparison with 3T label masks. Results: Of the previously published methods considered, nnU-Net produced the best skull stripping at 7T in both the qualitative and quantitative ratings followed by C-DEF 7T and Free-Surfer 7T. A similar trend was observed for tissue segmentation, as nnU-Net was again the best method at 7T for all tissue classes. Dice Similarity Coefficient (DSC) from lesions segmented with nnU-Net were 1.5 times higher than from FreeSurfer at 7T. Relative to analysis with C-DEF segmentation on 3T scans, nnU-Net 7T had lower lesion volumes, with a correlation slope of just 0.68. PLAn 7T produced equivalent results to nnU-Net 7T in terms of skull stripping and most tissue classes, but it boosted lesion sensitivity by 15% relative to 3T, in-creasing the correlation slope to 0.90. This resulted in significantly better lesion segmentations as measured by expert rating (4% increase) and Dice coefficient (6% increase). Conclusion: Deep learning methods can produce fast and reliable whole brain segmentations, including skull stripping and lesion detection, using data from a single 7T MRI sequence. While nnU-Net segmentations at 7T are superior to the other methods considered, the limited availability of labeled 7T data makes transfer learning an attractive option. In this case, pre-training a nnU-Net model using readily obtained 3T pseudo-labels was shown to boost lesion detection capabilities at 7T. This approach, which we call PLAn, is robust and readily adaptable due to its use of a single commonly gathered MRI sequence.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.
hi@scite.ai
10624 S. Eastern Ave., Ste. A-614
Henderson, NV 89052, USA
Copyright © 2024 scite LLC. All rights reserved.
Made with 💙 for researchers
Part of the Research Solutions Family.