Identifying germline BRCA1/2 mutation carriers is vital for reducing their risk of breast and ovarian cancer. To derive a serum miRNA-based diagnostic test, we used samples from 653 healthy women from six international cohorts, including 350 (53.6%) with BRCA1/2 mutations and 303 (46.4%) who were BRCA1/2 wild-type. All individuals were cancer-free before and for at least 12 months after sampling. RNA sequencing followed by differential expression analysis identified 19 miRNAs significantly associated with BRCA mutations, 10 of which were ultimately used for classification: hsa-miR-20b-5p, hsa-miR-19b-3p, hsa-let-7b-5p, hsa-miR-320b, hsa-miR-139-3p, hsa-miR-30d-5p, hsa-miR-17-5p, hsa-miR-182-5p, hsa-miR-421, hsa-miR-375-3p. The final logistic regression model achieved an area under the receiver operating characteristic curve of 0.89 (95% CI: 0.87–0.93), with 93.88% sensitivity and 80.72% specificity, in an independent validation cohort. Mutated gene, menopausal status, and prior preemptive oophorectomy did not affect classification performance. Circulating microRNAs may be used to identify BRCA1/2 mutations in patients at high risk of cancer, offering an opportunity to reduce screening costs.
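As a reading aid for the metrics quoted above, the following is a minimal sketch of how sensitivity, specificity, and the area under the ROC curve can be computed from a classifier's scores. The function names, the 0.5 threshold, and the toy labels and scores are illustrative assumptions; none of them come from the study, and this is not the authors' evaluation code.

```python
from typing import Sequence, Tuple


def sensitivity_specificity(
    y_true: Sequence[int], y_score: Sequence[float], threshold: float = 0.5
) -> Tuple[float, float]:
    """Return (sensitivity, specificity) at a given score threshold.

    Label 1 stands for a mutation carrier, 0 for wild-type (assumed coding).
    """
    tp = fn = tn = fp = 0
    for label, score in zip(y_true, y_score):
        predicted = 1 if score >= threshold else 0
        if label == 1 and predicted == 1:
            tp += 1
        elif label == 1:
            fn += 1
        elif predicted == 0:
            tn += 1
        else:
            fp += 1
    return tp / (tp + fn), tn / (tn + fp)


def roc_auc(y_true: Sequence[int], y_score: Sequence[float]) -> float:
    """AUC as the probability that a random positive outscores a random negative.

    Ties count as one half (the Mann-Whitney formulation of the AUC).
    """
    positives = [s for y, s in zip(y_true, y_score) if y == 1]
    negatives = [s for y, s in zip(y_true, y_score) if y == 0]
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Toy data, purely illustrative: four carriers followed by four wild-type samples.
labels = [1, 1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.7, 0.4, 0.6, 0.3, 0.2, 0.1]

sens, spec = sensitivity_specificity(labels, scores)
auc = roc_auc(labels, scores)
print(f"sensitivity={sens:.2f} specificity={spec:.2f} AUC={auc:.4f}")
```

The pairwise AUC is quadratic in the number of samples, which is acceptable for a cohort of a few hundred; a rank-based implementation would be the usual choice for larger data.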
Here we introduce a new reconstruction technique for two-dimensional Bragg Scattering Tomography (BST), based on the Radon transform models of [arXiv preprint, arXiv:2004.10961 (2020)]. Our method uses a combination of ideas from multibang control and microlocal analysis to construct an objective function which can regularize the BST artifacts, specifically the boundary artifacts due to sharp cutoff in sinogram space (as observed in [arXiv preprint, arXiv:2007.00208 (2020)]) and the artifacts arising from approximations made in constructing the model used for inversion. We then test our algorithm on a variety of Monte Carlo (MC) simulated examples of practical interest in airport baggage screening and threat detection. The data used in our studies are generated with a novel MC code presented here. The model, which is available from the authors upon request, captures the Bragg scatter effects described by BST as well as beam attenuation and Compton scatter.
Background: Cancer identification is generally framed as binary classification, normally discrimination of a control group from a single cancer group. However, such models lack any cancer-specific information, as they are trained on only one cancer type, and they fail to account for competing cancer risks. For example, an ostensibly healthy individual may have any of a number of different cancer types, and a tumor may originate from one of several primary sites. Pan-cancer evaluation requires a model trained on multiple cancer types and controls simultaneously, so that a physician can be directed to the correct area of the body for further testing. Methods: We introduce novel neural network models to address multi-cancer classification problems across several data types commonly applied in cancer prediction, including circulating miRNA expression, protein, and mRNA. In particular, we present an analysis of neural network depth and complexity, and investigate how these relate to classification performance. Comparisons of our models with state-of-the-art neural networks from the literature are also presented. Results: Our analysis shows that shallow, feed-forward neural network architectures outperform the more complex deep feed-forward, Convolutional Neural Network (CNN), and Graph CNN (GCNN) architectures considered in the literature. Conclusion: The results show that multiple cancers and controls can be classified accurately using the proposed models, across a range of expression technologies in cancer prediction. Impact: This study addresses the important problem of pan-cancer classification, which is often overlooked in the literature. The promising results highlight the need for further research.
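To illustrate what a shallow feed-forward classifier over several classes looks like, here is a minimal sketch of the forward pass of a network with one hidden layer and a softmax output, where each output is the probability of one class (for example, control plus several cancer types). The layer sizes, the random weights, and all function names are hypothetical; the abstract does not specify the authors' architecture, and no training step is shown.

```python
import math
import random
from typing import List, Tuple

Matrix = List[List[float]]
Params = Tuple[Matrix, List[float], Matrix, List[float]]


def dense(x: List[float], weights: Matrix, bias: List[float]) -> List[float]:
    """Affine layer: one output per weight row."""
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, bias)]


def relu(v: List[float]) -> List[float]:
    return [max(0.0, x) for x in v]


def softmax(v: List[float]) -> List[float]:
    """Numerically stable softmax: subtract the max before exponentiating."""
    m = max(v)
    exps = [math.exp(x - m) for x in v]
    total = sum(exps)
    return [e / total for e in exps]


def init_params(n_in: int, n_hidden: int, n_classes: int, seed: int = 0) -> Params:
    """Random placeholder weights (assumption: stands in for trained weights)."""
    rng = random.Random(seed)
    w1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_hidden)]
    w2 = [[rng.uniform(-0.5, 0.5) for _ in range(n_hidden)] for _ in range(n_classes)]
    return w1, [0.0] * n_hidden, w2, [0.0] * n_classes


def shallow_forward(x: List[float], params: Params) -> List[float]:
    """One hidden ReLU layer followed by a softmax over all classes."""
    w1, b1, w2, b2 = params
    hidden = relu(dense(x, w1, b1))
    return softmax(dense(hidden, w2, b2))


# Hypothetical sizes: 5 expression features, 4 hidden units, 3 classes.
params = init_params(n_in=5, n_hidden=4, n_classes=3)
probs = shallow_forward([0.1, 0.4, 0.0, 0.9, 0.3], params)
predicted_class = max(range(len(probs)), key=probs.__getitem__)
print(probs, predicted_class)
```

A single softmax output layer is what lets one model score controls and every cancer type at once, in contrast to a separate binary classifier per cancer.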
Background: High-dimensional transcriptome profiling, whether through next-generation sequencing techniques or high-throughput arrays, may result in scattered variables with missing data. Data imputation is a common strategy to maximize the inclusion of samples by using statistical techniques to fill in missing values. However, many data imputation methods are cumbersome and risk introducing systematic bias. Results: We present a new data imputation method using constrained least squares and algorithms from the inverse problems literature, and present applications of this technique in miRNA expression analysis. The proposed technique is shown to be orders of magnitude faster than similar methods from the literature, with equal or greater accuracy. Conclusions: This study offers a robust and efficient algorithm for data imputation, which can be used, e.g., to improve cancer prediction accuracy in the presence of missing data.
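One way to picture imputation by constrained least squares is sketched below. It rests on assumptions that are not in the abstract: the incomplete sample is modelled as a non-negative combination of complete reference samples, the weights are fitted on the observed features with projected gradient descent, and the same weights then predict the missing features. The constraint, the solver, and all names are illustrative choices and are not the authors' algorithm.

```python
from typing import List, Optional


def nnls_projected_gradient(
    A: List[List[float]], b: List[float], n_iter: int = 500
) -> List[float]:
    """Minimise ||A w - b||^2 subject to w >= 0 by projected gradient descent."""
    n_rows, n_cols = len(A), len(A[0])
    frob_sq = sum(a * a for row in A for a in row)
    if frob_sq == 0.0:
        return [0.0] * n_cols
    # 1 / ||A||_F^2 is a safe step: it never exceeds 1 / (largest eigenvalue of A^T A).
    step = 1.0 / frob_sq
    w = [0.0] * n_cols
    for _ in range(n_iter):
        residual = [
            sum(A[i][k] * w[k] for k in range(n_cols)) - b[i] for i in range(n_rows)
        ]
        grad = [sum(A[i][k] * residual[i] for i in range(n_rows)) for k in range(n_cols)]
        # Gradient step followed by projection onto the non-negative orthant.
        w = [max(0.0, w[k] - step * grad[k]) for k in range(n_cols)]
    return w


def impute(
    target: List[Optional[float]], reference: List[List[float]]
) -> List[float]:
    """Fill the None entries of `target` using complete `reference` samples."""
    observed = [j for j, v in enumerate(target) if v is not None]
    # Rows are observed features, columns are reference samples.
    A = [[ref[j] for ref in reference] for j in observed]
    b = [target[j] for j in observed]
    w = nnls_projected_gradient(A, b)
    return [
        v if v is not None else sum(w[k] * reference[k][j] for k in range(len(reference)))
        for j, v in enumerate(target)
    ]


# Toy example: two complete reference profiles, one target missing its third feature.
reference_samples = [[1.0, 0.0, 2.0], [0.0, 1.0, 3.0]]
completed = impute([2.0, 1.0, None], reference_samples)
print(completed)
```

In this toy case the observed features are matched exactly by weights 2 and 1, so the missing value is filled as 2 × 2 + 1 × 3. Observed entries are always passed through unchanged.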