The development of genomic selection (GS) methods has allowed plant breeding programs to select favorable lines using genomic data before performing field trials. Improvements in genotyping technology have yielded high-dimensional genomic marker data, which can be difficult to incorporate into statistical models. In this paper, we investigated the utility of applying dimensionality reduction (DR) methods as a pre-processing step for GS methods. We compared five DR methods and studied the trend in the prediction accuracies of each method as a function of the number of features retained. The effect of DR methods was studied using three models that involved the main effects of line, environment, and marker, and the genotype-by-environment interactions. The methods were applied to a real data set containing 315 lines phenotyped in nine environments with 26,817 markers each. Regardless of the DR method and prediction model used, only a fraction of the features was sufficient to achieve the maximum correlation. Our results underline the usefulness of DR methods as a key pre-processing step in GS models to improve computational efficiency in the face of the ever-increasing size of genomic data.
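The workflow described above (reduce the marker matrix, fit a prediction model, and track accuracy as a function of the number of features retained) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the abstract does not name its five DR methods or specify its three models, so principal component analysis and ridge regression are used as stand-ins, the data are simulated, and the function names `pca_reduce`, `ridge_fit_predict`, and `accuracy_by_features` are hypothetical.

```python
import numpy as np


def pca_reduce(X, k):
    """Project the centered marker matrix X (lines x markers) onto its top-k principal components."""
    Xc = X - X.mean(axis=0)
    U, s, _ = np.linalg.svd(Xc, full_matrices=False)
    return U[:, :k] * s[:k]  # principal component scores, shape (lines, k)


def ridge_fit_predict(Z_train, y_train, Z_test, lam=1.0):
    """Fit ridge regression on the training scores and predict the test lines."""
    mu_z, mu_y = Z_train.mean(axis=0), y_train.mean()
    A = Z_train - mu_z
    beta = np.linalg.solve(A.T @ A + lam * np.eye(A.shape[1]), A.T @ (y_train - mu_y))
    return (Z_test - mu_z) @ beta + mu_y


def accuracy_by_features(X, y, ks, n_train, lam=1.0):
    """Correlation between predicted and observed phenotypes for each number of retained features."""
    # The reduction is unsupervised (genotypes only), so it is computed once on all lines.
    Z_full = pca_reduce(X, max(ks))
    out = {}
    for k in ks:
        Z = Z_full[:, :k]
        pred = ridge_fit_predict(Z[:n_train], y[:n_train], Z[n_train:], lam)
        out[k] = float(np.corrcoef(pred, y[n_train:])[0, 1])
    return out


# Simulated stand-in data (assumption): 120 lines, 500 markers coded 0/1/2.
rng = np.random.default_rng(0)
X = rng.integers(0, 3, size=(120, 500)).astype(float)
marker_effects = rng.normal(size=500)
y = X @ marker_effects + rng.normal(scale=5.0, size=120)

results = accuracy_by_features(X, y, ks=[5, 20, 60], n_train=90)
for k, r in results.items():
    print(f"features retained: {k:3d}  correlation: {r:.3f}")
```

Plotting the correlation against the number of retained features gives the kind of trend curve the study describes. The numbers produced by this simulated example carry no meaning for the real data set.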
Mycobacterium avium subsp. paratuberculosis (MAP) is the etiological agent of Johne’s disease, a severe gastroenteritis of ruminants. This study developed a model cell culture system to rapidly screen MAP mutants with vaccine potential for their ability to induce apoptosis. Two wild-type strains, a transposon mutant, and two deletion mutant MAP strains (MOI of 10 with 1.2 × 10⁶ CFU) were tested in murine RAW 264.7 macrophages to determine if they induce apoptosis and/or necrosis. Both deletion mutants were previously shown to be attenuated and immunogenic in primary bovine macrophages. All strains had similar growth rates, but cell morphology indicated that both deletion mutants were elongated with cell wall bulging. Cell death kinetics were followed by a real-time cellular assay that measures luminescence (apoptosis) and fluorescence (necrosis). A 6 h infection period was the appropriate time to assess apoptosis, which was followed by secondary necrosis. Apoptosis was also quantified via DAPI-stained nuclear morphology and validated via flow cytometry. The combined analysis confirmed the hypothesis that candidate vaccine deletion mutants are pro-apoptotic in RAW 264.7 cells. In conclusion, the increased apoptosis seen in the deletion mutants correlates with the attenuated phenotype and immunogenicity observed in bovine macrophages, a property associated with good vaccine candidates.
Modern plant breeding programs collect several data types such as weather, images, and secondary or associated traits besides the main trait (e.g., grain yield). Genomic data are high-dimensional and often crowd out smaller data types when naively combined to explain the response variable. There is a need to develop methods able to effectively combine data types of differing sizes to improve predictions. Additionally, in the face of changing climate conditions, there is a need to develop methods able to effectively combine weather information with genotype data to better predict the performance of lines. In this work, we develop a novel three-stage classifier to predict multi-class traits by combining three data types: genomic, weather, and secondary trait. The method addresses various challenges in this problem, such as confounding, differing sizes of data types, and threshold optimization. The method was examined in different settings, including binary and multi-class responses, various penalization schemes, and class balances. Our method was then compared to standard machine learning methods such as random forests and support vector machines, using various classification accuracy metrics and using model size to evaluate the sparsity of the model. The results showed that our method performed similarly to or better than the machine learning methods across various settings. More importantly, the classifiers obtained were highly sparse, allowing for a straightforward interpretation of relationships between the response and the selected predictors.