2021
DOI: 10.1590/1984-70332021v21sa15

Technical nuances of machine learning: implementation and validation of supervised methods for genomic prediction in plant breeding

Abstract: The decision-making process in plant breeding is driven by data. The machine learning framework has powerful tools that can extract useful information from data. However, there is still a lack of understanding about the underlying algorithms of these methods, their strengths, and pitfalls. Machine learning has two main branches: supervised and unsupervised learning. In plant breeding, supervised learning is used for genomic prediction, where phenotypic traits are modeled as a function of molecular markers. The…
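The abstract's framing of genomic prediction, phenotypes modeled as a function of molecular markers, can be sketched as a regularized regression. The simulated data and the ridge penalty below are illustrative assumptions, not the paper's code:

```python
import numpy as np

# Minimal sketch of supervised genomic prediction as ridge regression:
# phenotype y modeled as a function of a marker matrix X.
rng = np.random.default_rng(0)
n, p = 50, 200                                      # fewer individuals than markers
X = rng.integers(0, 3, size=(n, p)).astype(float)   # SNP genotypes coded 0/1/2
beta_true = rng.normal(0.0, 0.1, p)                 # simulated marker effects
y = X @ beta_true + rng.normal(0.0, 1.0, n)         # simulated phenotype

lam = 10.0                                          # shrinkage strength (arbitrary)
# Closed-form ridge solution: beta = (X'X + lam*I)^{-1} X'y
beta_hat = np.linalg.solve(X.T @ X + lam * np.eye(p), X.T @ y)
gebv = X @ beta_hat                                 # genomic estimated breeding values
```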

Cited by 9 publications (10 citation statements); references 68 publications (62 reference statements).
“…In addition, THGS becomes an exact method that yields unbiased estimates of genetic correlations and GEBV (section Exact THGS), and reduces the bias of estimates of heritabilities, as can be demonstrated for scenario 1 (Table 6). Matrix decomposition is also useful to analyze high-dimensional datasets with many factors ( P >> N problem), and to fit one or multiple kernels of different types within multivariate ridge regression models, for example, for modeling dominance, epistasis [37], and Gaussian or Arc-cosine relationships [21,38]. The computing costs for matrix decomposition to obtain those eigenvectors, however, may outweigh the benefits for THGS as the number of individuals and markers in the analysis increases.…”
Section: Discussion
confidence: 99%
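The kernel ideas in the statement above, fitting one or multiple kernels of different types within a ridge model, can be sketched as follows. The simulated data, bandwidth choice, and penalty are illustrative assumptions, not the cited papers' code:

```python
import numpy as np

# Illustrative sketch: kernel ridge with a linear (relationship-style)
# kernel plus a Gaussian kernel, combined additively and fit jointly.
rng = np.random.default_rng(1)
n, p = 40, 300
X = rng.integers(0, 3, size=(n, p)).astype(float)
y = X @ rng.normal(0.0, 0.1, p) + rng.normal(0.0, 1.0, n)

Xc = X - X.mean(axis=0)                    # center marker codes
K_lin = Xc @ Xc.T / p                      # linear kernel (GBLUP-style)

d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=-1)
K_gauss = np.exp(-d2 / d2.mean())          # Gaussian kernel, bandwidth = mean sq. distance

K = K_lin + K_gauss                        # multiple kernels combined
lam = 1.0                                  # ridge penalty (arbitrary)
alpha = np.linalg.solve(K + lam * np.eye(n), y)
y_hat = K @ alpha                          # fitted values under the multi-kernel model
```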
“…For balanced experimental designs, when the intercept is the only fixed effect, and either a principal components [18] or eigenvector regression [19][20][21] is used, THGS is exact. This is demonstrated in Appendix 3.…”
Section: Exact THGS
confidence: 99%
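The eigenvector regression referenced above can be illustrated with a small numerical check: ridge regression on the eigenvectors of the genomic kernel reproduces the kernel-ridge fit exactly. The simulated data and penalty are my own assumptions:

```python
import numpy as np

# Sketch: principal-component / eigenvector regression on the kernel's
# eigenvectors gives the same fit as kernel ridge on K itself.
rng = np.random.default_rng(2)
n, p = 30, 100
X = rng.integers(0, 3, size=(n, p)).astype(float)
y = X @ rng.normal(0.0, 0.1, p) + rng.normal(0.0, 1.0, n)

Xc = X - X.mean(axis=0)
K = Xc @ Xc.T / p                          # genomic kernel
lam = 0.5                                  # ridge penalty (arbitrary)

y_kr = K @ np.linalg.solve(K + lam * np.eye(n), y)   # direct kernel ridge

s, U = np.linalg.eigh(K)                   # eigendecomposition K = U diag(s) U'
y_pc = U @ (s / (s + lam) * (U.T @ y))     # shrink each eigen-coordinate
assert np.allclose(y_kr, y_pc)             # the two routes agree
```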
“…Whereas decision trees are poor predictors, the collective of multiple small trees generated at random provides robust predictions. Random forest can be described by the model (Xavier, 2021)…”
Section: Approach 2: Multiple-marker Association
confidence: 99%
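The quoted point, that an ensemble of randomized trees predicts robustly even though each tree is weak, can be sketched with scikit-learn. This is an illustration on simulated data, not Xavier's implementation:

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

# Illustrative sketch: a random forest averages many randomized decision
# trees, each a poor predictor on its own.
rng = np.random.default_rng(3)
n, p = 100, 50
X = rng.integers(0, 3, size=(n, p)).astype(float)   # simulated SNP matrix
y = 0.8 * X[:, 0] + 0.5 * X[:, 1] + rng.normal(0.0, 0.3, n)

rf = RandomForestRegressor(n_estimators=200, random_state=0)
rf.fit(X, y)
y_hat = rf.predict(X)                               # in-sample predictions
```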
“…Another approach to overcome the dimensionality problem involves the use of kernels. Kernel methods have been used in plant and animal breeding for pedigree-based selection and genomic prediction (GP) (Xavier, 2021). GBLUP, the most popular kernel-based method, uses RR ('KRR') based on a linear kernel (VanRaden, 2008).…”
Section: Introduction
confidence: 99%
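The linear kernel behind GBLUP mentioned above can be sketched from the published VanRaden (2008) formula. The simulated genotypes, phenotype, and penalty are illustrative assumptions:

```python
import numpy as np

# Sketch of the VanRaden genomic relationship matrix and a GBLUP-style
# kernel-ridge fit; variable names and data are illustrative.
rng = np.random.default_rng(4)
n, p = 25, 500
M = rng.integers(0, 3, size=(n, p)).astype(float)    # genotypes coded 0/1/2
y = rng.normal(0.0, 1.0, n)                          # placeholder phenotype

freq = M.mean(axis=0) / 2.0                          # allele frequencies
Z = M - 2.0 * freq                                   # center each marker by 2*p_i
G = Z @ Z.T / (2.0 * (freq * (1.0 - freq)).sum())    # VanRaden G matrix

lam = 1.0                                            # ridge penalty (arbitrary)
u_hat = G @ np.linalg.solve(G + lam * np.eye(n), y)  # GBLUP-style fitted values
```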