Weverton Gomes da Costa scite author profile

Responsible for approximately 70% of the world's coffee exports, Brazil is increasingly concerned with the quality of the coffee it produces, given the growing demand for so-called specialty coffees. Breeders therefore need to consider, besides agronomic characteristics, the physical and sensory quality of the beans in breeding programs. However, the greater the number of characteristics considered in the selection process, the harder it is to select superior genotypes. In this context, multivariate analyses can help to overcome this problem. The objective was therefore to select Coffea arabica genotypes with high simultaneous potential for variables of commercial interest, in three municipalities of the Matas de Minas region, Minas Gerais (MG), Brazil, through factor analysis, using the factor scores as selection criteria or indices for genotype identification. Multivariate analyses were performed for each environment individually and, based on communality, three factors were established for each environment. The factors were interpreted as sensory quality, sieve, and vigor, similarly in the three environments. The genotype-by-environment interaction persisted even after the variables were summarized into factorial complexes. The genotypes Catucaí Amarelo 24/137 and H419-3-3-7-16-4-1 excelled with respect to the factorial complexes and also showed good adaptability and stability; consequently, they have great potential to improve coffee production performance in the Matas de Minas region.

Genome‐enabled prediction through machine learning methods considering different levels of trait complexity

Barbosa, Silva, Costa, et al. 2021. Crop Science

Genome‐wide selection (GWS) uses a large number of molecular markers to predict genetic values and has been shown to be highly relevant for genetic improvement. The objective of this work was to evaluate and compare the predictive performance of statistical methods (ridge regression‐best linear unbiased predictor [RR‐BLUP] and BayesB) and machine learning methods in GWS, using simulated populations with traits presenting different levels of heritability and different numbers of quantitative trait loci (QTL) in the presence of dominant and epistatic effects. The simulated F2 population consisted of 1,000 individuals genotyped with 2,010 single nucleotide polymorphism (SNP) markers. Twenty‐six traits were simulated, with QTL numbers ranging from two to 88 and heritabilities of .3 and .6. Selective and predictive performance was evaluated using the multilayer perceptron (MLP), radial basis function (RBF), decision tree (DT), bagging (BA), random forest (RF), and boosting (BO) machine learning models and the classical RR‐BLUP and BayesB methods. Heritability had a stronger effect on selective accuracy than the increase in QTL number did. In addition, the selective accuracy obtained across QTL numbers indicates that alternative machine learning models, such as RBF, BA, BO, and RF, can be suitable depending on the number of QTL. Machine learning methods are powerful tools for predicting genetic values under epistatic gene control in traits with different degrees of heritability and different numbers of controlling genes.
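
To make the RR-BLUP idea in the abstract above more concrete, the sketch below estimates marker effects by ridge regression, solving (Z'Z + lam*I) u = Z'(y - mean(y)). It is only an illustration under stated assumptions: the marker matrix Z, the phenotypes y, the true effects, and the shrinkage value lam are all hypothetical toy values, markers are assumed to be coded -1/0/1 with centered columns, and none of this reproduces the authors' simulation or software.

```python
def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):  # back substitution
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def ridge_marker_effects(z, y, lam):
    """Marker effects u from (Z'Z + lam*I) u = Z'(y - mean(y))."""
    n, p = len(z), len(z[0])
    y_mean = sum(y) / n
    yc = [v - y_mean for v in y]
    ztz = [[sum(z[i][a] * z[i][b] for i in range(n)) + (lam if a == b else 0.0)
            for b in range(p)] for a in range(p)]
    zty = [sum(z[i][a] * yc[i] for i in range(n)) for a in range(p)]
    return solve_linear(ztz, zty)


# Hypothetical toy data: 6 individuals, 3 markers coded -1/0/1 (columns sum to 0).
Z = [[1, 0, -1], [0, 1, 1], [-1, 1, 0], [1, -1, 0], [0, 0, 1], [-1, -1, -1]]
true_u = [2.0, -1.0, 0.5]  # assumed additive marker effects
y = [10.0 + sum(zi * ui for zi, ui in zip(row, true_u)) for row in Z]

u_unshrunk = ridge_marker_effects(Z, y, lam=0.0)  # no shrinkage: ordinary least squares
u_shrunk = ridge_marker_effects(Z, y, lam=5.0)    # effects pulled toward zero
```

In practice lam is usually tied to the ratio of residual to marker variance, and the number of markers far exceeds the number of individuals; the toy example ignores both points to stay readable.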

Machine learning and statistics to qualify environments through multi-traits in Coffea arabica

et al. 2021

Several factors, such as genotype, environment, and post-harvest processing, can affect the responses of important traits in the coffee production chain. Determining the influence of these factors is highly relevant, as they can indicate the characteristics of the coffee produced. The choice of the most efficient models should take into account the variety of information and the particularities of each biological material. This study was developed to evaluate statistical and machine learning models that better discriminate environments through multiple traits of coffee genotypes, and to identify the main agronomic and beverage quality traits responsible for the variation among environments. To that end, 31 morpho-agronomic and post-harvest traits were evaluated in field experiments installed in three municipalities in the Matas de Minas region, in the State of Minas Gerais, Brazil. Two types of post-harvest processing were evaluated: natural and pulped. The apparent error rate was estimated for each method. The Multilayer Perceptron and Radial Basis Function networks discriminated the coffee samples across environments more efficiently than the other methods, identifying differences in multi-trait responses according to production site and type of post-harvest processing. The local factors did not present specific traits that favored the severity of diseases or differentiated vegetative vigor. The sensory traits acidity and fragrance/aroma score also contributed little to the discrimination process, indicating that acidity and fragrance/aroma are characteristic of the coffee produced and that all coffee samples evaluated in the Matas de Minas region are of the specialty type. The main traits responsible for differentiating production sites are plant height, fruit size, and bean production. The sensory trait "Body" is the main one for discriminating the form of post-harvest processing.
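
The apparent error rate mentioned in the abstract above is the share of samples that a classifier assigns to the wrong class when it is evaluated on the same data used to fit it. A minimal sketch follows; the site labels and predictions are hypothetical and are not taken from the study.

```python
def apparent_error_rate(observed, predicted):
    """Fraction of samples whose predicted class differs from the observed class."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("observed and predicted must be non-empty and equally long")
    wrong = sum(o != p for o, p in zip(observed, predicted))
    return wrong / len(observed)


# Hypothetical labels for eight coffee samples from three production sites.
observed = ["site_A", "site_A", "site_A", "site_B", "site_B", "site_C", "site_C", "site_C"]
predicted = ["site_A", "site_B", "site_A", "site_B", "site_B", "site_C", "site_A", "site_C"]

aer = apparent_error_rate(observed, predicted)  # 2 of 8 samples misclassified
```

Because the same samples are used for fitting and for scoring, this rate tends to be optimistic compared with an error rate estimated on held-out data.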

Dynamics, diversity and experimental precision in final irrigated rice testing: a time meta-analysis

Costa, Oliveira, Cruz, et al. 2020. Crop Breed. Appl. Biotechnol.

The objective of this study was to investigate the diversity, experimental conditions, and dynamics of the Irrigated Rice Breeding Program of Minas Gerais, Brazil, by meta-analysis. The target traits were grain yield, plant height, and days to flowering. Evaluations were based on estimates from 376 genotypes grown at three locations, in two final comparative trials, over 14 growing seasons. The overall averages of plant height and days to flowering were stable, indicating an adequate plant height and a medium cycle. High average yields (>5,000 kg ha⁻¹), good experimental accuracy, and genetic variability were recorded. However, the genetic variability of all traits decreased over time, indicating the need to increase genotypic diversity. The parameter estimates of the morphoagronomic traits studied in the time meta-analysis indicated the dynamic nature and good accuracy of the Irrigated Rice Breeding Program of Minas Gerais.

Genetic diversity and interaction between the maintainers of commercial soybean cultivars using self‐organizing maps

et al. 2022

Information on the genetic diversity of commercial cultivars is of fundamental importance for crop improvement. In addition, information about possible interactions between the maintainers developing these cultivars can help in designing a breeding program. The objective of this work was to study the genetic diversity of soybean [Glycine max (L.) Merr.] cultivars released in Brazil from 1998 to 2017 and to compare the similarity between the maintainers of these cultivars based on the phenotypic information disclosed. Data were collected on 1,587 soybean cultivars registered in the National Register of Cultivars of the Ministry of Agriculture, Livestock, and Supply, belonging to 59 different maintainers. Twelve agromorphological traits of these cultivars were evaluated. The Random Forest method was used to group the cultivars and select their main discriminating traits. Multiple correspondence analysis was performed to obtain the percentage of variance explained by the evaluated traits in each dimension and the coordinates of the maintainers. The maintainers were then organized with a Kohonen self‐organizing map (SOM) based on their respective coordinates. There is a wide variety of maintainers of soybean cultivars in Brazil, with different objectives for launching new cultivars. Among the maintainers, Soymax presented the most restricted genetic base. The other private companies showed greater similarity to one another and a broader genetic base. The public institution EMBRAPA presented the greatest genetic diversity in its population base. The Federal University of Uberlândia and the Federal University of Viçosa could be partners in launching soybean cultivars, given the proximity between their neurons in the Kohonen analysis. The high diversity of traits shows that the genetic base of soybean in Brazil is large.

Application of fuzzy logic for adaptability and stability studies in flood‐irrigated rice (Oryza sativa)

Silva, Costa, et al. 2021. Plant Breeding

The aim of this study was to use fuzzy logic as an auxiliary tool in the assessment of adaptability and stability, using grain‐yield data from flood‐irrigated rice evaluated in different agricultural years. Eighteen rice genotypes belonging to a flood‐irrigated rice breeding programme were evaluated over four agricultural years, 2012/2013 to 2015/2016, totalling 12 environments (3 sites × 4 years). The methodologies of Eberhart and Russell (1966), Lin and Binns (1988) modified by Carneiro (1998), and the Centroid method were used. Fuzzy logic was applied to the results of these methodologies as a tool for interpretation and decision‐making regarding the recommendation of genotypes. The performance of the different flood‐irrigated rice genotypes was influenced by environmental conditions, thereby justifying the use of multiple adaptability and stability methodologies. Fuzzy logic is a useful and promising tool for selecting flood‐irrigated rice genotypes in breeding programmes, allowing information from different parameters to be used to understand the influence of environmental variation on the performance of crop genotypes.
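
Two of the stability statistics named in the abstract above can be sketched in a few lines. The example computes the Eberhart and Russell regression coefficient of each genotype on the environmental index, and the original Lin and Binns superiority measure Pi (the mean squared distance to the best genotype in each environment, divided by two). The yield table is hypothetical, the Carneiro (1998) modification and the Centroid method are not shown, and the sketch does not reproduce the fuzzy-logic step of the study.

```python
def environmental_indices(yields):
    """I_j = mean of environment j minus the grand mean (rows are genotypes)."""
    n_gen, n_env = len(yields), len(yields[0])
    env_means = [sum(row[j] for row in yields) / n_gen for j in range(n_env)]
    grand_mean = sum(env_means) / n_env
    return [m - grand_mean for m in env_means]


def eberhart_russell_slope(genotype_yields, indices):
    """Regression coefficient b_i = sum_j(Y_ij * I_j) / sum_j(I_j ** 2)."""
    numerator = sum(y * i for y, i in zip(genotype_yields, indices))
    return numerator / sum(i * i for i in indices)


def lin_binns_pi(yields):
    """P_i = sum_j (Y_ij - M_j) ** 2 / (2 * n_env), with M_j the best yield in environment j."""
    n_env = len(yields[0])
    best = [max(row[j] for row in yields) for j in range(n_env)]
    return [sum((row[j] - best[j]) ** 2 for j in range(n_env)) / (2 * n_env)
            for row in yields]


# Hypothetical mean yields (t/ha): 3 genotypes (rows) x 4 environments (columns).
yields = [
    [4.0, 5.0, 6.0, 7.0],
    [5.0, 5.5, 6.0, 6.5],
    [3.0, 4.5, 6.0, 7.5],
]

indices = environmental_indices(yields)
slopes = [eberhart_russell_slope(row, indices) for row in yields]
pi_values = lin_binns_pi(yields)
```

A slope near 1 suggests average responsiveness to environmental improvement, a slope below 1 suggests a genotype better suited to poorer environments, and a smaller Pi means the genotype stays closer to the best performer across environments.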

Efficiency of machine learning and neural network techniques in genomic prediction and marker identification

Costa

Genome-wide selection (GWS) uses molecular markers distributed across the whole genome to predict the genetic merit of plants and animals. Machine learning (ML) and artificial neural network (ANN) methods are nonparametric and can yield more accurate and parsimonious models for GWS analysis. To evaluate different ML and ANN methods for GWS-based prediction, we posed two questions to be answered by this research project. The first is that different methods would give different predictions depending on the complexity of the trait analyzed; the second is that the identification of markers associated with quantitative trait loci (QTLs) would also depend on the complexity of the trait and on the method used. Two articles were developed to answer these questions. In the first article, the objective was to evaluate the overall accuracy and the variability in prediction performance of methods based on ML (Decision Tree, Boosting, Bagging, Random Forest, and MARS, Multivariate Adaptive Regression Splines) and ANN (Multilayer Perceptron, Radial Basis Function) compared with G-BLUP in genomic prediction analyses for simulated traits with different numbers of genes in the presence of epistasis and with different degrees of heritability. In the second article, the objective was to evaluate how well the important markers identified by each method coincide with the regions where QTLs are present, using the simulated data set and considering traits with different numbers of genes in the presence of epistasis and with different heritabilities. An F2 population in Hardy-Weinberg equilibrium was simulated, consisting of 1,000 individuals and 10 linkage groups of 200 cM each, corresponding to 4,010 single nucleotide polymorphism (SNP) markers. In prediction, increasing the number of QTLs mainly benefited the neural network methods and G-BLUP in terms of R² and root mean square error (RMSE). For the other methods, in scenarios with 40 QTLs or more, increasing the number of QTLs positively affected the evaluated parameters. Variation in heritability had an inverse effect on the R² and RMSE values. The non-additive MARS methods showed high R² for oligogenic traits and for polygenic traits with high heritability and 240 QTLs or more. Regarding the identification of markers associated with QTLs, most methods had a higher hit rate in scenarios with fewer QTLs and higher heritability. MARS 3 and Boosting showed a high capacity to identify important markers, considering the regions associated with QTLs. The highest error rate also occurred in scenarios with fewer QTLs, but with lower heritability. Heritability positively affected the relative index in the identification of markers associated with QTLs. In scenarios with 40 QTLs or more, increasing the number of QTLs also positively affected the relative index for most methods. However, the best results were found for the scenario with the highest heritability and 8 QTLs. The MARS 1, MARS 2, Boosting, and Bagging methods were the most effective at detecting important markers along the genome, mainly for traits with 8 and 240 QTLs. Variation in heritability and in the number of QTLs affected the performance of the methods both for prediction and for the identification of markers associated with QTLs. Thus, the distribution of QTLs across the linkage groups may be the main attribute to evaluate when predicting genetic values and identifying markers associated with QTLs, provided the experiment is well conducted so as to obtain a higher heritability. The ML and ANN methods showed high potential for predicting genetic values in traits with dominant and epistatic effects. For identifying markers associated with regions where QTLs are present, the machine learning methods are more efficient. The use of different statistical, neural network, and machine learning methods led to different outcomes, influenced by the complexity and particularity of the traits analyzed. Therefore, when evaluating the prediction of genetic values and the importance of markers, it is recommended to use multiple approaches in order to choose the best method. Keywords: Artificial intelligence. Genome-wide selection. Variable importance. Quantitative trait.

Genomic prediction through machine learning and neural networks for traits with epistasis

Costa, Celeri, Barbosa, et al. 2022. Computational and Structural Biotechnology Journal
