DATA MINING-BASED TECHNIQUE ON SHEEP BREED CERTIFICATION

ABSTRACT: This study aimed to develop a method based on data mining techniques to select key SNP (Single Nucleotide Polymorphism) markers for the sheep breeds Crioula, Morada Nova and Santa Inês. We gathered data from the International Sheep Consortium on 72 animals belonging to these breeds; each animal has 49,034 SNP markers. Because the number of attributes (markers) is much greater than the number of observations (animals), the LASSO (Least Absolute Shrinkage and Selection Operator), Random Forest and Boosting prediction methods, which incorporate attribute selection, were used to generate predictive models. The results revealed that the predictive models selected the main SNP markers for sheep breed identification. The LASSO technique selected 29 relevant markers, while Random Forest and Boosting selected 27 and 20 major markers, respectively. By intersecting the generated models, we could identify a subset of 18 markers with major potential for sheep breed identification.

KEYWORDS: single-nucleotide polymorphism, feature selection, predictive modeling, penalized regression.

INTRODUCTION

Brazil has several sheep breeds that developed from breeds brought by the colonizers and that acquired specific traits of adaptation to Brazilian environmental conditions. These breeds came to be known as local or locally adapted. Most of them are threatened with extinction, mainly because of indiscriminate crossbreeding with
An essential step in the development of products based on genetically modified plants (GMPs) is an assessment of safety, including an evaluation of the potential impact of the crop and of the practices related to its cultivation on the environment and on human or animal health. The purpose of this safety assessment is to compare information about the GMP with that from a non-GM crop. However, at present this risk analysis may be faulty because there is no widely accepted, specific risk assessment method for GMPs that uses quantifiable parameters and allows a comparative analysis among different technologies. This paper introduces a risk analysis method that focuses on identifying and evaluating the risks associated with the field release and cultivation of GMPs. Two tools support the proposed method: (1) worksheets to compile Evidence of Risks, and (2) a Matrix of Assessment. The first tool identifies potential hazards related to the use of a specific GMP. This preformatted worksheet assigns values to the level of risk and to its significance in terms of the activity to be developed. The second tool provides a structure for viewing the potential hazards together, illustrating which approach supports the use of GMPs in a manner as safe as any other traditional technology. To make the proposed method easier to understand, it is presented in a digital format (www.cnpma.embrapa.br/forms/gmp_ram.php3) in which the two tools are linked, so that the user can fill in the worksheets and automatically see the results in the matrix. Compared to current processes, the proposed method represents a less subjective and more transparent process for risk assessment.
The objective of this work was to evaluate the usefulness of a subset of 18 single nucleotide polymorphisms (SNPs) for breed identification of Brazilian Crioula, Morada Nova (MN), and Santa Inês (SI) sheep. Data from 588 animals were analyzed with the Structure software. Assignments with confidence higher than 90% were observed in 82% of the studied samples. Most of the low-value assignments were observed in the MN and SI breeds. Therefore, although this subset of 18 SNPs is highly reliable, it is not enough for an unequivocal assignment of the studied breeds, mainly of the hair breeds. A more precise panel still needs to be developed for widespread use in breed assignment.
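The share of samples assigned above a confidence threshold, as reported above, can be computed along the following lines. This is a minimal sketch under stated assumptions, not the authors' procedure: it assumes the Structure output has been reduced to one row of membership coefficients per animal (one value per cluster) and that "assignment confidence" means the largest coefficient in that row, which the abstract does not specify. The function name `share_confidently_assigned` and the toy values are hypothetical and are not study data.

```python
from typing import Sequence


def share_confidently_assigned(
    memberships: Sequence[Sequence[float]],
    threshold: float = 0.90,
) -> float:
    """Return the fraction of animals whose largest membership
    coefficient is strictly greater than `threshold`."""
    if not memberships:
        raise ValueError("no animals supplied")
    confident = sum(1 for row in memberships if max(row) > threshold)
    return confident / len(memberships)


# Hypothetical membership coefficients: five animals, three clusters.
# These numbers are invented for illustration only.
toy_q = [
    [0.97, 0.02, 0.01],
    [0.05, 0.93, 0.02],
    [0.10, 0.45, 0.45],  # ambiguous animal, below the threshold
    [0.01, 0.04, 0.95],
    [0.02, 0.91, 0.07],
]

share = share_confidently_assigned(toy_q)
print(f"{share:.0%} of toy animals assigned above 0.90")
```

Applying the same function separately to the rows of each breed would show where the low-value assignments are concentrated.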