Background: The genetic control of sex determination in teleost species is poorly understood. This is partly because of the diversity of mechanisms that determine sex in this large group of vertebrates, including constitutive genes linked to sex chromosomes, polygenic constitutive mechanisms, environmental factors, hermaphroditism, and unisexuality. Here we use a de novo genome assembly of New Zealand silver trevally (Pseudocaranx georgianus) together with sex-specific whole genome sequencing data to detect sexually divergent genomic regions, identify candidate genes and develop molecular markers. Results: The de novo assembly of an unsexed trevally (Trevally_v1) resulted in a final assembly of 579.4 Mb in length, with an N50 of 25.2 Mb. Of the assembled scaffolds, 24 were of chromosome scale, ranging from 11 to 31 Mb in length. A total of 28,416 genes were annotated after 12.8 % of the assembly was masked with repetitive elements. Whole genome re-sequencing of 13 wild sexed trevally (seven males and six females) identified two sexually divergent regions located on two scaffolds, including a 6 kb region at the proximal end of chromosome 21. BLAST analyses revealed similarity between one region and the aromatase genes cyp19 (a1a/b) (E-value < 1.00E-25, identity > 78.8 %). Males contained higher numbers of heterozygous variants in both regions, while females showed regions of very low read depth, indicative of male-specificity of this genomic region. Molecular markers were developed and subsequently tested on 96 histologically sexed fish (42 males and 54 females). Three markers amplified in absolute correspondence with sex (positive in males, negative in females). Conclusions: The higher number of heterozygous variants in males combined with the absence of these regions in females supports an XY sex-determination model, indicating that the Trevally_v1 genome assembly was developed from a male specimen. 
This sex system contrasts with the ZW sex-determination model documented in closely related carangid species. Our results indicate a sex-determining function of a cyp19a1a-like gene, suggesting the molecular pathway of sex determination is somewhat conserved in this family. The genomic resources developed here will facilitate future comparative work, and enable improved insights into the varied sex determination pathways in teleosts. The sex marker developed in this study will be a valuable resource for aquaculture selective breeding programmes, and for determining sex ratios in wild populations.
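The read-depth signal described in the Results (regions present in males but nearly absent in females) can be illustrated with a minimal sketch. This is not the authors' pipeline: the function name, the window keys, the thresholds and the toy depth values are all hypothetical, and real data would first need per-sample depth normalisation.

```python
from statistics import mean

def male_specific_windows(depth, sexes, max_female_ratio=0.1, min_male_depth=5.0):
    """Flag windows with solid male coverage but near-zero female coverage.

    depth: dict mapping a window key to a list of per-sample depths
    sexes: list of "M"/"F" labels in the same sample order
    Thresholds are illustrative assumptions, not values from the study.
    """
    flagged = []
    for window, values in depth.items():
        male = mean(v for v, s in zip(values, sexes) if s == "M")
        female = mean(v for v, s in zip(values, sexes) if s == "F")
        # Under an XY model, a Y-linked window is covered in males only.
        if male >= min_male_depth and female <= max_female_ratio * male:
            flagged.append(window)
    return flagged

# Toy data (hypothetical): two males, two females, three windows.
sexes = ["M", "M", "F", "F"]
depth = {
    ("chr21", 0): [20.0, 22.0, 0.5, 1.0],       # covered in males only
    ("chr21", 6000): [21.0, 19.0, 20.0, 22.0],  # covered in both sexes
    ("chr5", 0): [2.0, 1.0, 0.0, 0.0],          # poorly covered in everyone
}
flagged = male_specific_windows(depth, sexes)
```

The minimum male depth guards against flagging windows that are simply hard to sequence in both sexes, which would otherwise pass a ratio-only test.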
Background: The genetic control of sex determination in teleost species is poorly understood. This is partly because of the diversity of sex-determining mechanisms in this large group, including constitutive genes linked to sex chromosomes, polygenic constitutive mechanisms, environmental factors, hermaphroditism, and unisexuality. Here we use a de novo genome assembly of New Zealand silver trevally (Pseudocaranx georgianus) together with whole genome sequencing to detect sexually divergent regions, identify candidate genes and develop molecular markers. Results: The de novo assembly of an unsexed trevally (Trevally_v1) resulted in an assembly of 579.4 Mb in length, with an N50 of 25.2 Mb. Of the assembled scaffolds, 24 were of chromosome scale, ranging from 11 to 31 Mb. A total of 28,416 genes were annotated after 12.8% of the assembly was masked with repetitive elements. Whole genome re-sequencing of 13 sexed trevally (7 males, 6 females) identified sexually divergent regions located on two scaffolds, including a 6 kb region at the proximal end of chromosome 21. BLAST analyses revealed similarity between one region and the aromatase genes cyp19 (a1a/b). Males contained higher numbers of heterozygous variants in both regions, while females showed regions of very low read depth, indicative of deletions. Molecular markers were tested on 96 histologically sexed fish (42 males, 54 females). Three markers amplified in absolute correspondence with sex. Conclusions: The higher number of heterozygous variants in males combined with deletions in females supports an XY sex-determination model, indicating that the Trevally_v1 genome assembly was based on a male. This sex system contrasts with the ZW-type sex system documented in closely related species. Our results indicate a likely sex-determining function of the cyp19b-like gene, suggesting the molecular pathway of sex determination is somewhat conserved in this family. 
Our genomic resources will facilitate future comparative genomics work in teleost species, and enable improved insights into the varied sex determination pathways in this group of vertebrates. The sex marker will be a valuable resource for aquaculture breeding programmes, and for determining sex ratios and sex-specific impacts in wild fisheries stocks of this species.
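For readers unfamiliar with the N50 statistic quoted in the Results, the sketch below shows the standard definition: the scaffold length at which the longest scaffolds, taken in descending order, first cover at least half of the total assembly. The scaffold lengths used here are toy values, not the Trevally_v1 scaffolds.

```python
def n50(lengths):
    """Return the N50 of a collection of scaffold lengths.

    Scaffolds are visited from longest to shortest; the N50 is the length
    of the scaffold at which the running total first reaches half of the
    summed length of all scaffolds.
    """
    if not lengths:
        raise ValueError("need at least one scaffold length")
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length

# Toy scaffold lengths in Mb (hypothetical values for illustration only).
toy_lengths = [31, 25, 20, 11, 5, 3]
toy_n50 = n50(toy_lengths)
```

An N50 of 25.2 Mb on a 579.4 Mb assembly therefore means that scaffolds of at least 25.2 Mb account for half or more of the assembled sequence.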
Background: Genetic diversity provides the basic substrate for evolution. Genetic variation consists of changes ranging from single base pairs (single-nucleotide polymorphisms, or SNPs) to larger-scale structural variants, such as inversions, deletions, and duplications. SNPs have long been used as the general currency for investigations into how genetic diversity fuels evolution. However, structural variants can affect more base pairs in the genome than SNPs and can be responsible for adaptive phenotypes due to their impact on linkage and recombination. In this study, we investigate the first steps needed to explore the genetic basis of an economically important growth trait in the marine teleost finfish Chrysophrys auratus using both SNP and structural variant data. Specifically, we use feature selection methods in machine learning to explore the relative predictive power of both types of genetic variants in explaining growth and discuss the feature selection results of the evaluated methods. Methods: SNP and structural variant callers were used to generate catalogues of variant data from 32 individual fish at ages 1 and 3 years. Three feature selection algorithms (ReliefF, Chi-square, and a mutual-information-based method) were used to reduce the dataset by selecting the most informative features. Following this selection process, the subset of variants was used as features to classify fish into small, medium, or large size categories using KNN, naïve Bayes, random forest, and logistic regression. The top-scoring features in each feature selection method were subsequently mapped to annotated genomic regions in the zebrafish genome, and a permutation test was conducted to see if the number of mapped regions was greater than when random sampling was applied. Results: Without feature selection, the prediction accuracies ranged from 0 to 0.5 for both structural variants and SNPs. 
Following feature selection, the prediction accuracy increased only slightly to between 0 and 0.65 for structural variants and between 0 and 0.75 for SNPs. The highest prediction accuracy for the logistic regression was achieved for age 3 fish using SNPs, although generally predictions for age 1 and 3 fish were very similar (ranging from 0 to 0.65 for both SNPs and structural variants). The Chi-square feature selection of SNP data was the only method that had a significantly higher number of matches to annotated genomic regions of zebrafish than would be explained by chance alone. Conclusions: Predicting a complex polygenic trait such as growth using data collected from a low number of individuals remains challenging. While we demonstrate that both SNPs and structural variants provide important information to help understand the genetic basis of phenotypic traits such as fish growth, the full complexities that exist within a genome cannot be easily captured by classical machine learning techniques. When using high-dimensional data, feature selection shows some increase in the prediction accuracy of classification models and provides the potential to identify unknown genomic correlates with growth. Our results show that both SNPs and structural variants significantly impact growth, and we therefore recommend that researchers interested in the genotype–phenotype map should strive to go beyond SNPs and incorporate structural variants in their studies as well. We discuss how our machine learning models can be further expanded to serve as a test bed to inform evolutionary studies and the applied management of species.
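The permutation test mentioned in the Methods (whether top-scoring features map to annotated regions more often than randomly sampled features) can be sketched as follows. This is an illustrative reconstruction under stated assumptions, not the authors' code: the function name, the integer feature identifiers, the annotated set and the number of permutations are all hypothetical.

```python
import random

def permutation_p_value(selected, annotated, all_features, n_perm=2000, seed=0):
    """One-sided permutation test for enrichment of annotated features.

    selected: features chosen by a feature selection method
    annotated: set of features that map to an annotated genomic region
    all_features: list of every candidate feature (the sampling universe)
    Returns the observed overlap and an empirical p-value.
    """
    rng = random.Random(seed)
    observed = sum(f in annotated for f in selected)
    k = len(selected)
    at_least_as_extreme = 0
    for _ in range(n_perm):
        # Draw a random feature set of the same size as the selected set.
        sample = rng.sample(all_features, k)
        if sum(f in annotated for f in sample) >= observed:
            at_least_as_extreme += 1
    # The +1 correction keeps the empirical p-value strictly above zero.
    return observed, (at_least_as_extreme + 1) / (n_perm + 1)

# Toy universe (hypothetical): 1000 variants, 100 of them annotated.
all_features = list(range(1000))
annotated = set(range(100))

# A selection that falls entirely inside annotated regions.
observed, p = permutation_p_value(list(range(20)), annotated, all_features)

# A selection with no annotated hits cannot be enriched.
null_observed, null_p = permutation_p_value(list(range(900, 920)), annotated, all_features)
```

Sampling random sets of the same size as the selected set keeps the comparison fair, since larger sets would overlap annotated regions more often by chance alone.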