2020
DOI: 10.1101/2020.02.18.949735
Preprint

Exon probe sets and bioinformatics pipelines for all levels of fish phylogenomics

Abstract: Exon markers have a long history of use in phylogenetics of ray-finned fishes, the most diverse clade of vertebrates with more than 35,000 species. As the number of published genomes increases, it has become easier to test exons and other genetic markers for signals of ancient duplication events and filter out paralogs that can mislead phylogenetic analysis. We present seven new probe sets for current target-capture phylogenomic protocols that capture 1,104 exons explicitly filtered for paralo…

Cited by 8 publications (12 citation statements)
References: 71 publications
“…Although NGS and sequence capture approaches continue to supplant Sanger‐sequencing approaches based on one to a few loci (herein referred to as ‘legacy loci’), there has been recent interest in exploring how legacy loci can complement sequence capture data (e.g. Blaimer et al., 2015; Branstetter et al., 2017, 2021; Derkarabetian et al., 2019; Hughes et al., 2021; Simon et al., 2019; Zhang et al., 2019). Specifically, the generation of more comprehensive data sets may increase resolution power for phylogenetic inference and allow for the inclusion of rare and vital species that may be difficult to sample repeatedly (Branstetter et al., 2017; Derkarabetian et al., 2019; Zhang, Williams, et al., 2019).…”
Section: Introduction (mentioning)
confidence: 99%
“…One approach to integrating legacy loci with sequence capture data involves designing capture baits of legacy loci for inclusion in existing sequence capture bait kits, such as optimized UCE and exon‐capture bait sets (Branstetter et al., 2017; Hughes et al., 2021; Simon et al., 2019). However, this approach may increase the cost of generating custom probe kits and may require more baits across more species due to higher rates of substitutions in some legacy loci (particularly mitochondrial DNA [mtDNA]).…”
Section: Introduction (mentioning)
confidence: 99%
“…High-quality DNA extractions were sent to Arbor Biosciences for target enrichment and sequencing. Our target capture probes are based on a set of 1,104 single-copy exons optimized for ray-finned fish phylogenetics (26,27). We also included 15 legacy exons into the probe set.…”
Section: Methods (mentioning)
confidence: 99%
“…Extended results are reported in the SI Appendix. Using exon capture approaches (26,27), we assembled two main phylogenomic data matrices: 1) an expanded supermatrix that includes all genes and taxa sequenced for this study, with the addition of GenBank sequences aimed at increasing taxonomic coverage for downstream comparative analyses (1,115 exons and 474,132 nucleotide sites for 110 out of ca. 136 species; 37% missing cells), and 2) a reduced (phylogenomic-only) matrix obtained with a matrix reduction algorithm, used to assess the sensitivity of phylogenomic results to missing data (1,047 exons and 448,410 nucleotide sites for 84 species; 16% missing cells).…”
Section: Phylogenomic Inference and Tree Uncertainty In Comparative (mentioning)
confidence: 99%
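
The "missing cells" percentages quoted in the statement above can be illustrated with a small calculation. This is only a sketch, assuming that a "cell" means one taxon-by-exon entry in the matrix and that an empty or absent entry counts as missing; the function name `missing_cell_fraction` and the toy data are hypothetical and are not taken from the cited study.

```python
from typing import Dict, List, Optional


def missing_cell_fraction(
    matrix: Dict[str, Dict[str, Optional[str]]], loci: List[str]
) -> float:
    """Return the fraction of taxon-by-locus cells that hold no sequence.

    `matrix` maps taxon name -> {locus name -> sequence or None}.
    A cell is treated as missing when the locus is absent, None, or "".
    (This definition of a "cell" is an assumption made for illustration.)
    """
    total_cells = len(matrix) * len(loci)
    if total_cells == 0:
        raise ValueError("matrix needs at least one taxon and one locus")
    missing = sum(
        1
        for sequences in matrix.values()
        for locus in loci
        if not sequences.get(locus)
    )
    return missing / total_cells


# Toy data: 3 taxa x 4 exons = 12 cells, 3 of them empty.
toy_loci = ["exon1", "exon2", "exon3", "exon4"]
toy_matrix = {
    "taxonA": {"exon1": "ACGT", "exon2": "ACGA", "exon3": "TTGA", "exon4": "GGCA"},
    "taxonB": {"exon1": "ACGT", "exon2": None, "exon3": "TTGA"},
    "taxonC": {"exon1": "ACGT", "exon2": "ACGA", "exon3": "", "exon4": "GGCA"},
}
toy_fraction = missing_cell_fraction(toy_matrix, toy_loci)
```

Removing the sparsest taxa or exons from such a matrix lowers this fraction, which is the general trade-off behind comparing an expanded matrix with a reduced one.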
“…We used the bioinformatics pipeline optimized by Hughes et al (2020) to obtain sequence alignments for 951 exon markers from an initial set of 1051 (S1 appendix). Raw FASTQ files were trimmed with Trimmomatic v0.36 (Bolger et al 2014), to remove low quality sequences and adapter contamination.…”
Section: Methods (mentioning)
confidence: 99%
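
The trimming step mentioned in the statement above can be sketched as a command builder for Trimmomatic's paired-end mode. This is a minimal sketch, not the cited pipeline: the trimming steps and thresholds are the generic example values from the Trimmomatic manual, since the statement does not report the settings actually used, and the function name, file names, and jar path are hypothetical.

```python
from typing import List


def trimmomatic_pe_command(
    jar: str, r1: str, r2: str, sample: str, adapters: str
) -> List[str]:
    """Build a Trimmomatic paired-end command as an argument list.

    Thresholds below are the Trimmomatic manual's example values (assumed
    here for illustration), not settings confirmed by the cited study.
    """
    return [
        "java", "-jar", jar, "PE",
        r1, r2,
        # Paired and unpaired outputs for the forward reads
        f"{sample}_1P.fastq.gz", f"{sample}_1U.fastq.gz",
        # Paired and unpaired outputs for the reverse reads
        f"{sample}_2P.fastq.gz", f"{sample}_2U.fastq.gz",
        # Remove adapter contamination
        f"ILLUMINACLIP:{adapters}:2:30:10",
        # Remove low-quality bases and reads that become too short
        "LEADING:3", "TRAILING:3", "SLIDINGWINDOW:4:15", "MINLEN:36",
    ]


# Hypothetical file names, for illustration only.
cmd = trimmomatic_pe_command(
    jar="trimmomatic-0.36.jar",
    r1="sample01_R1.fastq.gz",
    r2="sample01_R2.fastq.gz",
    sample="sample01",
    adapters="TruSeq3-PE.fa",
)
# To execute: import subprocess; subprocess.run(cmd, check=True)
```

Building the command as a list rather than a single string avoids shell quoting problems when file names contain spaces.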