Summary
Restriction site‐associated DNA sequencing (RAD‐seq) provides high‐resolution population genomic data at low cost, and has become an important component in ecological and evolutionary studies. As with all high‐throughput technologies, analytic strategies require critical validation to ensure precise and unbiased interpretation.
To test the impact of bioinformatic data processing on downstream population genetic inferences, we analysed mammalian RAD‐seq data (>100 individuals) with 312 combinations of methodology (de novo vs. mapping to references of increasing divergence) and filtering criteria (missing data, Hardy–Weinberg equilibrium (HWE), FIS, coverage, mapping and genotype quality). To identify commonalities and biases across pipelines, we computed summary statistics (number of loci, number of SNPs, π, Hetobs, FIS, FST, Ne and m) and compared the results to independent null expectations (isolation‐by‐distance correlation, expected transition‐to‐transversion ratio Ts/Tv and Mendelian mismatch rates of known parent–offspring trios).
We observed large differences between reference‐based and de novo approaches, with the former generally calling more SNPs and yielding lower FIS and Ts/Tv. The level of data completeness had little impact on most summary statistics, and FST estimates were robust across all pipelines. The site frequency spectrum was highly sensitive to the chosen approach, as reflected in the large variance of parameter estimates across demographic scenarios (single‐population bottlenecks and isolation‐with‐migration model). Null expectations were best met by reference‐based approaches, although this was contingent on the specific filtering criteria.
We recommend that RAD‐seq studies use reference‐based approaches with the genome of a closely related species and, given the high stochasticity associated with the choice of pipeline, use multiple pipelines to ensure robust population genetic and demographic inferences.
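Two of the null expectations named above, the Ts/Tv ratio and the Mendelian mismatch rate in parent–offspring trios, are simple enough to sketch in code. The following is a minimal illustration, not the study's implementation: the function names, the `(ref, alt)` SNP tuples and the allele-pair genotype encoding are all assumptions made for this example.

```python
# Minimal sketch of two validation statistics (hypothetical helper names).
# Assumed inputs: SNPs as (ref, alt) base pairs; genotypes as 2-tuples of alleles.

# Transitions are purine<->purine (A/G) or pyrimidine<->pyrimidine (C/T) changes.
TRANSITIONS = {frozenset("AG"), frozenset("CT")}


def ts_tv_ratio(snps):
    """Return the transition-to-transversion ratio for biallelic SNPs."""
    ts = tv = 0
    for ref, alt in snps:
        pair = frozenset((ref.upper(), alt.upper()))
        if len(pair) != 2:
            continue  # skip records where ref and alt are identical
        if pair in TRANSITIONS:
            ts += 1
        else:
            tv += 1
    return ts / tv if tv else float("nan")


def is_mendelian_consistent(father, mother, child):
    """True if the child can inherit one allele from each parent."""
    a, b = child
    return (a in father and b in mother) or (b in father and a in mother)


def mendelian_mismatch_rate(trios):
    """Fraction of loci violating Mendelian inheritance.

    `trios` yields (father, mother, child) genotypes, one per locus.
    Loci with any missing genotype (None) are not counted.
    """
    checked = mismatches = 0
    for father, mother, child in trios:
        if None in (father, mother, child):
            continue
        checked += 1
        if not is_mendelian_consistent(father, mother, child):
            mismatches += 1
    return mismatches / checked if checked else float("nan")
```

Under these assumptions, a pipeline that inflates genotyping error would be expected to show a higher mismatch rate and a Ts/Tv ratio drifting away from the value expected for the species.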
Glycosylation is a topic of intense current interest in the development of biopharmaceuticals because it is related to drug safety and efficacy. This work describes results of an interlaboratory study on the glycosylation of the Primary Sample (PS) of NISTmAb, a monoclonal antibody reference material. Seventy-six laboratories from industry, university, research, government, and hospital sectors in Europe, North America, Asia, and Australia submitted a total of 103 reports on glycan distributions. The principal objective of this study was to report and compare results for the full range of analytical methods presently used in the glycosylation analysis of mAbs. Therefore, participation was unrestricted, with laboratories choosing their own measurement techniques. Protein glycosylation was determined in various ways, including at the level of intact mAb, protein fragments, glycopeptides, or released glycans, using a wide variety of methods for derivatization, separation, identification, and quantification. Consequently, the diversity of results was enormous, with the number of glycan compositions identified by each laboratory ranging from 4 to 48. In total, 116 glycan compositions were reported, of which 57 could be assigned consensus abundance values. These consensus medians provide community-derived values for NISTmAb PS. Agreement with the consensus medians did not depend on the specific method or laboratory type. The study provides a view of the current state of the art in biologic glycosylation measurement and suggests a clear need for harmonization of glycosylation analysis methods.