XiaoFei Zhao scite author profile

Estimating the sample mean and standard deviation from commonly reported quantiles in meta-analysis

Researchers increasingly use meta-analysis to synthesize the results of several studies in order to estimate a common effect. When the outcome variable is continuous, standard meta-analytic approaches assume that the primary studies report the sample mean and standard deviation of the outcome. However, when the outcome is skewed, authors sometimes summarize the data by reporting the sample median and one or both of (i) the minimum and maximum values and (ii) the first and third quartiles, but do not report the mean or standard deviation. To include these studies in meta-analysis, several methods have been developed to estimate the sample mean and standard deviation from the reported summary data. A major limitation of these widely used methods is that they assume that the outcome distribution is normal, which is unlikely to be tenable for studies reporting medians. We propose two novel approaches to estimate the sample mean and standard deviation when data are suspected to be non-normal. Our simulation results and empirical assessments show that the proposed methods often perform better than the existing methods when applied to non-normal data.
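To make the normality limitation concrete, the following is a minimal sketch of a normal-based conversion of the kind the abstract criticizes. It is not the authors' proposed method. The function name `normal_based_mean_sd` is hypothetical, and the formulas are a large-sample approximation that assumes the outcome is normal: the mean is taken as the average of the three quartiles, and the standard deviation as the interquartile range divided by the width of the normal interquartile range in SD units (about 1.349).

```python
import math
from statistics import NormalDist


def normal_based_mean_sd(q1, median, q3):
    """Approximate (mean, sd) from the three quartiles.

    Assumption: the outcome is normally distributed and the sample is
    large. Illustrative only; not the method proposed in the paper.
    """
    # Under normality the quartiles lie z75 standard deviations
    # on either side of the mean (z75 is roughly 0.6745).
    z75 = NormalDist().inv_cdf(0.75)
    mean = (q1 + median + q3) / 3.0
    sd = (q3 - q1) / (2.0 * z75)
    return mean, sd


# Quartiles of a standard log-normal distribution (a skewed outcome).
z75 = NormalDist().inv_cdf(0.75)
skewed_q1, skewed_median, skewed_q3 = math.exp(-z75), 1.0, math.exp(z75)
est_mean, est_sd = normal_based_mean_sd(skewed_q1, skewed_median, skewed_q3)
true_mean = math.exp(0.5)  # exact mean of a standard log-normal
print(f"estimated mean {est_mean:.3f} vs true mean {true_mean:.3f}")
```

For the skewed example, the normal-based estimate falls below the true mean, which is the kind of bias that motivates methods designed for non-normal data.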

One‐sample aggregate data meta‐analysis of medians

McGrath

Zhao

Qin

et al. 2018

Statistics in Medicine

120

141

An aggregate data meta‐analysis is a statistical method that pools the summary statistics of several selected studies to estimate the outcome of interest. When considering a continuous outcome, typically each study must report the same measure of the outcome variable and its spread (eg, the sample mean and its standard error). However, some studies may instead report the median along with various measures of spread. Recently, the task of incorporating medians in meta‐analysis has been achieved by estimating the sample mean and its standard error from each study that reports a median in order to meta‐analyze the means. In this paper, we propose two alternative approaches to meta‐analyze data that instead rely on medians. We systematically compare these approaches via simulation study to each other and to methods that transform the study‐specific medians and spread into sample means and their standard errors. We demonstrate that the proposed median‐based approaches perform better than the transformation‐based approaches, especially when applied to skewed data and data with high inter‐study variance. Finally, we illustrate these approaches in a meta‐analysis of patient delay in tuberculosis diagnosis.

High-throughput identification of novel conotoxins from the Chinese tubular cone snail (Conus betulinus) by multi-transcriptome sequencing

Peng

Gao

et al. 2016

GigaSci

129

Background: The venom of predatory marine cone snails mainly contains a diverse array of unique bioactive peptides commonly referred to as conopeptides or conotoxins. These peptides have proven to be valuable pharmacological probes and potential drugs because of their high specificity and affinity to important ion channels, receptors and transporters of the nervous system. Most previous studies have focused specifically on the conopeptides from piscivorous and molluscivorous cone snails, but little attention has been devoted to the dominant vermivorous species.

Results: The vermivorous Chinese tubular cone snail, Conus betulinus, is the dominant Conus species inhabiting the South China Sea. The transcriptomes of venom ducts and venom bulbs from a variety of specimens of this species were sequenced using both next-generation sequencing and traditional Sanger sequencing technologies, resulting in the identification of a total of 215 distinct conopeptides. Among these, 183 were novel conopeptides, including nine new superfamilies. It appeared that most of the identified conopeptides were synthesized in the venom duct, while a handful of conopeptides were identified only in the venom bulb and at very low levels.

Conclusions: We identified 215 unique putative conopeptide transcripts from the combination of five transcriptomes and one EST sequencing dataset. Variation in conopeptides from different specimens of C. betulinus was observed, which suggested the presence of intraspecific variability in toxin production at the genetic level. These novel conopeptides provide a potentially fertile resource for the development of new pharmaceuticals, and a pathway for the discovery of new conotoxins.

Electronic supplementary material: The online version of this article (doi:10.1186/s13742-016-0122-9) contains supplementary material, which is available to authorized users.

BinDash, software for fast genome distance estimation on a typical personal laptop

Zhao

2018

Calling small variants using universality with Bayes-factor-adjusted odds ratios

Zhao

Wang

et al. 2021

The application of next-generation sequencing in research and particularly in clinical routine requires highly accurate variant calling. Here we describe UVC, a method for calling small variants of germline or somatic origin. By unifying opposite assumptions with sublation, we discovered the following two empirical laws to improve variant calling: allele fraction at high sequencing depth is inversely proportional to the cubic root of variant-calling error rate, and odds ratios adjusted with Bayes factors can model various sequencing biases. UVC outperformed other variant callers on the GIAB germline truth sets, 192 scenarios of in silico mixtures simulating 192 combinations of tumor/normal sequencing depths and tumor/normal purities, the GIAB somatic truth sets derived from physical mixture, and the SEQC2 somatic reference sets derived from the breast-cancer cell-line HCC1395. UVC achieved 100% concordance with the manual review conducted by multiple independent researchers on a Qiagen 71-gene-panel dataset derived from 16 patients with colon adenoma. UVC outperformed other unique molecular identifier (UMI)-aware variant callers on the datasets used for publishing these variant callers. Performance was measured with the sensitivity-specificity trade-off for called variants. The improved variant calls generated by UVC from previously published UMI-based sequencing data provided additional insight about DNA damage repair. UVC is open-sourced under the BSD 3-Clause license at https://github.com/genetronhealth/uvc and quay.io/genetronhealth/gcc-6-3-0-uvc-0-6-0-441a694

Estimating the sample mean and standard deviation from commonly reported quantiles in meta-analysis
