The Zika virus (ZIKV) has captured worldwide attention with the ongoing epidemic in South America and its link to severe birth defects, most notably microcephaly. ZIKV spreads to humans through a combination of vector and sexual transmission, but the relative contribution of these transmission routes to the overall epidemic remains largely unknown. Furthermore, a disparity in the reported number of infections between males and females has been observed. We develop a mathematical model that describes the transmission dynamics of ZIKV to determine the processes driving the observed epidemic patterns. Our model reveals a 4.8% contribution of sexual transmission to the basic reproductive number, R0. This contribution is too small to sustain an outbreak on its own, suggesting that vector transmission is the main driver of the ongoing epidemic. We also find a minor, yet statistically significant, difference in the mean number of cases between males and females, both at the peak of the epidemic and at equilibrium. While this suggests an intrinsic disparity between males and females, the difference does not account for the vastly greater number of reported cases among females, indicative of a large reporting bias. In addition, we identify conditions under which sexual transmission may play a key role in sparking an epidemic, including in temperate areas where ZIKV mosquito vectors are less prevalent.
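The two-route structure described above can be sketched with a next-generation-matrix calculation. The parameter values and the attribution rule below are hypothetical illustrations, not the paper's model or estimates:

```python
import numpy as np

# Hypothetical sub-reproduction numbers -- NOT the paper's estimates.
R_hh = 0.10   # human-to-human (sexual) transmission
R_hv = 2.0    # infections in vectors caused by one infectious human
R_vh = 1.5    # infections in humans caused by one infectious vector

# Next-generation matrix for a two-class (human, vector) model;
# vectors do not infect other vectors, so the (vector, vector) entry is 0.
K = np.array([[R_hh, R_vh],
              [R_hv, 0.0]])

R0 = max(np.linalg.eigvals(K).real)  # basic reproductive number = spectral radius

# One simple way to attribute the sexual route: compare R0 with and
# without the human-to-human term.
K_no_sex = K.copy()
K_no_sex[0, 0] = 0.0
R0_no_sex = max(np.linalg.eigvals(K_no_sex).real)
contribution = (R0 - R0_no_sex) / R0
print(f"R0 = {R0:.3f}, sexual-transmission contribution = {100 * contribution:.1f}%")
```

With these illustrative values the sexual route contributes only a few percent of R0, mirroring the qualitative conclusion above: too small to sustain an outbreak alone.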
Summary In the analysis of single-cell RNA sequencing data, researchers often characterize the variation between cells by estimating a latent variable, such as cell type or pseudotime, representing some aspect of the cell’s state. They then test each gene for association with the estimated latent variable. If the same data are used for both of these steps, then standard methods for computing p-values in the second step will fail to achieve statistical guarantees such as Type 1 error control. Furthermore, approaches such as sample splitting that can be applied to solve similar problems in other settings are not applicable in this context. In this article, we introduce count splitting, a flexible framework that allows us to carry out valid inference in this setting, for virtually any latent variable estimation technique and inference approach, under a Poisson assumption. We demonstrate the Type 1 error control and power of count splitting in a simulation study and apply count splitting to a data set of pluripotent stem cells differentiating to cardiomyocytes.
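Under the abstract's Poisson assumption, the core construction of count splitting is binomial thinning of each entry of the count matrix; the toy matrix below is illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy cells-by-genes count matrix, Poisson distributed (illustration only).
X = rng.poisson(lam=5.0, size=(200, 10))

eps = 0.5  # fraction of counts allocated to the estimation matrix

# Binomial thinning: if X_ij ~ Poisson(mu_ij), then X_train and X_test are
# independent Poisson(eps * mu_ij) and Poisson((1 - eps) * mu_ij) matrices.
X_train = rng.binomial(X, eps)
X_test = X - X_train

# X_train is used to estimate the latent variable (e.g. pseudotime);
# X_test is used to test each gene for association with that estimate,
# so the estimation and inference steps never touch the same counts.
assert (X_train + X_test == X).all()
```

Because the two matrices are independent under the Poisson model, standard p-values computed on `X_test` retain their Type 1 error guarantees even though the latent variable was estimated from the same cells.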
Background The gluten-free diet (GFD) involves the elimination of wheat and related grains. Wheat is a key fortification vehicle for nutrients such as iron and B vitamins. While there is growing evidence of low nutrient intakes and poor diet quality amongst people following a long-term GFD, few studies have used a dietary pattern approach to analyse top food sources of nutrients in today's complex food environment. Thus, the purpose of this study was to identify food sources of energy and nutrients from previously collected diet records of adults following a GFD. Methods Three 3-day food records were collected from 35 participants in a lifestyle intervention study (n = 240 records). All food items were categorised according to the Bureau of Nutritional Sciences Food Group Codes. Percentages of total dietary intakes from food groups were ranked. Results Mean intakes of dietary fibre, calcium and iron (females) were lower than recommended, with half the sample consuming below the recommended proportion of energy as carbohydrate. Meat, poultry and fish were the top source of energy (19.5%) in the diet. Gluten-free (GF) grain products were the top source of carbohydrate, fibre and iron and the second greatest source of energy. Amongst grains, breakfast/hot cereals, yeast breads, and mixed grain dishes were the greatest nutrient contributors, despite most commercial cereals and breads (65%) being unenriched. Legumes were not frequently consumed. Conclusions GF grains were the top food source of carbohydrate, fibre and iron, despite few brands being enriched or fortified. It is a challenge to assess and monitor nutrient intakes on a GFD due to the lack of nutrient composition data for B vitamins and minerals (other than iron). Dietary planning guidance for the appropriate replacement of nutrients provided by wheat is warranted.
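The ranking step in the Methods (percentages of total dietary intake by food group) can be sketched as follows; the food-record rows here are invented for illustration, not the study's data:

```python
import pandas as pd

# Hypothetical food-record rows; the real study coded each item with the
# Bureau of Nutritional Sciences Food Group Codes.
records = pd.DataFrame({
    "food_group": ["GF grain products", "Meat/poultry/fish",
                   "GF grain products", "Legumes", "Meat/poultry/fish"],
    "energy_kcal": [250, 400, 180, 90, 350],
    "fibre_g": [4.0, 0.0, 3.5, 6.0, 0.0],
})

# Percentage of total intake contributed by each food group, ranked by energy.
pct = (records.groupby("food_group")[["energy_kcal", "fibre_g"]].sum()
       / records[["energy_kcal", "fibre_g"]].sum() * 100)
print(pct.sort_values("energy_kcal", ascending=False).round(1))
```

The same group-sum-and-normalise pattern extends to any nutrient column, which is how a single set of coded records yields the separate energy, carbohydrate, fibre, and iron rankings reported above.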
We consider conducting inference on the output of the Classification and Regression Tree (CART) [Breiman et al., 1984] algorithm. A naive approach to inference that does not account for the fact that the tree was estimated from the data will not achieve standard guarantees, such as Type 1 error rate control and nominal coverage. Thus, we propose a selective inference framework for conducting inference on a fitted CART tree. In a nutshell, we condition on the fact that the tree was estimated from the data. We propose a test for the difference in the mean response between a pair of terminal nodes that controls the selective Type 1 error rate, and a confidence interval for the mean response within a single terminal node that attains the nominal selective coverage. Efficient algorithms for computing the necessary conditioning sets are provided. We apply these methods in simulation and to a dataset involving the association between portion control interventions and caloric intake.
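The naive approach this abstract warns against can be made concrete. The sketch below fits a CART-style tree (scikit-learn's implementation, used here as a stand-in) to pure-noise data and then runs an ordinary t-test between two terminal nodes the tree itself selected; the paper's selective test would replace this t-test with one that conditions on the fitted tree:

```python
import numpy as np
from scipy import stats
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(1)

# Pure-noise data: y has no true relationship with X.
X = rng.normal(size=(200, 2))
y = rng.normal(size=200)

tree = DecisionTreeRegressor(max_depth=2, min_samples_leaf=20).fit(X, y)
leaves = tree.apply(X)  # terminal-node id for each observation

# Naive test: compare the mean response in two terminal nodes that the
# tree itself chose -- this double use of the data inflates Type 1 error.
ids = np.unique(leaves)
a, b = y[leaves == ids[0]], y[leaves == ids[1]]
t, p = stats.ttest_ind(a, b)
print(f"naive p-value = {p:.4f}")
```

Repeating this over many noise draws, the naive p-values concentrate well below uniform, which is exactly the failure of Type 1 error control that motivates conditioning on the tree.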
Material- and cell-based technologies such as engineered tissues hold great promise as human therapies. Yet, the development of many of these technologies becomes stalled at the stage of pre-clinical animal studies due to the tedious and low-throughput nature of in vivo implantation experiments. We introduce a plug-and-play in vivo screening array platform called Highly Parallel Tissue Grafting (HPTG). HPTG enables parallelized in vivo screening of 43 three-dimensional microtissues within a single 3D printed device. Using HPTG, we screen microtissue formulations with varying cellular and material components and identify formulations that support vascular self-assembly, integration and tissue function. Our studies highlight the importance of combinatorial studies that vary cellular and material formulation variables concomitantly, by revealing that inclusion of stromal cells can rescue vascular self-assembly in a manner that is material-dependent. HPTG provides a route for accelerating pre-clinical progress for diverse medical applications including tissue therapy, cancer biomedicine, and regenerative medicine.
Our goal is to develop a general strategy to decompose a random variable X into multiple independent random variables, without sacrificing any information about unknown parameters. A recent paper showed that for some well-known natural exponential families, X can be thinned into independent random variables $X^{(1)}, \ldots, X^{(K)}$, such that $X = \sum_{k=1}^{K} X^{(k)}$. In this paper, we generalize their procedure by relaxing this summation requirement and simply asking that some known function of the independent random variables exactly reconstruct X. This generalization of the procedure serves two purposes. First, it greatly expands the families of distributions for which thinning can be performed. Second, it unifies sample splitting and data thinning, which on the surface seem to be very different, as applications of the same principle. This shared principle is sufficiency. We use this insight to perform generalized thinning operations for a diverse set of families.
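The summation case described above is easy to illustrate for the Poisson family, where thinning amounts to a multinomial split of each count:

```python
import numpy as np

rng = np.random.default_rng(2)

# Poisson thinning: if X ~ Poisson(lam), splitting its counts multinomially
# with probabilities (1/K, ..., 1/K) yields K mutually independent
# Poisson(lam / K) draws whose sum reconstructs X exactly.
lam, K = 10.0, 3
X = rng.poisson(lam, size=1000)

folds = np.array([rng.multinomial(x, [1.0 / K] * K) for x in X])  # shape (1000, K)

assert (folds.sum(axis=1) == X).all()  # X is a known function (the sum) of the folds
# Each column is marginally Poisson(lam / K); check the empirical means.
print(folds.mean(axis=0))
```

Here the "known function" reconstructing X is the sum; the generalization in the paper allows other reconstruction functions, which is what extends thinning beyond convolution-closed families like the Poisson.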
We argue that algorithmic models, though powerful and appropriate in some circumstances, rely on just as many tenuous assumptions as parametric probabilistic models; these assumptions, their violations, and the ethical consequences of these violations are simply obscured within a black box. We advocate for a future in which statisticians play a central role in bridging the gap between Breiman's two cultures.