Ricardo Cao scite author profile

A completely nonparametric method for the estimation of mixture cure models is proposed in this paper. The nonparametric estimator of the incidence introduced by Xu and Peng (2014) is extensively studied and a nonparametric estimator of the latency is presented. These estimators, which are based on the Beran estimator of the conditional survival function, are proved to be the local maximum likelihood estimators. An iid representation is obtained for the nonparametric incidence estimator. As a consequence, an asymptotically optimal bandwidth is found. Moreover, a bootstrap bandwidth selection method for the nonparametric incidence estimator is proposed. The introduced nonparametric estimators are compared with existing semiparametric approaches in a simulation study, in which the performance of the bootstrap bandwidth selector is also assessed. Finally, the presented method is applied to a database of colorectal cancer from the University Hospital of A Coruña (CHUAC).


Bootstrapping the Mean Integrated Squared Error

Cao

1993

Journal of Multivariate Analysis


Evaluating the Ability of Tree‐Based Methods and Logistic Regression for the Detection of SNP‐SNP Interaction

García-Magariños

López-de-Ullibarri

Cao

et al. 2009

Annals of Human Genetics


Summary: Most common human diseases are likely to have complex etiologies. Methods of analysis that allow for the phenomenon of epistasis are of growing interest in the genetic dissection of complex diseases. By allowing for epistatic interactions between potential disease loci, we may succeed in identifying genetic variants that might otherwise have remained undetected. Here we aimed to analyze the ability of logistic regression (LR) and two tree-based supervised learning methods, classification and regression trees (CART) and random forest (RF), to detect epistasis. Multifactor-dimensionality reduction (MDR) was also used for comparison. Our approach involves first the simulation of datasets of autosomal biallelic unphased and unlinked single nucleotide polymorphisms (SNPs), each containing a two-loci interaction (causal SNPs) and 98 'noise' SNPs. We modelled interactions under different scenarios of sample size, missing data, minor allele frequencies (MAF) and several penetrance models: three involving both (indistinguishable) marginal effects and interaction, and two simulating pure interaction effects. In total, we have simulated 99 different scenarios. Although CART, RF, and LR yield similar results in terms of detection of true association, CART and RF perform better than LR with respect to classification error. MAF, penetrance model, and sample size are greater determining factors than percentage of missing data in the ability of the different techniques to detect true association. In pure interaction models, only RF detects association. In conclusion, tree-based methods and LR are important statistical tools for the detection of unknown interactions among true risk-associated SNPs with marginal effects and in the presence of a significant number of noise SNPs. In pure interaction models, RF performs reasonably well in the presence of large sample sizes and low percentages of missing data. However, when the study design is suboptimal (unfavourable to detect interaction in terms of e.g. sample size and MAF) there is a high chance of detecting false, spurious associations.


Forecasting next-day electricity demand and price using nonparametric functional methods

Vilar

Cao

Aneiros

2012

International Journal of Electrical Power & Energy Systems

121



Ricardo Cao

A comparative study of several smoothing methods in density estimation

Nonparametric incidence estimation and bootstrap bandwidth selection in mixture cure models
