Juan Carlos Correa scite author profile

The literature on GWAS (genome-wide association studies) data suggests that very large sample sizes (for example, 50,000 cases and 50,000 controls) may be required to detect significant associations of genomic regions for complex disorders such as Alzheimer's disease (AD). Because of the challenges of obtaining such large cohorts, we describe here a novel sequential strategy that combines pooling of DNA and bootstrapping (pbGWAS) in order to significantly increase the statistical power and exponentially reduce expenses. We applied this method to a very homogeneous sample of patients belonging to a unique and clinically well-characterized multigenerational pedigree with one of the most severe forms of early onset AD, carrying the PSEN1 p.Glu280Ala mutation (often referred to as E280A mutation), which originated as a consequence of a founder effect. In this cohort, we identified novel loci genome-wide significantly associated as modifiers of the age of onset of AD (CD44, rs187116, P = 1.29 × 10⁻¹²; NPHP1, rs10173717, P = 1.74 × 10⁻¹²; CADPS2, rs3757536, P = 1.54 × 10⁻¹⁰; GREM2, rs12129547, P = 1.69 × 10⁻¹³, among others) as well as other loci known to be associated with AD. Regions identified by pbGWAS were confirmed by subsequent individual genotyping. The pbGWAS methodology and the genes it targeted could provide important insights in determining the genetic causes of AD and other complex conditions.


A new estimator of entropy

Correa

1995

Communications in Statistics - Theory and Methods

100


A machine learning approach to big data regression analysis of real estate prices for inferential and predictive purposes

Rave

Correa

Echavarría

2019

Journal of Property Research


A new approach to the Box–Cox transformation

Vélez

Correa

Marmolejo‐Ramos

2015

Front. Appl. Math. Stat.


We propose a new methodology to estimate λ, the parameter of the Box-Cox transformation, as well as an alternative method to determine plausible values for it. The former is accomplished by defining a grid of values for λ and then performing a normality test on the λ-transformed data. The optimum value of λ, say λ*, is the one for which the p-value from the normality test is highest. The set of plausible values is determined using the inverse probability method after plotting the p-values against the values of λ on the grid. Our methodology is illustrated with two real-world data sets. Furthermore, a simulation study suggests that our method improves the symmetry, kurtosis and, hence, the normality of data, making it a feasible alternative to the traditional Box-Cox transformation.
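The grid search described in this abstract can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' implementation: the abstract does not name the normality test, so the sketch uses the Jarque-Bera test because its asymptotic p-value has a closed form, and the function names, the grid, and the simulated sample are all hypothetical. The step that derives the set of plausible values is not covered.

```python
import math
import random


def box_cox(xs, lam):
    """Box-Cox transform of strictly positive data for a given lambda."""
    if abs(lam) < 1e-12:
        return [math.log(x) for x in xs]
    return [(x ** lam - 1.0) / lam for x in xs]


def jarque_bera_pvalue(xs):
    """Asymptotic p-value of the Jarque-Bera normality test.

    Assumption: this test stands in for the unspecified normality test.
    """
    n = len(xs)
    mean = sum(xs) / n
    m2 = sum((x - mean) ** 2 for x in xs) / n
    m3 = sum((x - mean) ** 3 for x in xs) / n
    m4 = sum((x - mean) ** 4 for x in xs) / n
    skew = m3 / m2 ** 1.5
    kurt = m4 / m2 ** 2
    jb = n / 6.0 * (skew ** 2 + (kurt - 3.0) ** 2 / 4.0)
    # Survival function of a chi-square distribution with 2 degrees of freedom.
    return math.exp(-jb / 2.0)


def best_lambda(xs, grid):
    """Return (lambda, p-value) for the grid point with the highest p-value."""
    scored = [(lam, jarque_bera_pvalue(box_cox(xs, lam))) for lam in grid]
    return max(scored, key=lambda pair: pair[1])


# Hypothetical example: simulated log-normal data and a grid from -2 to 2.
rng = random.Random(0)
sample = [math.exp(rng.gauss(0.0, 0.5)) for _ in range(500)]
grid = [i / 10.0 for i in range(-20, 21)]
lam_star, p_star = best_lambda(sample, grid)
```

Because λ = 1 is on the grid, the selected p-value can never be lower than the one obtained for the untransformed data (up to a shift), which is a convenient sanity check when swapping in a different normality test.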


Manipulating the alpha level cannot cure significance testing

Trafimow

Amrhein

Areshenkoff

et al. 2018

Preprint


We argue that making accept/reject decisions on scientific hypotheses, including a recent call for changing the canonical alpha level from p= .05 to .005, is deleterious for the finding of new discoveries and the progress of science. Given that blanket and variable alpha levels both are problematic, it is sensible to dispense with significance testing altogether. There are alternatives that address study design and sample size much more directly than significance testing does; but none of the statistical tools should be taken as the new magic method giving clear-cut mechanical answers. Inference should not be based on single studies at all, but on cumulative evidence from multiple independent studies. When evaluating the strength of the evidence, we should consider, for example, auxiliary assumptions, the strength of the experimental design, and implications for applications. To boil all this down to a binary decision based on a p-value threshold of .05, .01, .005, or anything else, is not acceptable.


A New Method for Detecting Significant p-values with Applications to Genetic Data

Vélez

Correa

Arcos‐Burgos

2014

Rev. colomb. estad.


A new method for detecting significant p-values is described in this paper. This method, based on the distribution of the m-th order statistic of a U(0, 1) distribution, is shown to be suitable in applications where m → ∞ independent hypotheses are tested and it is of interest, for a fixed type I error probability, to determine which of them are significant while controlling the false positives. Equivalences and comparisons between our method and other methods based on p-values are also established, and a graphical representation of the distribution of the test statistic is depicted for different values of m. Finally, our proposal is illustrated with two microarray data sets.

Key words: Extreme value theory, p-value, Type I error probability, Multiple testing, Genetic data.

Authors: Jorge Iván Vélez (Ph.D. Scholar, jorge.velez@anu.edu.au), Juan Carlos Correa (Associate professor, jccorrea@unal.edu.co), Mauricio Arcos-Burgos (Associate professor, mauricio.arcos-burgos@anu.edu.au).
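The abstract above does not give the test statistic, so the paper's method cannot be reproduced here. As a related, well-known illustration of how order statistics of independent U(0, 1) p-values yield a significance threshold, the sketch below uses the Šidák correction, which follows from the distribution of the smallest of m uniform p-values: P(min p ≤ t) = 1 − (1 − t)^m. This is a swapped-in classical technique, not the authors' proposal, and the function names are hypothetical.

```python
import math


def sidak_threshold(alpha, m):
    """Per-test threshold t solving 1 - (1 - t)**m == alpha.

    Derived from the minimum of m independent U(0, 1) p-values.
    expm1/log1p keep the result accurate when m is very large.
    """
    return -math.expm1(math.log1p(-alpha) / m)


def significant(pvalues, alpha=0.05):
    """Return the p-values at or below the Sidak threshold for len(pvalues) tests."""
    threshold = sidak_threshold(alpha, len(pvalues))
    return [p for p in pvalues if p <= threshold]


# Hypothetical p-values from four independent tests.
hits = significant([0.001, 0.02, 0.5, 0.9])
```

The threshold always lies between the Bonferroni value alpha / m and alpha itself, so it is slightly less conservative than Bonferroni while still controlling the family-wise type I error under independence.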


Should we think of a different Median estimator?

Vélez

Correa

2014

RevComEst


The median, one of the most popular measures of central tendency and widely used in statistical practice, is often described as the numerical value separating the higher half of the sample from the lower half. Despite its popularity and applications, many people are not aware that several formulas exist to estimate this parameter. We present the results of a simulation study comparing the classic and the Harrell & Davis (1982) estimators of the median for seven continuous statistical distributions. It is shown that, relative to the latter, the classic estimator performs poorly when the sample size is small. Based on these results, we strongly believe that the use of a better estimator of the median must be promoted.
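To make the comparison concrete, here is a minimal sketch of the two estimators. The abstract gives no formulas, so the code relies on the standard definition of the Harrell-Davis median: a weighted average of all order statistics, where the weight of the i-th one is the probability that a Beta((n+1)/2, (n+1)/2) variable falls in ((i−1)/n, i/n]. The weights are approximated with Simpson's rule and then normalised, which is a simplification chosen to avoid external libraries; the function name and the example data are hypothetical.

```python
import statistics


def harrell_davis_median(xs, steps=200):
    """Harrell-Davis estimate of the median (steps must be even)."""
    n = len(xs)
    ordered = sorted(xs)
    # The Beta((n+1)/2, (n+1)/2) density is proportional to (t * (1 - t)) ** power.
    power = (n - 1) / 2.0

    def kernel(t):
        # Rescaled so the peak at t = 0.5 equals 1; clamped against rounding error.
        return max(0.0, 4.0 * t * (1.0 - t)) ** power

    def simpson(lo, hi):
        h = (hi - lo) / steps
        total = kernel(lo) + kernel(hi)
        for k in range(1, steps):
            total += (4 if k % 2 else 2) * kernel(lo + k * h)
        return total * h / 3.0

    weights = [simpson((i - 1) / n, i / n) for i in range(1, n + 1)]
    scale = sum(weights)  # normalising removes the need for the Beta constant
    return sum(w / scale * x for w, x in zip(weights, ordered))


# Hypothetical small sample with one extreme value.
data = [1, 2, 3, 100]
classic = statistics.median(data)      # average of the two middle values
smoothed = harrell_davis_median(data)  # uses every order statistic
```

Unlike the classic estimator, which depends on one or two central observations, the Harrell-Davis estimate draws on the whole sample, which is the usual explanation for its better behaviour at small sample sizes but also makes it more sensitive to extreme values.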


Use of control charts with regression analysis for autocorrelated data in the context of logistic financial budgeting

Rave

Muñoz-Giraldo

Correa

2017

Computers & Industrial Engineering


