On multiple‐testing correction in genome‐wide association studies

The interpretation of the results of large association studies encompassing much or all of the human genome faces the fundamental statistical problem that a correspondingly large number of single nucleotide polymorphism markers will be spuriously flagged as significant. A common method of dealing with these false positives is to raise the significance level for the individual tests for association of each marker. Any such adjustment for multiple testing is ultimately based on a more or less precise estimate for the actual overall type I error probability. We estimate this probability for association tests for correlated markers and show that it depends in a nonlinear way on the significance level for the individual tests. This dependence of the effective number of tests is not taken into account by existing multiple-testing corrections, leading to widely overestimated results. We demonstrate a simple correction for multiple testing, which can easily be calculated from the pairwise correlation and gives far more realistic estimates for the effective number of tests than previous formulae. The calculation is considerably faster than with other methods and hence applicable on a genome-wide scale. The efficacy of our method is shown on a constructed example with highly correlated markers as well as on real data sets, including a full genome scan where a conservative estimate only 8% above the permutation estimate is obtained in about 1% of computation time. As the calculation is based on pairwise correlations between markers, it can be performed at the stage of study design using public databases.

INTRODUCTION

Case-control studies to identify single nucleotide polymorphism (SNP) markers associated with a disease are a commonly used method to pinpoint genes that may play a central role in understanding the genetic background of complex diseases. The availability of efficient and reliable genotyping techniques has made it possible to extend the scope of association studies to encompass the whole genome, cf. the recent [WTCCC, 2007] project.

A fundamental difficulty in the interpretation of the results of such large-scale association studies is presented by the following simple fact of statistical theory, also known as the multiple-testing problem. When a number N of statistical tests are performed, each of which has a type I error probability α, the expected number of (false) significant findings, assuming the null hypothesis in each test, is equal to Nα, irrespective of whether the tests are statistically independent or not, and whether they test the same or different hypotheses. As a consequence, in a large-scale association study testing a large number of SNP marker loci, any true association result will be accompanied and obscured by a correspondingly large number of spurious associations.

A widely accepted approach to deal with this problem is a multiple-testing correction, adjusting the significance level for each test to a value α such that the overall type I error for the study, i.e. the probability P...
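The Nα argument in the introduction can be checked numerically. The following is a minimal sketch, not the correction proposed in the paper: it computes the expected number of false positives for N tests at per-test level α, together with the standard Bonferroni and Šidák per-test thresholds. Both thresholds treat the effective number of tests as equal to N, and the Šidák formula additionally assumes independent tests, which is exactly the assumption that correlated markers violate. The function names and the example figure of 500,000 markers are illustrative assumptions.

```python
import math


def expected_false_positives(n_tests, alpha):
    """Expected number of false positives under the global null: N * alpha.

    This holds whether or not the tests are independent (linearity of expectation).
    """
    return n_tests * alpha


def bonferroni_threshold(n_tests, overall_alpha):
    """Per-test level that bounds the overall type I error by overall_alpha."""
    return overall_alpha / n_tests


def sidak_threshold(n_tests, overall_alpha):
    """Per-test level giving exactly overall_alpha for INDEPENDENT tests.

    Computed as 1 - (1 - overall_alpha)**(1/N), using log1p/expm1 for accuracy.
    """
    return -math.expm1(math.log1p(-overall_alpha) / n_tests)


def familywise_error_independent(n_tests, alpha):
    """Overall type I error 1 - (1 - alpha)**N, assuming independent tests."""
    return -math.expm1(n_tests * math.log1p(-alpha))


if __name__ == "__main__":
    n = 500_000  # hypothetical number of SNP markers
    print(expected_false_positives(n, 0.05))
    print(bonferroni_threshold(n, 0.05))
    print(sidak_threshold(n, 0.05))
```

With correlated markers the effective number of tests is smaller than N, so both thresholds are stricter than necessary; that gap is what the abstract's correction aims to estimate.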


Design of Case‐controls Studies with Unscreened Controls

Moskvina, Holmans, Schmidt, et al. 2005. Annals of Human Genetics


Summary

Traditionally in genetic case-control studies, controls have been screened to exclude subjects with a personal history of illness. This control group has the advantage of optimal power to detect loci involved in illness, but requires more work and may incur substantial cost in recruitment. An alternative approach is to use unscreened controls sampled from the general population. Such controls are generally plentiful and inexpensive, but there is a risk that some may have the same disease as the cases, which will reduce power to detect associations. We have quantified the extent of this power loss, and produced mathematical formulae for the number of unscreened controls necessary to achieve the same power as a fixed sample of screened controls. The effect of using unscreened controls will also depend on the ratio of the number of screened controls to cases specified in the original study design, and this is also investigated. We have also investigated the cost-benefits of the screened and unscreened approaches, according to variation in the relative costs of sampling screened and unscreened controls, together with genotyping costs. We have thus identified the range of situations in which using unscreened controls is a cost-effective alternative to the screened control method and could be considered when designing a study. In many of the typical, real-world situations in complex genetics, the use of unscreened controls is potentially cost-effective and can, in general, be considered for disorders with population prevalence K_p < 0.2. With the steady reduction in genotyping costs and the availability of common sets of "population controls", this design is likely to become increasingly cost-effective.


Critical Coupling Constants and Eigenvalue Asymptotics of Perturbed Periodic Sturm-Liouville Operators

Schmidt 2000. Communications in Mathematical Physics


A Remark on Boundary Value Problems for the Dirac Operator

Schmidt 1995. Q J Math


Periodic Differential Operators

Brown, Eastham, Schmidt 2013


POLARIS: Polygenic LD‐adjusted risk score approach for set‐based analysis of GWAS data

Baker, Schmidt, Sims, et al. 2018. Genetic Epidemiology


Polygenic risk scores (PRSs) are a method to summarize the additive trait variance captured by a set of SNPs, and can increase the power of set‐based analyses by leveraging public genome‐wide association study (GWAS) datasets. PRS aims to assess the genetic liability to some phenotype on the basis of polygenic risk for the same or different phenotype estimated from independent data. We propose the application of PRSs as a set‐based method with an additional component of adjustment for linkage disequilibrium (LD), with potential extension of the PRS approach to analyze biologically meaningful SNP sets. We call this method POLARIS: POlygenic Ld‐Adjusted RIsk Score. POLARIS identifies the LD structure of SNPs using spectral decomposition of the SNP correlation matrix and replaces the individuals' SNP allele counts with LD‐adjusted dosages. Using a raw genotype dataset together with SNP effect sizes from a second independent dataset, POLARIS can be used for set‐based analysis. MAGMA is an alternative set‐based approach employing principal component analysis to account for LD between markers in a raw genotype dataset. We used simulations, both with simple constructed and real LD‐structure, to compare the power of these methods. POLARIS shows more power than MAGMA applied to the raw genotype dataset only, but less or comparable power to combined analysis of both datasets. POLARIS has the advantages that it produces a risk score per person per set using all available SNPs, and aims to increase power by leveraging the effect sizes from the discovery set in a self‐contained test of association in the test dataset.
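As an illustration of the basic quantity the abstract builds on, the sketch below computes a plain, unadjusted polygenic risk score per person as the effect-size-weighted sum of SNP allele counts. It is not POLARIS: the LD adjustment via spectral decomposition of the SNP correlation matrix is not reproduced here, because the abstract does not give its formula. The function name and the toy numbers are illustrative assumptions.

```python
def polygenic_risk_scores(genotypes, effect_sizes):
    """Unadjusted PRS: for each person, sum over SNPs of effect size * allele count.

    genotypes    -- list of rows, one per person; each row holds allele counts
                    (0, 1 or 2, or fractional dosages) for the SNPs in the set
    effect_sizes -- per-SNP effect sizes from an independent discovery dataset
    """
    scores = []
    for row in genotypes:
        if len(row) != len(effect_sizes):
            raise ValueError("each genotype row needs one entry per effect size")
        scores.append(sum(beta * g for beta, g in zip(effect_sizes, row)))
    return scores


if __name__ == "__main__":
    # Toy data (hypothetical): two people, three SNPs.
    genotypes = [[0, 1, 2], [2, 0, 1]]
    effect_sizes = [0.5, -0.25, 1.0]
    print(polygenic_risk_scores(genotypes, effect_sizes))
```

In the approach the abstract describes, the raw allele counts in `genotypes` would be replaced by LD-adjusted dosages before the weighted sum is taken.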


Absolutely continuous spectrum of Dirac systems with potentials infinite at infinity

Schmidt 1997. Math. Proc. Camb. Phil. Soc.


It is shown that the spectrum of a one-dimensional Dirac operator with a potential q tending to infinity at infinity, and such that the positive variation of 1/q is bounded, covers the whole real line and is purely absolutely continuous. An example is given to show that in general, pure absolute continuity is lost if the condition on the positive variation is dropped. The appendix contains a direct proof for the special case of subordinacy theory used.


Predictive modeling of schizophrenia from genomic data: Comparison of polygenic risk score with kernel support vector machines approach

Vivian-Griffiths, Baker, Schmidt, et al. 2018. American J of Med Genetics Pt B


A major controversy in psychiatric genetics is whether nonadditive genetic interaction effects contribute to the risk of highly polygenic disorders. We applied a support vector machines (SVMs) approach, which is capable of building linear and nonlinear models using kernel methods, to distinguish cases from controls in a large schizophrenia case–control sample of 11,853 subjects (5,554 cases and 6,299 controls) and compared its prediction accuracy with the polygenic risk score (PRS) approach. We also investigated whether SVMs are a suitable approach to detecting nonlinear genetic effects, that is, interactions. We found that PRS provided more accurate case/control classification than either linear or nonlinear SVMs, and we give a tentative explanation of why PRS outperforms both multivariate regression and linear kernel SVMs. In addition, we observed that nonlinear kernel SVMs showed higher classification accuracy than linear SVMs when a large number of SNPs were entered into the model. We conclude that SVMs are a potential tool for assessing the presence of interactions, prior to searching for them explicitly.


Karl Michael Schmidt

On multiple‐testing correction in genome‐wide association studies
