Studies using haplotypes of multiple tightly linked markers are more informative than those using a single marker. However, studies based on multimarker haplotypes have some difficulties. First, if we consider each haplotype as an allele and use the conventional single-marker transmission/disequilibrium test (TDT), then the rapid increase in the degrees of freedom with an increasing number of markers means that the statistical power of the conventional tests will be low. Second, the parental haplotypes cannot always be unambiguously reconstructed. In the present article, we propose a haplotype-sharing TDT (HS-TDT) for linkage or association between a disease-susceptibility locus and a chromosome region in which several tightly linked markers have been typed. This method is applicable to both quantitative traits and qualitative traits. It is applicable to any size of nuclear family, with or without ambiguous phase information, and it is applicable to any number of alleles at each of the markers. The degrees of freedom (in a broad sense) of the test increase linearly as the number of markers considered increases but do not increase as the number of alleles at the markers increases. Our simulation results show that the HS-TDT has the correct type I error rate in structured populations and that, in most cases, the power of HS-TDT is higher than the power of the existing single-marker TDTs and haplotype-based TDTs.
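For orientation, the conventional single-marker TDT that the abstract uses as its baseline can be sketched in a few lines. This is not the HS-TDT proposed above. It is the classic biallelic test, which compares how often heterozygous parents transmit versus do not transmit a given allele to affected offspring, using a McNemar-type statistic with one degree of freedom. The function name `tdt_statistic` and the example counts are illustrative assumptions, not taken from the article.

```python
import math


def tdt_statistic(transmitted: int, untransmitted: int):
    """Classic biallelic TDT (McNemar-type) statistic and its p-value.

    transmitted   -- times heterozygous parents passed the allele of interest
                     to an affected child (hypothetical input counts)
    untransmitted -- times heterozygous parents passed the other allele
    """
    n = transmitted + untransmitted
    if n == 0:
        raise ValueError("no informative (heterozygous) parental transmissions")
    chi2 = (transmitted - untransmitted) ** 2 / n
    # Survival function of a chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Illustrative counts only: 60 transmissions versus 40 non-transmissions.
chi2, p_value = tdt_statistic(60, 40)
print(f"chi2 = {chi2:.2f}, p = {p_value:.4f}")
```

Treating every multimarker haplotype as a separate allele turns this into a multi-allele test whose degrees of freedom grow with the number of distinct haplotypes, which is the power problem the abstract describes.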
Cancer risk is determined by a complex interplay of genetic and environmental factors. Genome-wide association studies (GWAS) have identified hundreds of common (minor allele frequency [MAF]>0.05) and less common (0.01
Background: Constructing coexpression networks and performing network analysis using large-scale gene expression data sets is an effective way to uncover new biological knowledge; however, the methods used for gene association in constructing these coexpression networks have not been thoroughly evaluated. Since different methods lead to structurally different coexpression networks and provide different information, selecting the optimal gene association method is critical. Methods and Results: In this study, we compared eight gene association methods (Spearman rank correlation, Weighted Rank Correlation, Kendall, Hoeffding's D measure, Theil-Sen, Rank Theil-Sen, Distance Covariance, and Pearson) and focused on their true knowledge discovery rates in associating pathway genes and in constructing coordination networks of regulatory genes. We also examined how the different methods behave on microarray data with different properties, and whether the biological processes affect the efficiency of different methods. Conclusions: We found that the Spearman, Hoeffding and Kendall methods are effective in identifying coexpressed pathway genes, whereas the Theil-Sen, Rank Theil-Sen, Spearman, and Weighted Rank methods perform well in identifying coordinated transcription factors that control the same biological processes and traits. Surprisingly, the widely used Pearson method is generally less efficient, and so is the Distance Covariance method, which can find gene pairs with multiple types of relationships. Some of our analyses clearly show that the Pearson and Distance Covariance methods behave distinctly from the other six methods. The efficiencies of the different methods vary with the data properties to some degree and are largely contingent upon the biological processes, which necessitates a pre-analysis to identify the best-performing method for gene association and coexpression network construction.
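To illustrate why the choice of association measure can matter, the sketch below compares three of the eight methods named above (Pearson, Spearman, and Kendall) on a toy pair of expression profiles. The profiles `gene_a` and `gene_b` are invented for illustration and are not data from the study. The Kendall variant shown is tau-a, which assumes no tied values. The functions are plain-Python reimplementations for clarity, not the code used by the authors.

```python
import math
from itertools import combinations


def pearson(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def ranks(values):
    """1-based ranks; tied values share the mean of their rank positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


def kendall_tau_a(x, y):
    """Kendall tau-a: (concordant - discordant) pairs over all pairs."""
    concordant = discordant = 0
    for i, j in combinations(range(len(x)), 2):
        sign = (x[i] - x[j]) * (y[i] - y[j])
        if sign > 0:
            concordant += 1
        elif sign < 0:
            discordant += 1
    return (concordant - discordant) / (len(x) * (len(x) - 1) // 2)


# Hypothetical profiles: gene_b rises monotonically but nonlinearly with gene_a.
gene_a = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
gene_b = [math.exp(v) for v in gene_a]

print("Pearson :", round(pearson(gene_a, gene_b), 3))
print("Spearman:", round(spearman(gene_a, gene_b), 3))
print("Kendall :", round(kendall_tau_a(gene_a, gene_b), 3))
```

On this toy pair the two rank-based measures report a perfect monotonic association, while Pearson falls noticeably below 1 because the relationship is not linear. This is only one mechanism by which the methods can disagree and does not reproduce the study's evaluation.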