Over 100 genetic loci harbor schizophrenia-associated variants, yet how these common variants confer risk is uncertain. The CommonMind Consortium has sequenced dorsolateral prefrontal cortex RNA from schizophrenia cases (n=258) and control subjects (n=279), creating the largest publicly available resource to date of gene expression and its genetic regulation, ~5 times larger than the latest release of GTEx. Using this resource, we find that ~20% of the schizophrenia risk loci have common variants that could explain regulation of brain gene expression. In five loci, these variants modulate expression of a single gene: FURIN, TSNARE1, CNTN4, CLCN3 or SNAP91. Experimentally altered expression of three of them, FURIN, TSNARE1, and CNTN4, perturbs the proliferation and apoptotic index of neural progenitors and leads to neuroanatomical deficits in zebrafish. Furthermore, shRNA-mediated knock-down of FURIN in neural progenitor cells derived from human induced pluripotent stem cells produces abnormal neural migration. Although 4.2% of genes (N = 693) display significant differential expression between cases and controls, 44% show some evidence for differential expression. All fold changes are ≤ 1.33, and an independent cohort yields similar differential expression for these 693 genes (r = 0.58). These findings are consistent with schizophrenia being highly polygenic, as has been reported in investigations of common and rare genetic variation. Co-expression analyses identify a gene module that shows enrichment for genetic associations and is thus relevant for schizophrenia. Taken together, these results pave the way for mechanistic interpretations of genetic liability for schizophrenia and other brain diseases.
The human brain is complicated and not well understood. Seemingly straightforward fundamental information, such as which genes are expressed therein and what functions they perform, is only partially characterized. To overcome these obstacles, we established the CommonMind Consortium (CMC; www.synapse.org/CMC), a public-private partnership to generate functional genomic data in brain samples obtained from autopsies of cases with and without severe psychiatric disorders. The CMC is the largest existing collection of collaborating brain banks and includes over 1,150 samples. A wide spectrum of data is being generated on these samples including regional gene expression, epigenomics (cell-type specific histone modifications and open chromatin), whole genome sequencing, and somatic mosaicism.
Schizophrenia (SCZ), affecting roughly 0.7% of adults, is a severe psychiatric disorder characterized by abnormalities in thought and cognition (1). Despite a century of evidence establishing its genetic basis, only recently have specific genetic risk factors been conclusively identified, including rare copy number variants (2) and >100 common variants (3). However, there is not a one-to-one Mendelian mapping between these SCZ ris...
Alzheimer's disease (AD) is a complex and heterogenous brain disease that affects multiple inter-related biological processes. This complexity contributes, in part, to existing difficulties in the identification of successful disease-modifying therapeutic strategies. To address this, systems approaches are being used to characterize AD-related disruption in molecular state. To evaluate the consistency across these molecular models, a consensus atlas of the human brain transcriptome was developed through coexpression meta-analysis across the AMP-AD consortium. Consensus analysis was performed across five coexpression methods used to analyze RNA-seq data collected from 2114 samples across 7 brain regions and 3 research studies. From this analysis, five consensus clusters were identified that described the major sources of AD-related alterations in transcriptional state that were consistent across studies, methods, and samples. AD genetic associations, previously studied AD-related biological processes, and AD targets under active investigation were enriched in only three of these five clusters. The remaining two clusters demonstrated strong heterogeneity between males and females in AD-related expression that was consistently observed across studies. AD transcriptional modules identified by systems analysis of individual AMP-AD teams were all represented in one of these five consensus clusters except ROS/MAP-identified Module 109, which was specific for genes that showed the strongest association with changes in AD-related gene expression across consensus clusters. The other two AMP-AD transcriptional analyses reported modules that were enriched in one of the two sex-specific Consensus Clusters. The fifth cluster has not been previously identified and was enriched for genes related to proteostasis. This study provides an atlas to map across biological inquiries of AD with the goal of supporting an expansion in AD target discovery efforts.
Remote health assessments that gather real-world data (RWD) outside of clinic settings require a clear understanding of appropriate methods for data collection, quality assessment, analysis and interpretation. Here, we examine the performance and limitations of smartphones in collecting RWD in the remote mPower observational study of Parkinson's Disease (PD). Within the first six months of study commencement, 960 participants had enrolled and performed at least 5 self-administered active PD symptom assessments (speeded tapping, gait/balance, phonation, or memory). Task performance, especially speeded tapping, was predictive of self-reported PD status (AUC = 0.8) and correlated with in-clinic evaluation of disease severity (r = 0.71; p < 1.8×10⁻⁶ when compared with motor MDS-UPDRS). Although remote assessment requires careful consideration for accurate interpretation of RWD, our results support the use of smartphones and wearables in objective and personalized disease assessments.
RWD offers the opportunity to improve our understanding and management of health and disease outside of the clinical setting. 1 An increasingly popular method for collecting RWD is the use of remote digital assessments that allow frequent sampling and have been used to aid in the diagnosis, treatment, and monitoring of multiple conditions, including atrial fibrillation and diabetes. 2,3 When used in population-based studies, remote assessment can increase understanding of heterogeneity in disease manifestation, the
Bas Bloem currently serves as Editor-in-Chief for the Journal of Parkinson's disease, serves on the editorial board of Practical Neurology and Digital Biomarkers, has received honoraria from serving on the scientific advisory board for Zambon, Biogen, UCB and Walk with Path, has
Collection of high-dimensional, longitudinal digital health data has the potential to support a wide variety of research and clinical applications, including diagnostics and longitudinal health tracking. Algorithms that process these data and inform digital diagnostics are typically developed using training and test sets generated from multiple repeated measures collected across a set of individuals. However, the inclusion of repeated measurements is not always appropriately taken into account in the analytical evaluations of predictive performance. The assignment of repeated measurements from each individual to both the training and the test sets (“record-wise” data split) is a common practice and can lead to massive underestimation of the prediction error due to the presence of “identity confounding.” In essence, these models learn to identify subjects, in addition to diagnostic signal. Here, we present a method that can be used to effectively calculate the amount of identity confounding learned by classifiers developed using a record-wise data split. By applying this method to several real datasets, we demonstrate that identity confounding is a serious issue in digital health studies and that record-wise data splits for machine-learning-based applications need to be avoided.
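The difference between a record-wise and a subject-wise split can be illustrated with a minimal Python sketch. This is not the method the abstract proposes for quantifying identity confounding; it only shows how the two splitting strategies differ. The function names, the `(subject_id, features, label)` record layout, and the toy data are all assumptions made for illustration.

```python
import random

def record_wise_split(records, test_fraction=0.3, seed=0):
    """Shuffle individual records, so one subject can land in both sets."""
    rng = random.Random(seed)
    shuffled = list(records)
    rng.shuffle(shuffled)
    n_test = int(round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]  # (train, test)

def subject_wise_split(records, test_fraction=0.3, seed=0):
    """Assign whole subjects to either train or test, never both."""
    rng = random.Random(seed)
    subjects = sorted({r[0] for r in records})
    rng.shuffle(subjects)
    n_test = max(1, int(round(len(subjects) * test_fraction)))
    test_subjects = set(subjects[:n_test])
    train = [r for r in records if r[0] not in test_subjects]
    test = [r for r in records if r[0] in test_subjects]
    return train, test

def shared_subjects(train, test):
    """Subjects whose records appear on both sides of the split."""
    return {r[0] for r in train} & {r[0] for r in test}

# Hypothetical toy data: 10 subjects with 20 repeated measurements each.
records = [(f"s{i}", [float(i), float(j)], i % 2)
           for i in range(10) for j in range(20)]

rw_train, rw_test = record_wise_split(records)
sw_train, sw_test = subject_wise_split(records)

print("record-wise shared subjects:", len(shared_subjects(rw_train, rw_test)))
print("subject-wise shared subjects:", len(shared_subjects(sw_train, sw_test)))
```

In the record-wise case, a classifier can score well on the test set partly by recognizing individuals it has already seen, which is the identity confounding described above. The subject-wise split removes that shortcut by construction.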
Background: Late-onset Alzheimer's disease (LOAD) is the most common form of dementia worldwide. To date, animal models of Alzheimer's have focused on rare familial mutations, due to a lack of frank neuropathology from models based on common disease genes. Recent multi-cohort studies of postmortem human brain transcriptomes have identified a set of 30 gene co-expression modules associated with LOAD, providing a molecular catalog of relevant endophenotypes. Results: This resource enables precise gene-based alignment between new animal models and human molecular signatures of disease. Here, we describe a new resource to efficiently screen mouse models for LOAD relevance. A new NanoString nCounter® Mouse AD panel was designed to correlate key human disease processes and pathways with mRNA from mouse brains. Analysis of three mouse models based on LOAD genetics, carrying APOE4 and TREM2*R47H alleles, demonstrated overlaps with distinct human AD modules that, in turn, are functionally enriched in key disease-associated pathways. Comprehensive comparison with full transcriptome data from same-sample RNA-Seq shows strong correlation between gene expression changes independent of experimental platform. Conclusions: Taken together, we show that the nCounter Mouse AD panel offers a rapid, cost-effective and highly reproducible approach to assess disease relevance of potential LOAD mouse models.
BACKGROUND
Late-onset Alzheimer's disease (LOAD) is the most common cause of dementia worldwide (1). LOAD presents as a heterogenous disease with highly variable outcomes. Recent efforts have been made to molecularly characterize LOAD using large cohorts of post-mortem human brain transcriptomic data (2). Systems-level analysis of these large human data sets has revealed key drivers and molecular pathways that reflect specific changes resulting from disease (2,3). These studies have been primarily driven by gene co-expression analyses that reduce transcriptomes to modules representing specific disease processes or cell types across heterogenous tissue samples (2,4,5). Similar approaches have been used to characterize mouse models of neurodegenerative disease (6). Detailed cross-species analysis reveals a translational gap between animal models and human disease, as no existing models fully recapitulate pathologies associated with LOAD (7,8). New platforms to rapidly assess the translational relevance of new animal models of LOAD will allow efficient identification of the most promising preclinical models.
In this study, we describe a novel gene expression panel to assess LOAD-relevance of mouse models based on expression of key genes in the brain. We used a recent human molecular disease catalog based on harmonized co-expression data from three independent post mortem brain cohorts (ROSMAP, Mayo, Mount Sinai Brain bank) (9-11) and seven brain regions that define 30 human co-expression modules and five consensus...
Consumer wearables and sensors are a rich source of data about patients’ daily disease and symptom burden, particularly in the case of movement disorders like Parkinson’s disease (PD). However, interpreting these complex data into so-called digital biomarkers requires complicated analytical approaches, and validating these biomarkers requires sufficient data and unbiased evaluation methods. Here we describe the use of crowdsourcing to specifically evaluate and benchmark features derived from accelerometer and gyroscope data in two different datasets to predict the presence of PD and severity of three PD symptoms: tremor, dyskinesia, and bradykinesia. Forty teams from around the world submitted features, and achieved drastically improved predictive performance for PD status (best AUROC = 0.87), as well as tremor- (best AUPR = 0.75), dyskinesia- (best AUPR = 0.48) and bradykinesia-severity (best AUPR = 0.95).
We propose hypothesis tests for detecting dopaminergic medication response in Parkinson disease patients, using longitudinal sensor data collected by smartphones. The processed data is composed of multiple features extracted from active tapping tasks performed by the participant on a daily basis, before and after medication, over several months. Each extracted feature corresponds to a time series of measurements annotated according to whether the measurement was taken before or after the patient has taken his/her medication. Even though the data is longitudinal in nature, we show that simple hypothesis tests for detecting medication response, which ignore the serial correlation structure of the data, are still statistically valid, showing type I error rates at the nominal level. We propose two distinct personalized testing approaches. In the first, we combine multiple feature-specific tests into a single union-intersection test. In the second, we construct personalized classifiers of the before/after medication labels using all the extracted features of a given participant, and test the null hypothesis that the area under the receiver operating characteristic curve of the classifier is equal to 1/2. We compare the statistical power of the personalized classifier tests and personalized union-intersection tests in a simulation study, and illustrate the performance of the proposed tests using data from the mPower Parkinson's disease study, recently launched as part of Apple's ResearchKit mobile platform. Our results suggest that the personalized tests, which ignore the longitudinal aspect of the data, can perform well in real data analyses, suggesting they might be used as a sound baseline approach against which more sophisticated methods can be compared.
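To make the second approach concrete, here is a minimal Python sketch of testing the null hypothesis that a classifier's AUC equals 1/2. It uses a one-sided label-permutation test, which is an assumption chosen for illustration and not necessarily the test statistic or null distribution the authors use. The function names and toy scores are hypothetical. Like the simple tests described above, the permutation step treats measurements as exchangeable and so ignores serial correlation.

```python
import random

def auc(scores, labels):
    """AUC as the probability that a random 'after' (label 1) score
    exceeds a random 'before' (label 0) score; ties count as 1/2."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))

def auc_permutation_pvalue(scores, labels, n_perm=1000, seed=0):
    """One-sided permutation p-value for H0: AUC = 1/2 vs H1: AUC > 1/2."""
    rng = random.Random(seed)
    observed = auc(scores, labels)
    shuffled = list(labels)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # break any link between scores and labels
        if auc(scores, shuffled) >= observed:
            hits += 1
    # Add-one correction keeps the p-value strictly positive.
    return observed, (hits + 1) / (n_perm + 1)

# Hypothetical classifier scores for one participant:
# 10 'before medication' (0) and 10 'after medication' (1) measurements.
scores = [0.05 * k for k in range(1, 21)]
labels = [0] * 10 + [1] * 10

observed_auc, p_value = auc_permutation_pvalue(scores, labels)
print("AUC:", observed_auc, "p-value:", p_value)
```

A small p-value for a given participant would be read as evidence that the before/after labels are predictable from that participant's features, that is, as evidence of a medication response.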
Motivation: Late onset Alzheimer’s disease is currently a disease with no known effective treatment options. To better understand disease, new multi-omic datasets have recently been generated with the goal of identifying molecular causes of disease. However, most analytic studies using these datasets focus on uni-modal analysis of the data. Here, we propose a data-driven approach that integrates multiple data types and analytic outcomes to aggregate evidence supporting the hypothesis that a gene is a genetic driver of the disease. The main algorithmic contributions of our article are: (i) a general machine learning framework to learn the key characteristics of a few known driver genes from multiple feature sets and identify other potential driver genes which have similar feature representations, and (ii) a flexible ranking scheme with the ability to integrate external validation in the form of Genome Wide Association Study summary statistics. While we currently focus on demonstrating the effectiveness of the approach using different analytic outcomes from RNA-Seq studies, this method is easily generalizable to other data modalities and analysis types. Results: We demonstrate the utility of our machine learning algorithm on two benchmark multiview datasets by significantly outperforming the baseline approaches in predicting missing labels. We then use the algorithm to predict and rank potential drivers of Alzheimer’s. We show that our ranked genes show a significant enrichment for single nucleotide polymorphisms associated with Alzheimer’s and are enriched in pathways that have been previously associated with the disease. Availability and implementation: Source code and links to all feature sets are available at https://github.com/Sage-Bionetworks/EvidenceAggregatedDriverRanking.