The data explosion driven by advances in genomic research, such as high-throughput sequencing techniques, constantly challenges the conventional methods used in genomics. In parallel with the urgent demand for robust algorithms, deep learning has succeeded in a variety of fields such as vision, speech, and text processing. Yet genomics poses unique challenges for deep learning, since we expect from it a superhuman intelligence that explores beyond our knowledge to interpret the genome. A powerful deep learning model should rely on insightful use of task-specific knowledge. In this paper, we briefly discuss the strengths of different deep learning models from a genomic perspective, so as to fit each particular task with a proper deep architecture, and remark on practical considerations in developing modern deep learning architectures for genomics. We also provide a concise review of deep learning applications in various aspects of genomic research and point out current challenges and potential research directions for future genomics applications.
Background: Genome-wide association studies (GWAS) have contributed to unraveling associations between genetic variants in the human genome and complex traits for more than a decade. Although many follow-up methods have been proposed to detect interactions between SNPs, epistasis has yet to be modeled and discovered more thoroughly. Results: In this paper, following a previous study on detecting marginal epistasis signals, and motivated by the universal approximation power of deep learning, we propose a neural network method that can potentially model arbitrary interactions between SNPs in genetic association studies, as an extension of the mixed models used to correct for confounding factors. Our method, the Deep Mixed Model, consists of two components: 1) a confounding factor correction component, which is a large-kernel convolutional neural network that focuses on calibrating the residual phenotypes by removing factors such as population stratification, and 2) a fixed-effect estimation component, which mainly consists of a long short-term memory (LSTM) model that estimates the association effect size of SNPs with the residual phenotype. Conclusions: After validating the performance of our method using simulation experiments, we further apply it to Alzheimer’s disease data sets. Our results offer some exploratory insight into the genetic architecture of Alzheimer’s disease.
The field of coreference resolution has witnessed significant advancements since the introduction of deep learning-based models. In this paper, we replicate the state-of-the-art coreference resolution model and perform a thorough error analysis. We identify a potential limitation of the current approach in its treatment of grammatical constructions within sentences. Furthermore, the model struggles to leverage contextual information across sentences, resulting in suboptimal accuracy when resolving mentions that span multiple sentences. Motivated by these observations, we propose an approach that integrates linguistic information throughout the entire architecture. Our contributions include multitask learning with part-of-speech (POS) tagging, supervision of intermediate scores, and self-attention mechanisms that operate across sentences. By incorporating these linguistically inspired modules, we achieve a modest improvement in the F1 score on the CoNLL 2012 dataset, and we also perform a qualitative analysis to ascertain whether our model surpasses the baseline in ways the score does not show. Our findings demonstrate that our model successfully learns linguistic signals that are absent in the original baseline. We posit that these enhancements may have gone undetected due to annotation errors, but they nonetheless lead to a more accurate understanding of coreference resolution.