Endeavour (http://www.esat.kuleuven.be/endeavourweb; this web site is free and open to all users and there is no login requirement) is a web resource for the prioritization of candidate genes. Using a training set of genes known to be involved in a biological process of interest, our approach consists of (i) inferring several models (based on various genomic data sources), (ii) applying each model to the candidate genes to rank those candidates against the profile of the known genes and (iii) merging the several rankings into a global ranking of the candidate genes. In the present article, we describe the latest developments of Endeavour. First, we provide a web-based user interface, besides our Java client, to make Endeavour more universally accessible. Second, we support multiple species: in addition to Homo sapiens, we now provide gene prioritization for three major model organisms: Mus musculus, Rattus norvegicus and Caenorhabditis elegans. Third, Endeavour makes use of additional data sources and is now including numerous databases: ontologies and annotations, protein–protein interactions, cis-regulatory information, gene expression data sets, sequence information and text-mining data. We tested the novel version of Endeavour on 32 recent disease gene associations from the literature. Additionally, we describe a number of recent independent studies that made use of Endeavour to prioritize candidate genes for obesity and Type II diabetes, cleft lip and cleft palate, and pulmonary fibrosis.
BackgroundThis paper introduces the notion of optimizing different norms in the dual problem of support vector machines with multiple kernels. The selection of norms yields different extensions of multiple kernel learning (MKL) such as L∞, L1, and L2 MKL. In particular, L2 MKL is a novel method that leads to non-sparse optimal kernel coefficients, which is different from the sparse kernel coefficients optimized by the existing L∞ MKL method. In real biomedical applications, L2 MKL may have more advantages over sparse integration method for thoroughly combining complementary information in heterogeneous data sources.ResultsWe provide a theoretical analysis of the relationship between the L2 optimization of kernels in the dual problem with the L2 coefficient regularization in the primal problem. Understanding the dual L2 problem grants a unified view on MKL and enables us to extend the L2 method to a wide range of machine learning problems. We implement L2 MKL for ranking and classification problems and compare its performance with the sparse L∞ and the averaging L1 MKL methods. The experiments are carried out on six real biomedical data sets and two large scale UCI data sets. L2 MKL yields better performance on most of the benchmark data sets. In particular, we propose a novel L2 MKL least squares support vector machine (LSSVM) algorithm, which is shown to be an efficient and promising classifier for large scale data sets processing.ConclusionsThis paper extends the statistical framework of genomic data fusion based on MKL. Allowing non-sparse weights on the data sources is an attractive option in settings where we believe most data sources to be relevant to the problem at hand and want to avoid a "winner-takes-all" effect seen in L∞ MKL, which can be detrimental to the performance in prospective studies. The notion of optimizing L2 kernels can be straightforwardly extended to ranking, classification, regression, and clustering algorithms. To tackle the computational burden of MKL, this paper proposes several novel LSSVM based MKL algorithms. Systematic comparison on real data sets shows that LSSVM MKL has comparable performance as the conventional SVM MKL algorithms. Moreover, large scale numerical experiments indicate that when cast as semi-infinite programming, LSSVM MKL can be solved more efficiently than SVM MKL.AvailabilityThe MATLAB code of algorithms implemented in this paper is downloadable from http://homes.esat.kuleuven.be/~sistawww/bioi/syu/l2lssvm.html.
The concept of deep learning and classical architectures of deep neural networks are introduced. A review of state-of-the-art deep learning methods in geophysical applications is provided. The future directions for developing new deep learning methods in geophysics are discussed.
We propose a new hybrid clustering framework to incorporate text mining with bibliometrics in journal set analysis. The framework integrates two different approaches: clustering ensemble and kernel-fusion clustering. To improve the flexibility and the efficiency of processing large-scale data, we propose an information-based weighting scheme to leverage the effect of multiple data sources in hybrid clustering. Three different algorithms are extended by the proposed weighting scheme and they are employed on a large journal set retrieved from the Web of Science (WoS) database. The clustering performance of the proposed algorithms is systematically evaluated using multiple evaluation methods, and they were cross-compared with alternative methods. Experimental results demonstrate that the proposed weighted hybrid clustering strategy is superior to other methods in clustering performance and efficiency. The proposed approach also provides a more refined structural mapping of journal sets, which is useful for monitoring and detecting new trends in different scientific fields.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.
customersupport@researchsolutions.com
10624 S. Eastern Ave., Ste. A-614
Henderson, NV 89052, USA
This site is protected by reCAPTCHA and the Google Privacy Policy and Terms of Service apply.
Copyright © 2024 scite LLC. All rights reserved.
Made with 💙 for researchers
Part of the Research Solutions Family.