Fast Implementation of ℓ₁ Regularized Learning Algorithms Using Gradient Descent Methods

Cai, Ya-Chun; Sun, Yijun; Cheng, Yubo; Li, Jian; Goodison, Steve

doi:10.1137/1.9781611972801.75

Cited by 10 publications

(10 citation statements)

References 24 publications

“…In this situation, special care must be taken to avoid overfitting problems. A commonly used practice is to select a small feature subset so that the performance of a learning algorithm is optimized ( 21–23 ). For the purpose of this article, we used regularized logistical regression to perform feature selection and classification simultaneously ( 23 ).…”

Section: Methods (mentioning)

confidence: 99%

Advanced computational algorithms for microbial community analysis using massive 16S rRNA sequence data

Sun, Cai, Mai et al. 2010. Nucleic Acids Research

Self Cite

With the aid of next-generation sequencing technology, researchers can now obtain millions of microbial signature sequences for diverse applications ranging from human epidemiological studies to global ocean surveys. The development of advanced computational strategies to maximally extract pertinent information from massive nucleotide data has become a major focus of the bioinformatics community. Here, we describe a novel analytical strategy including discriminant and topology analyses that enables researchers to deeply investigate the hidden world of microbial communities, far beyond basic microbial diversity estimation. We demonstrate the utility of our approach through a computational study performed on a previously published massive human gut 16S rRNA data set. The application of discriminant and topology analyses enabled us to derive quantitative disease-associated microbial signatures and describe microbial community structure in far more detail than previously achievable. Our approach provides rigorous statistical tools for sequence-based studies aimed at elucidating associations between known or unknown organisms and a variety of physiological or environmental conditions.
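The citation statement for this entry describes using regularized logistic regression to perform feature selection and classification simultaneously. The sketch below illustrates that idea with ℓ₁-regularized logistic regression fitted by proximal gradient descent (gradient step followed by soft-thresholding). This is a generic textbook technique swapped in for illustration, not the algorithm of the cited paper, whose details are not given on this page. The function names, the toy data, and the values of `lam`, `step`, and `iters` are all assumptions.

```python
import math
import random


def sigmoid(z: float) -> float:
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def soft_threshold(x: float, t: float) -> float:
    """Proximal operator of t * |x|: shrink x toward zero by t."""
    if x > t:
        return x - t
    if x < -t:
        return x + t
    return 0.0


def fit_l1_logistic(X, y, lam=0.1, step=0.1, iters=300):
    """Minimize mean logistic loss + lam * ||w||_1 by proximal gradient descent.

    X is a list of feature rows and y a list of 0/1 labels.
    Returns (weights, bias); the bias is not penalized.
    """
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(iters):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            grad_b += err / n
            for j in range(d):
                grad_w[j] += err * xi[j] / n
        # Gradient step on the smooth loss, then soft-threshold for the l1 term.
        w = [soft_threshold(wj - step * gj, step * lam)
             for wj, gj in zip(w, grad_w)]
        b -= step * grad_b
    return w, b


def make_toy_data(n=200, d=10, seed=0):
    """Hypothetical synthetic data: only features 0 and 1 determine the label."""
    rng = random.Random(seed)
    X = [[rng.gauss(0.0, 1.0) for _ in range(d)] for _ in range(n)]
    y = [1 if 2.0 * x[0] - 2.0 * x[1] > 0 else 0 for x in X]
    return X, y


X, y = make_toy_data()
w, b = fit_l1_logistic(X, y)
# Features whose weight survived the l1 penalty are the "selected" ones.
selected = [j for j, wj in enumerate(w) if wj != 0.0]
```

Because soft-thresholding sets small weights exactly to zero, the fitted classifier and the selected feature subset come out of the same optimization. On toy data like this, the two informative features should carry the largest weights, while how many uninformative weights reach exactly zero depends on `lam`.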

“…The two steps were iterated until convergence. We used our recently developed gradient-descent-based algorithm to solve the above optimization problem efficiently [26]. By using the fixed-point theory [27], it can be proved that the algorithm converges to a unique solution regardless of the initial weights if the kernel width is properly selected.…”

Section: Supervised Learning Approach To Identifying Cancer Progressi (mentioning)

confidence: 99%

Cancer progression modeling using static sample data

Sun, Ye, Nowak et al. 2014. Genome Biol

Self Cite

As molecular profiling data continue to accumulate, the design of integrative computational analyses that can provide insights into the dynamic aspects of cancer progression becomes feasible. Here, we present a novel computational method for the construction of cancer progression models based on the analysis of static tumor samples. We demonstrate the reliability of the method with simulated data, and describe the application to breast cancer data. Our findings support a linear, branching model for breast cancer progression. An interactive model facilitates the identification of key molecular events in the advance of disease to malignancy.

“…We used L1 logistic regression to perform the feature selection procedures due to its ability to dispose the high dimensional data [ 40 ]. The model describes were as follows:…”

Section: Methods (mentioning)

confidence: 99%

A comparative study of improvements Pre-filter methods bring on feature selection using microarray data

Wang, Fan, Cai 2014. Health Inf Sci Syst

Self Cite

Background: Feature selection techniques have become an apparent need in biomarker discoveries with the development of microarray. However, the high dimensional nature of microarray made feature selection become time-consuming. To overcome such difficulties, filter data according to the background knowledge before applying feature selection techniques has become a hot topic in microarray analysis. Different methods may affect final results greatly, thus it is important to evaluate these pre-filter methods in a system way.

Methods: In this paper, we compared the performance of statistical-based, biological-based pre-filter methods and the combination of them on microRNA-mRNA parallel expression profiles using L1 logistic regression as feature selection techniques. Four types of data were built for both microRNA and mRNA expression profiles.

Results: Results showed that pre-filter methods could reduce the number of features greatly for both mRNA and microRNA expression datasets. The features selected after pre-filter procedures were shown to be significant in biological levels such as biology process and microRNA functions. Analyses of classification performance based on precision showed the pre-filter methods were necessary when the number of raw features was much bigger than that of samples. All the computing time was greatly shortened after pre-filter procedures.

Conclusions: With similar or better classification improvements, less but biological significant features, pre-filter-based feature selection should be taken into consideration if researchers need fast results when facing complex computing problems in bioinformatics.

Electronic supplementary material: The online version of this article (doi:10.1186/2047-2501-2-7) contains supplementary material, which is available to authorized users.
