Rami N. Mahdi scite author profile

Rouchka

2009

PLoS ONE

Accurate identification of promoter regions and transcription start sites (TSS) in genomic DNA allows for a more complete understanding of the structure of genes and gene regulation within a given genome. Many recently published methods have achieved high TSS identification accuracy. However, models that represent promoters and TSS more accurately are still needed. A novel method for identifying transcription start sites that improves on the accuracy of recently published methods is proposed. This method incorporates a metric feature based on oligonucleotide positional frequencies, taking into account the nature of promoters. A radial basis function neural network for identifying transcription start sites (RBF-TSS) is proposed and employed as a classification algorithm. Using non-overlapping chunks (windows) of sizes 50 and 500 on the human genome, the proposed method achieves an area under the Receiver Operating Characteristic curve (auROC) of 94.75% and 95.08%, respectively, providing increased performance over existing TSS prediction methods.
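To make the idea of oligonucleotide positional frequencies concrete, the following is a minimal sketch that tabulates how often each k-mer occurs at each position across a set of aligned, equal-length sequences. The function name `positional_kmer_frequencies`, the toy sequences, and the choice of k = 2 are assumptions made for illustration. The abstract does not specify how the metric feature is derived from these frequencies, so this shows only the underlying frequency table, not the published method.

```python
from collections import defaultdict


def positional_kmer_frequencies(sequences, k=2):
    """Return {position: {kmer: relative frequency}} for equal-length sequences.

    Hypothetical helper for illustration; not taken from the paper.
    """
    if not sequences:
        return {}
    length = len(sequences[0])
    if any(len(seq) != length for seq in sequences):
        raise ValueError("all sequences must have the same length")

    # counts[pos][kmer] = number of sequences with that k-mer starting at pos
    counts = defaultdict(lambda: defaultdict(int))
    for seq in sequences:
        for pos in range(length - k + 1):
            counts[pos][seq[pos:pos + k]] += 1

    n = len(sequences)
    return {
        pos: {kmer: c / n for kmer, c in kmers.items()}
        for pos, kmers in counts.items()
    }


# Toy aligned windows (made-up data, not from the human genome).
windows = ["ACGT", "ACGA", "TCGT", "ACTT"]
freqs = positional_kmer_frequencies(windows, k=2)
```

Here `freqs[0]` describes the dinucleotides observed at the first position of the toy windows. A classifier such as an RBF network would consume features derived from a table like this, but that step is outside what the abstract describes.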

Reduced HyperBF Networks: Regularization by Explicit Complexity Reduction and Scaled Rprop-Based Training

IEEE Trans. Neural Netw.

Rouchka

2011

Hyper basis function (HyperBF) networks are generalized radial basis function neural networks (where the activation function is a radial function of a weighted distance). Such generalization provides HyperBF networks with high capacity to learn complex functions, which in turn makes them susceptible to overfitting and poor generalization. Moreover, training a HyperBF network requires the weights, centers, and local scaling factors to be optimized simultaneously. In the case of a relatively large dataset with a large network structure, such optimization becomes computationally challenging. In this paper, a new regularization method that performs soft local dimension reduction in addition to weight decay is proposed. The regularized HyperBF network is shown to provide classification accuracy competitive with a support vector machine while requiring a significantly smaller network structure. Furthermore, a practical training procedure for constructing HyperBF networks is presented. Hierarchical clustering is used to initialize neurons, followed by gradient optimization using a scaled version of the Rprop algorithm with a localized partial backtracking step. Experimental results on seven datasets show that the proposed training provides faster and smoother convergence than the regular Rprop algorithm.
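As an illustration of "a radial function of a weighted distance", the sketch below evaluates a network whose units apply a Gaussian to a per-dimension scaled distance from a center. The Gaussian form, the function names `hyperbf_unit` and `hyperbf_output`, and the example values are assumptions; the abstract does not give the exact activation, the regularizer, or the scaled Rprop update, and none of those are reproduced here.

```python
import math


def hyperbf_unit(x, center, scales):
    """Gaussian of a weighted squared distance: exp(-sum((s_j * (x_j - c_j))**2)).

    Each unit has its own center and its own per-dimension scaling factors.
    The Gaussian form is an assumed choice for this sketch.
    """
    d2 = sum((s * (xi - ci)) ** 2 for xi, ci, s in zip(x, center, scales))
    return math.exp(-d2)


def hyperbf_output(x, units, bias=0.0):
    """Weighted sum of unit activations; units is a list of (weight, center, scales)."""
    return bias + sum(w * hyperbf_unit(x, c, s) for w, c, s in units)


# Two hypothetical units in a 2-D input space.
units = [
    (1.5, (0.0, 0.0), (1.0, 1.0)),   # scales both dimensions equally
    (-0.5, (1.0, 2.0), (1.0, 0.0)),  # zero scale: second dimension is ignored
]
y = hyperbf_output((1.0, 5.0), units)
```

The second unit shows why local scaling factors matter: with a scale of zero, the unit's response no longer depends on that input dimension. Driving individual scales toward zero is one plausible way to think about per-unit dimension reduction, though whether the paper's regularizer works this way is not stated in the abstract.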

Empirical Bayes conditional independence graphs for regulatory network recovery

Madduri

Wang

et al. 2012

Semi-Supervised Clustering and Feature Discrimination with Instance-Level Constraints

Frigui

2007

Model Based Unsupervised Learning Guided by Abundant Background Samples

Rouchka

2008

Many data sets contain an abundance of background data or samples belonging to classes not currently under consideration. We present a new unsupervised learning method based on Fuzzy C-Means to learn submodels of a class, using background samples to guide cluster split and merge operations. The proposed method demonstrates how background samples can be used to guide and improve the clustering process. It yields more accurate clusters and helps to escape local minima. In addition, the number of clusters is determined for the class under consideration. The method demonstrates remarkable performance on both synthetic 2D data and real-world data from the MNIST dataset of handwritten digits.
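For readers unfamiliar with Fuzzy C-Means, the following is a minimal sketch of one iteration of the standard algorithm: a membership update followed by a center update. The function names, the fuzzifier value m = 2, and the toy points are assumptions for illustration. The background-guided split and merge operations described in the abstract are not shown, since the abstract does not specify them.

```python
import math


def fcm_memberships(points, centers, m=2.0):
    """Standard FCM membership update: u_i = 1 / sum_j (d_i / d_j)**(2 / (m - 1))."""
    exponent = 2.0 / (m - 1.0)
    memberships = []
    for p in points:
        dists = [math.dist(p, c) for c in centers]
        if 0.0 in dists:
            # Point coincides with one or more centers: share membership among them.
            hits = [d == 0.0 for d in dists]
            memberships.append([1.0 / sum(hits) if h else 0.0 for h in hits])
            continue
        memberships.append(
            [1.0 / sum((di / dj) ** exponent for dj in dists) for di in dists]
        )
    return memberships


def fcm_centers(points, memberships, m=2.0):
    """Standard FCM center update: mean of the points weighted by u**m."""
    dim = len(points[0])
    centers = []
    for i in range(len(memberships[0])):
        weights = [row[i] ** m for row in memberships]
        total = sum(weights)
        centers.append(
            tuple(sum(w * p[d] for w, p in zip(weights, points)) / total
                  for d in range(dim))
        )
    return centers


# Toy 2-D points forming two well-separated groups (made-up data).
points = [(0.0, 0.0), (0.0, 1.0), (10.0, 10.0), (10.0, 11.0)]
centers = [(0.0, 0.5), (10.0, 10.5)]
u = fcm_memberships(points, centers)
new_centers = fcm_centers(points, u)
```

Repeating the two updates until the centers stop moving gives plain Fuzzy C-Means, which can settle in a poor local solution when the number of clusters or the initialization is wrong. That is the weakness the split and merge operations in the abstract are meant to address.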