Data mining methods for gene selection on the basis of gene expression arrays

Muszyński, Michał; Osowski, S.

doi:10.2478/amcs-2014-0048

Cited by 11 publications

(12 citation statements)

References 24 publications


“…The second applied filter method was feature correlation with class (COR), a univariate filter feature selection method that can be used as a pre-selection step in microarray gene selection [54, 55]. The value of feature discrimination, S(f), is expressed by […] where c is the mean value for the gene among both classes, c_k is the mean value for the kth class gene, σ²(f) is the gene variance, and P_k is the probability of appearance of the kth class in the dataset.…”

Section: Methods (mentioning)

confidence: 99%

Selection and classification of gene expression in autism disorder: Use of a combination of statistical filters and a GBPSO-SVM algorithm

2017


In this work, gene expression in autism spectrum disorder (ASD) is analyzed with the goal of selecting the most attributed genes and performing classification. The objective was achieved by utilizing a combination of various statistical filters and a wrapper-based geometric binary particle swarm optimization-support vector machine (GBPSO-SVM) algorithm. The utilization of different filters was accentuated by incorporating a mean and median ratio criterion to remove very similar genes. The results showed that the most discriminative genes that were identified in the first and last selection steps included the presence of a repetitive gene (CAPS2), which was assigned as the gene most highly related to ASD risk. The merged gene subset that was selected by the GBPSO-SVM algorithm was able to enhance the classification accuracy.
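
The Methods excerpt above names the quantities that enter the COR score (the overall mean c, the class means c_k, the gene variance σ²(f), and the class probabilities P_k), but the formula itself did not survive in the excerpt. The following is a minimal sketch of one form such a score might take, assuming S(f) = Σ_k P_k (c_k − c)² / (σ²(f) · Σ_k P_k (1 − P_k)). Both that form and the function name `class_correlation_score` are assumptions for illustration, not the cited authors' definition.

```python
from collections import defaultdict


def class_correlation_score(values, labels):
    """Score one gene by how far its class means spread around the overall mean.

    ASSUMED form (not taken from the excerpt):
        S(f) = sum_k P_k * (c_k - c)^2 / (var(f) * sum_k P_k * (1 - P_k))
    """
    n = len(values)
    if n == 0 or n != len(labels):
        raise ValueError("values and labels must be non-empty and equally long")

    overall_mean = sum(values) / n                               # c
    variance = sum((v - overall_mean) ** 2 for v in values) / n  # sigma^2(f)
    if variance == 0.0:
        return 0.0  # a constant gene cannot discriminate between classes

    # Group the expression values of this gene by class label.
    by_class = defaultdict(list)
    for value, label in zip(values, labels):
        by_class[label].append(value)

    numerator = 0.0
    normaliser = 0.0
    for members in by_class.values():
        p_k = len(members) / n             # P_k, estimated as class frequency
        c_k = sum(members) / len(members)  # c_k, mean of the gene in class k
        numerator += p_k * (c_k - overall_mean) ** 2
        normaliser += p_k * (1.0 - p_k)

    if normaliser == 0.0:
        return 0.0  # only one class present
    return numerator / (variance * normaliser)


# A gene that separates the two classes scores higher than one that does not.
separating = class_correlation_score([1.0, 1.0, 3.0, 3.0], [0, 0, 1, 1])
uninformative = class_correlation_score([1.0, 3.0, 1.0, 3.0], [0, 0, 1, 1])
```

Because the score is computed one gene at a time, it can serve as a cheap pre-selection pass before a wrapper method; it does not account for redundancy between genes.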


“…First, the procedure Generate creates a new concept and adds the new concept to concept lattice (lines 1-2). According to Proposition 9, we test every candidate in c.Children to find real children of newConcept (lines 4-22). Note that the concept c.children.indicator points to has already been obtained after executing the Preprocess procedure.…”

Section: Generation and Removal Of Concepts (mentioning)

confidence: 99%

“…Algorithm 1 reveals that c can be marked directly if c is a merged concept. When c is a deleted or modified concept, it requires comparisons between c and c.Parents for finding a parent with c.Intent = parent.Intent (lines 5-13). This operation requires only one comparison in the best case or |G| comparisons at worst case between sets (intents) which takes at most O(|G||M|²) time.…”

Section: Complexity Issues (mentioning)

confidence: 99%

An Efficient Algorithm for Decreasing the Granularity Levels of Attributes in Formal Concept Analysis

Wan

Zou

2019

IEEE Access


In the formal concept analysis (FCA), a concept lattice represents the basic structure derived from Boolean data describing the relationships between objects and attributes. One of the basic problems of FCA is to control the structure of concept lattices to extract useful information. To explore a data set, sometimes we need to tune the structure of the corresponding concept lattice by merging a couple of finer attributes to a coarser attribute. The merged attribute can be interpreted as a coarser granularity level. In this paper, we propose an efficient algorithm called fold for decreasing the granularity levels of attributes. We analyzed and explored the relationships between concepts before and after decreasing the granularity level of an attribute. Based on those theoretical proofs, we propose an efficient method of classifying concepts to reduce the comparisons between the concepts compared with the original zoom-out algorithm. Moreover, we provide a preprocessing procedure to search for canonical generators and help restore the covering relation. We describe the algorithm completely, discuss time complexity issues, and present an experimental evaluation of its performance and comparison with the zoom-out algorithm. The theoretical and empirical analyses demonstrate the advantages of our algorithm when applied to various types of formal contexts. INDEX TERMS Formal concept analysis, concept lattice, granularity levels of attributes, classification of concepts.
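
The Complexity Issues excerpt above describes scanning c.Parents for a parent whose intent equals c.Intent, with one comparison in the best case and one per parent in the worst case. The loop below sketches only that scan. The `Concept` class and the function name are hypothetical, and the cited paper's Algorithm 1 is not reproduced in the excerpt, so this should not be read as that algorithm.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class Concept:
    """Hypothetical lattice node: an extent, an intent, and its upper neighbours."""
    extent: frozenset
    intent: frozenset
    parents: List["Concept"] = field(default_factory=list)


def find_parent_with_equal_intent(c: Concept) -> Tuple[Optional[Concept], int]:
    """Return the first parent whose intent equals c.intent and the comparisons made."""
    comparisons = 0
    for parent in c.parents:
        comparisons += 1                # one set comparison per candidate parent
        if parent.intent == c.intent:
            return parent, comparisons  # best case: the first parent matches
    return None, comparisons            # worst case: every parent was checked


a = Concept(frozenset({1, 2}), frozenset({"x"}))
b = Concept(frozenset({1, 3}), frozenset({"y"}))
c = Concept(frozenset({1}), frozenset({"x"}), parents=[a, b])

best_match, best_count = find_parent_with_equal_intent(c)    # matches on the first try
c.parents = [b, a]
later_match, later_count = find_parent_with_equal_intent(c)  # needs a second comparison
```

The comparison counter is there only to make the best-case and worst-case behaviour visible; a real implementation would return the parent alone.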


“…Gene selection, according to biologists, results in more compact gene sets, which lowers diagnostics costs and makes it easier to comprehend the roles of linked genes [2]. In the high-dimensional space of a small number of observations, comparing gene expression profiles and picking those that are best related with the examined forms of data is a difficult issue in pattern recognition, which can be tackled utilizing specialized data mining approaches [3]. Despite the rapid advancements in this subject, there is always a need for further understanding and research development.…”

Section: Introduction (mentioning)

confidence: 99%

Hybrid gene selection method based on mutual information technique and dragonfly optimization algorithm

Mahmood

Karyakos

Yacoob

2021

EEJET


One of the most prevalent problems with big data is that many of the features are irrelevant. Gene selection has been shown to improve the outcomes of many algorithms, but it is a difficult task in microarray data mining because most microarray datasets have only a few hundred records but thousands of variables. This type of dataset increases the chances of discovering incorrect predictions due to chance. Finding the most relevant genes is generally the most difficult part of creating a reliable classification model. Irrelevant and duplicated attributes have a negative impact on categorization algorithms’ accuracy. Many Machine Learning-based Gene Selection methods have been explored in the literature, with the aim of improving dimensionality reduction precision. Gene selection is a technique for extracting the most relevant data from a series of datasets. The classification method, which can be used in machine learning, pattern recognition, and signal processing, will benefit from further developments in the Gene selection technique. The goal of the feature selection is to select the smallest subset of features but carrying as much information about the class as possible. This paper models the gene selection approach as a binary-based optimization algorithm in discrete space, which directs binary dragonfly optimization algorithm «BDA» and verifies it in a chosen fitness function utilizing precision of the dataset’s k-nearest neighbors’ classifier. The experimental results revealed that the proposed algorithm, dubbed MI-BDA, in terms of precision of results as measured by cost of calculations and classification accuracy, it outperforms other algorithms
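
The abstract above pairs a mutual information filter with a binary dragonfly search. As a rough illustration of the filter half only, the sketch below ranks genes by the mutual information between a discretized expression column and the class labels. It assumes expression values have already been binned into discrete levels; the function names are hypothetical, and neither the dragonfly optimization step nor the k-nearest-neighbours fitness function is implemented.

```python
import math
from collections import Counter


def mutual_information(feature, labels):
    """Mutual information in bits between two equally long discrete sequences."""
    n = len(feature)
    if n == 0 or n != len(labels):
        raise ValueError("feature and labels must be non-empty and equally long")

    joint = Counter(zip(feature, labels))  # counts of (level, class) pairs
    level_counts = Counter(feature)
    class_counts = Counter(labels)

    mi = 0.0
    for (level, cls), count in joint.items():
        p_xy = count / n
        p_x = level_counts[level] / n
        p_y = class_counts[cls] / n
        mi += p_xy * math.log2(p_xy / (p_x * p_y))
    return mi


def rank_genes(columns, labels, top_k):
    """Return the indices of the top_k gene columns with the highest mutual information."""
    scores = [mutual_information(column, labels) for column in columns]
    order = sorted(range(len(columns)), key=lambda i: scores[i], reverse=True)
    return order[:top_k]


labels = [0, 0, 1, 1]
genes = [
    [0, 1, 0, 1],  # independent of the class labels
    [0, 0, 1, 1],  # mirrors the class labels exactly
]
selected = rank_genes(genes, labels, top_k=1)
```

A filter like this scores each gene independently, so a subsequent search stage is still needed to handle redundant genes.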
