2021
DOI: 10.1016/j.ins.2021.01.020
TAGA: Tabu Asexual Genetic Algorithm embedded in a filter/filter feature selection approach for high-dimensional data

Abstract: Feature selection is the process of selecting an optimal subset of features required for maintaining or improving the performance of data mining models. Recently, hybrid filter/wrapper feature selection methods have shown promising results for high-dimensional data. However, filter/wrapper methods lack generalisation power, which would enable the selected features to be trainable over different classifiers without having to repeat the feature selection process. To address the generalisation power problem, this p…

Cited by 34 publications (12 citation statements)
References 35 publications
“…Aside from identifying features relevant to the classification problem, feature selection reduces the dimensionality of the dataset, simplifying the problem, thereby improving model stability and generalisability. Feature selection was performed using a filter-based tabu asexual genetic algorithm (TAGA) [28] and the knowledge of clinical experts.…”
Section: Results (mentioning, confidence: 99%)
“…A Tabu Asexual Genetic Algorithm (TAGA) [28] was used to generate 9 feature sets (Supplementary Table S3). TAGA takes an m × n matrix, where m is the number of samples and n is the number of features, and calculates a Fisher score.…”
Section: Methods (mentioning, confidence: 99%)
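The Fisher-score step mentioned in the quote can be sketched as follows. This is a minimal illustration of the standard Fisher criterion over an m × n matrix, not the cited authors' implementation; the toy data and the epsilon guard are assumptions for the example.

```python
import numpy as np

def fisher_score(X, y):
    """Per-feature Fisher score for an m x n data matrix X and labels y.

    score_j = sum_c n_c * (mu_{c,j} - mu_j)^2 / sum_c n_c * var_{c,j}
    Larger scores indicate features that separate the classes better.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    mu = X.mean(axis=0)                        # overall mean of each feature
    num = np.zeros(X.shape[1])
    den = np.zeros(X.shape[1])
    for c in np.unique(y):
        Xc = X[y == c]
        n_c = Xc.shape[0]
        num += n_c * (Xc.mean(axis=0) - mu) ** 2   # between-class scatter
        den += n_c * Xc.var(axis=0)                # within-class scatter
    return num / (den + 1e-12)                 # epsilon guards constant features

# Toy data (invented): feature 0 separates the two classes, feature 1 is noise.
X = [[1.0, 5.0], [1.1, 3.0], [5.0, 4.0], [5.1, 6.0]]
y = [0, 0, 1, 1]
scores = fisher_score(X, y)
```

A filter method such as this ranks all n features by score and keeps the top-ranked subset, with no classifier in the loop.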
“…Tables 13 and 14 describe the comparison of experimental results between HFIA and the feature selection method mentioned in [56].…”
Section: Experiments and Discussion (mentioning, confidence: 99%)
“…First, the collected annual-report text data are preprocessed, and then unigrams, bigrams, and trigrams are extracted as text features using a bag-of-words model with term frequency-inverse document frequency (TF-IDF) weighting. Because text features are naturally high-dimensional, they may contain redundant and irrelevant features [12]. Therefore, the information gain method is further used to filter the extracted initial text features, retaining the important features to ensure feature quality.…”
Section: Feature Extraction of Financial Risk Prediction (mentioning, confidence: 99%)
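The information-gain filtering step in the quote above can be sketched as follows. This is a hedged illustration: the documents, labels, and unigram-only vocabulary are invented, and the cited work also extracts bigrams and trigrams with TF-IDF weighting before filtering.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy H(Y) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def information_gain(presence, labels):
    """IG of a binary feature (term present/absent) w.r.t. the class label:
    IG = H(Y) - H(Y | X)."""
    h_y = entropy(labels)
    h_y_given_x = 0.0
    for value in (True, False):
        subset = [y for p, y in zip(presence, labels) if p == value]
        if subset:
            h_y_given_x += len(subset) / len(labels) * entropy(subset)
    return h_y - h_y_given_x

# Invented toy corpus: label 1 = financial risk, label 0 = no risk.
docs = ["loss impairment writeoff", "profit growth dividend",
        "loss default writeoff", "profit dividend growth"]
labels = [1, 0, 1, 0]

# Score every unigram by information gain and keep the top k.
vocab = sorted({t for d in docs for t in d.split()})
ig = {t: information_gain([t in d.split() for d in docs], labels) for t in vocab}
top = sorted(ig, key=ig.get, reverse=True)[:3]
```

Terms that perfectly split the classes (e.g. "loss") receive the maximum gain of H(Y) bits, while terms appearing in only one document score lower, so the filter retains the most class-informative features.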