In high-dimensional quantitative structure-activity relationship (QSAR) studies, identifying the relevant molecular descriptors is a major goal. This study proposes a penalized method for molecular descriptor selection. The method, called the adjusted adaptive least absolute shrinkage and selection operator (AALASSO), is applied to the high-dimensional QSAR prediction of the anticancer potency of a series of imidazo[4,5-b]pyridine derivatives. It performs consistent selection and handles grouping effects simultaneously. Compared with other commonly used penalized methods, such as LASSO and adaptive LASSO with different initial weights, AALASSO achieves the best predictive ability, both through consistent selection and by encouraging grouping effects, which lets it select more of the correlated molecular descriptors. We conclude that AALASSO is a reliable penalized method for high-dimensional QSAR studies.
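For orientation, the adaptive LASSO baseline mentioned above can be sketched as a two-stage fit: an initial estimate supplies per-descriptor weights, and a weighted L1 penalty is then applied. This is a minimal illustration of the standard adaptive LASSO only, not the AALASSO adjustment, whose weight construction the abstract does not spell out. The function names, the ridge initial estimate, and the defaults `gamma=1.0` and `ridge_init=0.1` are assumptions made for the example; `X` and `y` are assumed to be centered, with no all-zero column.

```python
def soft_threshold(value, threshold):
    """Soft-thresholding operator used by the LASSO coordinate update."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def weighted_lasso(X, y, lam, weights, ridge=0.0, n_iter=200):
    """Coordinate descent for
    (1/2n)||y - Xb||^2 + lam * sum_j weights[j]*|b_j| + (ridge/2)*||b||^2.
    X is a list of rows; assumes centered data and no all-zero column.
    """
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    residual = list(y)  # equals y - X @ beta while beta is all zeros
    col_sq = [sum(X[i][j] ** 2 for i in range(n)) / n for j in range(p)]
    for _ in range(n_iter):
        for j in range(p):
            # Covariance of column j with the residual that excludes coordinate j
            rho = sum(X[i][j] * (residual[i] + X[i][j] * beta[j])
                      for i in range(n)) / n
            new_bj = soft_threshold(rho, lam * weights[j]) / (col_sq[j] + ridge)
            delta = new_bj - beta[j]
            if delta != 0.0:
                for i in range(n):
                    residual[i] -= X[i][j] * delta
                beta[j] = new_bj
    return beta


def adaptive_lasso(X, y, lam, gamma=1.0, ridge_init=0.1, eps=1e-8):
    """Two-stage adaptive LASSO (illustrative; not the AALASSO method)."""
    p = len(X[0])
    # Stage 1: ridge fit (no L1 penalty) as the initial estimate (assumed choice)
    init = weighted_lasso(X, y, lam=0.0, weights=[0.0] * p, ridge=ridge_init)
    # Stage 2: small initial coefficients receive large penalties
    weights = [1.0 / (abs(b) ** gamma + eps) for b in init]
    return weighted_lasso(X, y, lam, weights)
```

Calling `adaptive_lasso(X, y, lam=0.05)` on a descriptor matrix returns one coefficient per descriptor, with exact zeros marking descriptors that were dropped. Swapping the ridge initial estimate for another one changes only the weights, which is the "different initial weights" comparison the abstract refers to.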
Feature selection is a well-known preprocessing procedure and a challenging problem in many domains, such as data mining, text mining, medicine, biology, public health, image processing, and data clustering. This paper proposes a novel feature selection method, called AOAGA, based on an improved metaheuristic optimization method that combines the conventional Arithmetic Optimization Algorithm (AOA) with Genetic Algorithm (GA) operators. The AOA is a recently proposed optimizer that has been applied to several benchmark and engineering problems with promising performance. The modification aims to enhance its search strategies: the conventional version has weaknesses in its local search strategy and in the trade-off between its search strategies, and the GA operators are introduced to overcome these shortcomings. The proposed AOAGA was evaluated on several well-known benchmark datasets using standard evaluation criteria, namely accuracy, number of selected features, and fitness function value. The results were then compared with state-of-the-art techniques to assess the performance of the proposed method. To assess it further, two real-world problems involving gene datasets were used. The findings show that the proposed AOAGA method finds new best solutions for several test cases and obtains promising results compared with other methods published in the literature.
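To make the ingredients concrete, the sketch below shows a fitness function of the kind commonly used in wrapper feature selection, together with two textbook GA operators acting on a binary feature mask. It is a hedged illustration rather than the AOAGA implementation: the abstract does not give the fitness formula or say which GA operators are used, so the weighted-sum form, the value `alpha=0.99`, the worst-case score for an empty mask, and all function names are assumptions. The classifier error is passed in as a number, since the abstract names no classifier.

```python
import random


def fitness(error_rate, mask, alpha=0.99):
    """Weighted sum of classification error and fraction of selected
    features; lower is better. The form and alpha are assumed here."""
    selected = sum(mask)
    if selected == 0:
        return 1.0  # assumption: an empty subset gets the worst score
    return alpha * error_rate + (1 - alpha) * selected / len(mask)


def one_point_crossover(parent_a, parent_b, rng):
    """Swap the tails of two equal-length masks (length >= 2) at a random cut."""
    cut = rng.randrange(1, len(parent_a))
    child_a = parent_a[:cut] + parent_b[cut:]
    child_b = parent_b[:cut] + parent_a[cut:]
    return child_a, child_b


def bit_flip_mutation(mask, rate, rng):
    """Flip each bit independently with probability `rate`."""
    return [1 - bit if rng.random() < rate else bit for bit in mask]
```

A mask such as `[1, 0, 1, 0]` selects the first and third features. In a hybrid scheme, operators like these would be applied to candidate masks produced by the base optimizer to add diversity, with `fitness` ranking the results.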