In this paper, we successfully apply GEFeS (Genetic & Evolutionary Feature Selection) to identify the key features in the human vaginal microbiome and in patient meta-data that are associated with bacterial vaginosis (BV). The vaginal microbiome is the community of bacteria found in a patient, and meta-data include behavioral practices and demographic information. Bacterial vaginosis is a disease that afflicts nearly one third of all women, but the current diagnostics are crude at best. We describe two types of classifiers for BV diagnosis, and show that each is associated with one of two treatments. Our results show that the classifiers associated with the ‘Treat Any Symptom’ version perform better than the classifier associated with the ‘Treat Based on N-Score Value’ version. Our long-term objective is to develop a more accurate and objective diagnosis and treatment of BV.
Malicious software, also known as malware, is a huge problem that costs consumers billions of dollars each year. To solve this problem, a significant amount of research has been dedicated towards detecting malware. In this paper, we introduce a genetic and evolutionary feature selection technique for the identification of HTML code associated with malware. We believe that there may be an association between malware and the HTML code that it is embedded in. Our results show that this technique outperforms previous techniques in terms of recognition accuracy as well as the total number of features needed for recognition.
Author identification is the process of recognizing an author based on a sample of text. Feature selection is the process of selecting the most salient features required for recognition. In many cases, this results in an increase in recognition accuracy. In this paper, we apply Genetic and Evolutionary Feature Selection with Machine Learning (GEFeS ML) to author identification. We then introduce Genetic Heuristic Development (GHD), a process to improve the matching process. GHD uses subsets of features found by GEFeS ML to create a high-performing heuristic for feature selection. This technique successfully increases recognition accuracy while significantly reducing the number of features required for recognition.
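As a rough illustration of the general idea behind genetic feature selection, the sketch below evolves bit masks over a feature set so that a simple classifier keeps its accuracy while using fewer features. It is not the GEFeS or GEFeS ML implementation from these papers: the toy data, the nearest-centroid classifier, the per-feature penalty, and the operators (tournament selection, uniform crossover, bit-flip mutation, elitism) are all assumptions chosen for brevity, and every function name is hypothetical.

```python
import random


def make_data(rng, n_samples=60, n_features=10, informative=(0, 1)):
    """Toy two-class data: only the `informative` columns depend on the label."""
    X, y = [], []
    for i in range(n_samples):
        label = i % 2
        row = [rng.gauss(0.0, 1.0) for _ in range(n_features)]
        for j in informative:
            row[j] += 3.0 * label  # shift informative features for class 1
        X.append(row)
        y.append(label)
    return X, y


def accuracy(mask, X, y):
    """Training-set accuracy of a nearest-centroid classifier on selected columns."""
    cols = [j for j, bit in enumerate(mask) if bit]
    if not cols:
        return 0.0  # an empty feature set cannot classify anything
    centroids = {}
    for label in (0, 1):
        rows = [x for x, t in zip(X, y) if t == label]
        centroids[label] = [sum(r[j] for r in rows) / len(rows) for j in cols]
    correct = 0
    for x, t in zip(X, y):
        dist = {
            c: sum((x[j] - m) ** 2 for j, m in zip(cols, centroids[c]))
            for c in (0, 1)
        }
        correct += min(dist, key=dist.get) == t
    return correct / len(y)


def fitness(mask, X, y, penalty=0.01):
    """Assumed objective: reward accuracy, charge a small cost per feature kept."""
    return accuracy(mask, X, y) - penalty * sum(mask)


def select_features(X, y, rng, pop_size=20, generations=30, mutation_rate=0.1):
    """Evolve feature masks; returns the best mask in the final population."""
    n = len(X[0])
    score = lambda m: fitness(m, X, y)
    # Seed with the all-features mask so the result is never worse than baseline.
    population = [[1] * n] + [
        [rng.randint(0, 1) for _ in range(n)] for _ in range(pop_size - 1)
    ]
    for _ in range(generations):
        elite = max(population, key=score)
        next_pop = [elite[:]]  # elitism: carry the best mask forward unchanged
        while len(next_pop) < pop_size:
            # Binary tournament selection for each parent.
            p1 = max(rng.sample(population, 2), key=score)
            p2 = max(rng.sample(population, 2), key=score)
            # Uniform crossover followed by bit-flip mutation.
            child = [a if rng.random() < 0.5 else b for a, b in zip(p1, p2)]
            child = [1 - b if rng.random() < mutation_rate else b for b in child]
            next_pop.append(child)
        population = next_pop
    return max(population, key=score)


if __name__ == "__main__":
    rng = random.Random(0)
    X, y = make_data(rng)
    best = select_features(X, y, rng)
    print("selected feature indices:", [j for j, bit in enumerate(best) if bit])
    print("features kept:", sum(best), "of", len(best))
```

For simplicity the fitness here is measured on the training data; a realistic experiment would score each mask on held-out samples so that the reported accuracy is not inflated.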