With the abundance of exceptionally high-dimensional data, feature selection has become an essential element of the data mining process. In this paper, we investigate the problem of efficient feature selection for classification on high-dimensional datasets. We present a novel filter-based approach that ranks the features by a score, and we then measure the performance of four different data mining classification algorithms on the resulting data. In the proposed approach, we partition the sorted feature list and search for important features in both the forward and the reverse direction, starting simultaneously from the first and the last feature in the sorted list. The proposed approach is highly scalable and effective because it parallelizes over both attributes and tuples simultaneously, allowing us to evaluate many potential features for high-dimensional datasets. Experiments on real and synthetic high-dimensional datasets show that the proposed framework improves the precision of the selected features. We have also compared its classification accuracy against that of various other feature selection methods.
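The abstract does not specify the scoring function or the stopping rule, so the following is only a minimal sketch of the general idea: rank features by a filter score, then scan the ranking from both ends at once. The score used here (absolute Pearson correlation with the label), the `threshold` parameter, and the names `abs_correlation` and `bidirectional_select` are all assumptions made for illustration, and the sketch is sequential rather than parallel.

```python
from math import sqrt


def abs_correlation(xs, ys):
    """Absolute Pearson correlation between one feature column and the labels.

    Assumed filter score; the paper's actual score is not given in the abstract.
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant column carries no signal
    return abs(cov) / sqrt(var_x * var_y)


def bidirectional_select(columns, labels, threshold):
    """Rank features by score, then scan the ranking from both ends.

    Returns the sorted indices of features whose score is >= threshold.
    """
    scores = [abs_correlation(col, labels) for col in columns]
    # Best-scoring feature first.
    ranked = sorted(range(len(columns)), key=lambda i: scores[i], reverse=True)

    selected = []
    front, back = 0, len(ranked) - 1
    while front <= back:
        # Forward step: if the best unexamined feature fails, so do all the rest.
        if scores[ranked[front]] < threshold:
            break
        selected.append(ranked[front])
        front += 1
        if front > back:
            break
        # Reverse step: if the worst unexamined feature passes, so do all the rest.
        if scores[ranked[back]] >= threshold:
            selected.extend(ranked[front:back + 1])
            break
        back -= 1  # discard the worst unexamined feature
    return sorted(selected)


# Hypothetical toy data: four feature columns over four tuples.
labels = [0, 1, 0, 1]
columns = [
    [0, 1, 0, 1],  # identical to the labels
    [5, 5, 5, 5],  # constant
    [1, 1, 0, 0],  # uncorrelated with the labels
    [1, 0, 1, 0],  # inverted labels
]
print(bidirectional_select(columns, labels, threshold=0.5))
```

Because the list is sorted, either end of the scan can end the search early, which is one plausible reason to search in both directions; whether the paper uses this rule is not stated.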
Recommender systems have developed alongside the growth of the World Wide Web. Recently, Web 3.0, or the Semantic Web, has changed the traditional approaches to recommendation by leveraging the knowledge in the Linked Open Data cloud, which consists of domain-specific and cross-domain interconnected datasets. It comprises thousands of RDF triples and millions of links (external and internal) that connect this open source data. Our literature survey found that recommender systems based on the Linked Open Data cloud do not handle this knowledge base efficiently because of data sparsity and inconsistency. These problems result from the automatic generation of Resource Description Framework (RDF) data from unstructured documents, which produces garbage data that is of no use for recommendation. This paper explores a hybrid recommender that can be used both as a rating predictor and as a movie recommender on RDF datasets. We also present a new recommender system model that not only uses the DBpedia knowledge base but also removes the aforementioned problems by applying a preprocessing technique for sparsity removal. To demonstrate the correctness and accuracy of our model, we have implemented it and tested it against previous methodologies. To make our algorithm efficient, we also used different data structures for storage and processing.
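The abstract does not describe the sparsity-removal step itself. As a rough illustration of what such preprocessing could look like, the sketch below drops RDF properties that are attached to very few items and then drops items left with too few facts. The function name `remove_sparse`, both thresholds, and the sample triples are hypothetical and are not taken from the paper.

```python
from collections import defaultdict


def remove_sparse(triples, min_items_per_property=2, min_properties_per_item=1):
    """Drop rarely used properties, then items left with too few properties.

    `triples` is a list of (subject, property, object) tuples.
    Both thresholds are illustrative assumptions.
    """
    # Count how many distinct subjects use each property.
    subjects_by_property = defaultdict(set)
    for subject, prop, _ in triples:
        subjects_by_property[prop].add(subject)
    kept_props = {
        prop
        for prop, subjects in subjects_by_property.items()
        if len(subjects) >= min_items_per_property
    }
    dense = [t for t in triples if t[1] in kept_props]

    # Count how many distinct properties each remaining subject still has.
    props_by_subject = defaultdict(set)
    for subject, prop, _ in dense:
        props_by_subject[subject].add(prop)
    return [
        t for t in dense
        if len(props_by_subject[t[0]]) >= min_properties_per_item
    ]


# Hypothetical movie triples; "noise" and "odd" each occur on a single item.
triples = [
    ("movie1", "director", "person1"),
    ("movie2", "director", "person2"),
    ("movie1", "noise", "x"),
    ("movie3", "odd", "y"),
]
cleaned = remove_sparse(triples)
print(cleaned)
```

A frequency cutoff like this is only one of several ways to reduce sparsity; it trades coverage of rare items for a denser feature space.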
The goal of this project is to use Semantic Web technologies and data mining for disease diagnosis, assisting health care professionals in choosing the medication to prescribe (drug recommendation) according to the features of the patient. Numerous decision support systems (DSS) and expert systems support medical collaboration, for example in specific or general differential diagnosis. However, a medical recommendation system that uses both Semantic Web technologies and data mining has not yet been developed, which motivated this work. Several reference systems cover medicines or active-ingredient interactions, but their final goal is not drug recommendation using the above technologies. With this project we aim to provide an assistant that helps the doctor make better recommendations. Patients will also be able to use this system for explanations of drugs, food interactions, and the side effects of the corresponding drugs.