The big data concept has prompted studies on how to extract valuable information from huge datasets accurately and efficiently. A major problem in big data mining is high dimensionality, that is, the large number of features in such datasets. High dimensionality reduces the accuracy of machine learning (ML) classifiers and wastes time because many of the features are redundant. A fast feature reduction method can potentially solve this problem. Hence, this study presents fast HP-PL, a new hybrid parallel feature reduction framework that uses Apache Spark to perform feature reduction on shared/distributed-memory clusters. Evaluation of the proposed HP-PL on the KDD99 dataset showed it to be significantly faster than conventional feature reduction techniques. The proposed technique required 1 minute to select 4 features from over 79 features and 3,000,000 samples on a 3-node cluster (21 cores in total), whereas the comparative algorithm required more than 2 hours to achieve the same result. In the proposed system, the Hadoop Distributed File System (HDFS) provides distributed storage, while Apache Spark serves as the computing engine. The model was developed as a parallel model that takes full account of the high performance and throughput of distributed computing. In conclusion, the proposed HP-PL method can achieve good accuracy with less memory and time than conventional feature reduction methods. The tool is publicly available at https://github.com/ahmed/Fast-HP-PL.
Hospital location selection for COVID-19-infected patients has turned out to be one of the most critical decisions for healthcare sectors in high-case countries. In this study, the optimal urban hospital location for COVID-19-infected patients is selected from multiple alternative locations in the city of Baghdad, Iraq, by introducing a web application system that finds the best site among the alternatives using the MEREC algorithm and a modified technique for order of preference by similarity to ideal solution (TOPSIS). The MEREC algorithm is used to obtain the criteria weights, and the modified TOPSIS ranks the alternatives. Four criteria are considered with eight alternative sites. The proposed system has two parts. The first is a hardware part (an embedded system) built from a NEO-6M GPS receiver and an ESP8266 NodeMCU, which obtains the coordinates of regions and submits them to a database server over HTTP. The second part is a web application developed with PHP, JavaScript, CSS, HTML, and MySQL, which allows the system admin to enter the locations of the alternatives with their criteria values and obtain the best urban hospital location for COVID-19 patients. The results showed the effectiveness of the overall suggested system and the suitability of the modified TOPSIS method over the traditional TOPSIS method for ranking the alternatives.
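To make the weighting-then-ranking pipeline concrete, the following is a minimal sketch of MEREC weights feeding a ranking step. It rests on several assumptions: the abstract does not specify how TOPSIS was modified, so the ranking function here is the traditional TOPSIS, not the authors' variant; the function names `merec_weights` and `topsis_scores`, the three-site decision matrix, and the benefit/cost labels are all hypothetical and are not the study's data.

```python
import math

def merec_weights(matrix, is_benefit):
    """MEREC criteria weights from a decision matrix of positive values."""
    n_crit = len(matrix[0])
    cols = list(zip(*matrix))
    # Normalise so that, for every criterion, a smaller value is better.
    norm = [
        [
            min(cols[j]) / row[j] if is_benefit[j] else row[j] / max(cols[j])
            for j in range(n_crit)
        ]
        for row in matrix
    ]
    logs = [[abs(math.log(v)) for v in row] for row in norm]
    # Overall performance of each alternative with all criteria present.
    overall = [math.log(1 + sum(row) / n_crit) for row in logs]
    # Removal effect: how much performances change when criterion j is dropped.
    effects = []
    for j in range(n_crit):
        effect = 0.0
        for i, row in enumerate(logs):
            without_j = math.log(1 + (sum(row) - row[j]) / n_crit)
            effect += abs(without_j - overall[i])
        effects.append(effect)
    total = sum(effects)
    return [e / total for e in effects]

def topsis_scores(matrix, weights, is_benefit):
    """Traditional TOPSIS closeness scores (higher is better)."""
    n_crit = len(weights)
    cols = list(zip(*matrix))
    # Vector normalisation of each criterion column, then weighting.
    norms = [math.sqrt(sum(v * v for v in col)) for col in cols]
    weighted = [
        [weights[j] * row[j] / norms[j] for j in range(n_crit)]
        for row in matrix
    ]
    wcols = list(zip(*weighted))
    best = [max(c) if is_benefit[j] else min(c) for j, c in enumerate(wcols)]
    worst = [min(c) if is_benefit[j] else max(c) for j, c in enumerate(wcols)]
    scores = []
    for row in weighted:
        d_best = math.dist(row, best)    # distance to the ideal solution
        d_worst = math.dist(row, worst)  # distance to the anti-ideal solution
        scores.append(d_worst / (d_best + d_worst))
    return scores

# Hypothetical data: 3 candidate sites, 2 criteria (not the study's values).
# Criterion 0 is treated as a benefit, criterion 1 as a cost.
sites = [[10, 2], [8, 5], [6, 9]]
is_benefit = [True, False]

weights = merec_weights(sites, is_benefit)
scores = topsis_scores(sites, weights, is_benefit)
ranking = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
print(ranking)
```

In this toy matrix the first site is best on both criteria, so it ranks first under any positive weights; with real data the MEREC weights determine how trade-offs between criteria are resolved.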