Training a machine-learning model on records with many attributes is often problematic. As the number of features grows, the model becomes increasingly susceptible to overfitting, since not all features are informative; some merely add noise to the data. Dimensionality reduction techniques are used to overcome this problem. This paper presents a detailed comparative study of nine dimensionality reduction methods: missing-values ratio, low-variance filter, high-correlation filter, random forest, principal component analysis, linear discriminant analysis, backward feature elimination, forward feature construction, and rough set theory. The effects of these methods on both training and testing performance were compared on two datasets and three models: an Artificial Neural Network (ANN), a Support Vector Machine (SVM), and a Random Forest Classifier (RFC). The results show that the RFC model achieved dimensionality reduction while limiting overfitting, and it showed general improvements in both accuracy and efficiency over the compared approaches. Overall, the results reveal that dimensionality reduction can reduce overfitting while keeping performance close to, or better than, that obtained with the original feature set.
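As a concrete illustration of the filter-style methods in this comparison, the sketch below applies a low-variance filter using scikit-learn's VarianceThreshold; the threshold value and toy data are illustrative assumptions, not the paper's settings.

```python
# Minimal sketch of the low-variance filter, one of the nine methods
# compared in the paper. Threshold and data are illustrative only.
import numpy as np
from sklearn.feature_selection import VarianceThreshold

X = np.array([
    [0.0, 2.1, 1.0],
    [0.0, 1.9, 0.0],
    [0.0, 2.0, 1.0],
    [0.0, 2.2, 0.0],
])  # first column is constant, so it carries no information

selector = VarianceThreshold(threshold=0.01)  # drop features with variance below 0.01
X_reduced = selector.fit_transform(X)
print(selector.get_support())  # mask of retained features: [False  True  True]
print(X_reduced.shape)         # (4, 2): the constant column was removed
```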
Real-world data analysis and processing with data mining techniques often face observations that contain missing values; their presence is a central challenge in mining datasets. Missing values should be imputed to improve the accuracy and performance of data mining methods. Existing techniques that impute missing values with the k-nearest neighbors algorithm face the challenge of determining an appropriate k, while other imputation techniques rely on hard clustering algorithms, which often describe poorly separated records, as in the case of missing data, inadequately. In general, imputation based on similar records is more accurate than imputation based on all records in the dataset, so improving the similarity among records can improve imputation performance. This paper proposes two numerical missing data imputation methods. First, a hybrid method called KI is proposed, which combines the k-nearest neighbors (kNN) and iterative imputation algorithms. The best set of nearest neighbors for each incomplete record is found through record similarity using kNN, with a suitable k estimated automatically to improve that similarity. Iterative imputation then fills the missing values of the incomplete records using the global correlation structure among the selected records. Second, an enhanced hybrid method called FCKI, an extension of KI, is proposed. It integrates fuzzy c-means, k-nearest neighbors, and iterative imputation to impute missing data. Fuzzy c-means is chosen because records can belong to multiple clusters at the same time, which can further improve similarity; FCKI searches within a cluster, rather than the whole dataset, for the best k nearest neighbors, applying two levels of similarity to achieve higher imputation accuracy. The performance of the proposed techniques is assessed on fifteen datasets of different sizes, with varying missing ratios and three types of missing data (MCAR, MAR, and MNAR) generated in this work. The proposed techniques are compared with other missing data imputation methods using three measures: the root mean square error (RMSE), the normalized root mean square error (NRMSE), and the mean absolute error (MAE). The results show that the proposed methods achieve better imputation accuracy and require significantly less time than other missing data imputation methods.
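The paper's KI and FCKI pipelines are not reproduced here; the sketch below only illustrates their two building blocks, kNN-based and iterative imputation, using scikit-learn's KNNImputer and IterativeImputer. The fixed k and random data are assumptions: KI estimates k automatically, and FCKI additionally restricts the neighbor search to a fuzzy c-means cluster.

```python
# Illustrative sketch of KI's two building blocks (not the paper's code):
# kNN imputation exploits similarity among records, and iterative
# imputation exploits the global correlation structure among features.
import numpy as np
from sklearn.experimental import enable_iterative_imputer  # noqa: F401
from sklearn.impute import KNNImputer, IterativeImputer

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5))
X[rng.random(X.shape) < 0.1] = np.nan  # ~10% values missing (MCAR-style)

# Step 1: fill each gap from the k most similar records (k fixed here).
knn_filled = KNNImputer(n_neighbors=5).fit_transform(X)

# Step 2: model each feature from the others and refine iteratively.
iter_filled = IterativeImputer(max_iter=10, random_state=0).fit_transform(X)
```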
The success of any e-learning system depends on retrieving learning content relevant to the requirements of the learner (user). This motivates adaptive e-learning systems that provide learning materials suited to each learner's requirements and level of understanding. This chapter proposes a system for personalized semantic search and recommendation of learning content in Web-based e-learning systems. Semantic, personalized search of learning content is based on expanding the query keywords using the semantic relations and reasoning mechanisms in the ontology, while personalized recommendation of learning objects is based on a learner-profile ontology that guides which learning content a learner should study. For the Arab world to achieve learning-for-all goals and meet learners' requirements, learning systems must become more inclusive, incorporate personalization services, and offer semantic learning content. The authors' proposed system is efficient, more effective, and more learner-friendly in the Arab sector because it responds to each learner's individual needs with timely and precise adaptation of learning materials.
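To make the query-expansion idea concrete, the toy sketch below expands keywords through a hypothetical relation map; this stands in for the chapter's ontology, where a real system would traverse semantic relations (synonyms, narrower terms) in an OWL/RDF ontology via a reasoner.

```python
# Toy illustration of ontology-based query expansion. The relation map
# is a hypothetical stand-in for the chapter's e-learning ontology.
ONTOLOGY = {
    "sorting": {"synonyms": ["ordering"], "narrower": ["quicksort", "mergesort"]},
    "recursion": {"synonyms": ["self-reference"], "narrower": ["tail recursion"]},
}

def expand_query(keywords):
    """Expand each keyword with its semantically related terms."""
    expanded = []
    for kw in keywords:
        expanded.append(kw)
        relations = ONTOLOGY.get(kw.lower(), {})
        expanded.extend(relations.get("synonyms", []))
        expanded.extend(relations.get("narrower", []))
    return expanded

print(expand_query(["sorting"]))  # ['sorting', 'ordering', 'quicksort', 'mergesort']
```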
Nowadays, an unprecedented number of users interact through social media platforms and, with the explosion of online communication, generate a massive amount of content. Because user-generated content is unregulated, however, it may contain offensive material such as fake news, insults, and harassment. Identifying fake news and rumors and their dissemination on social media has become a critical requirement, as they adversely affect users, businesses, enterprises, and even political regimes and governments. Prior state-of-the-art work has focused on English-language news and feature-based algorithms. This paper proposes a model architecture to detect fake news in the Arabic language using only textual features. Both machine learning and deep learning algorithms were used; the deep learning models are based on convolutional neural networks (CNN), long short-term memory (LSTM), bidirectional LSTM (BiLSTM), CNN+LSTM, and CNN+BiLSTM. Three datasets containing the textual content of Arabic news articles were used in the experiments, one of them consisting of real-life data. The results indicate that the BiLSTM model outperforms the other models in accuracy under both simple data-split and recursive training modes.
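The sketch below shows a minimal Keras BiLSTM classifier of the kind this paper evaluates; the vocabulary size, sequence length, and hyperparameters are illustrative assumptions, not the paper's settings, and tokenization of the Arabic text is assumed to happen upstream.

```python
# Minimal sketch of a BiLSTM binary classifier for fake-news detection.
# All sizes and hyperparameters are illustrative, not the paper's values.
import tensorflow as tf
from tensorflow.keras import layers

VOCAB_SIZE = 20_000  # assumed tokenizer vocabulary size
MAX_LEN = 200        # assumed padded sequence length

model = tf.keras.Sequential([
    layers.Input(shape=(MAX_LEN,)),          # integer-encoded token IDs
    layers.Embedding(VOCAB_SIZE, 128),       # learned word embeddings
    layers.Bidirectional(layers.LSTM(64)),   # reads the text in both directions
    layers.Dropout(0.5),
    layers.Dense(1, activation="sigmoid"),   # fake (1) vs. real (0)
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
model.summary()
```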
Biometric identification is a strong candidate technology for trusted user authentication with minimal constraints on the security of the access point. However, most biometric identification techniques require special hardware, which complicates the access point and makes it costly. Keystroke recognition is a biometric identification technique that relies on the user's behavior while typing on the keyboard; it is more secure and requires no additional hardware at the access point. This paper presents a behavioral biometric authentication method that identifies the user through Keystroke Static Authentication (KSA) and describes an authentication system demonstrating the ability of the keystroke technique to authenticate the user against a template profile stored in the database. An algorithm based on dynamic keystroke analysis is also presented, synthesized, simulated, and implemented on a Field Programmable Gate Array (FPGA). The proposed algorithm was tested on 25 individuals, achieving a False Rejection Rate (FRR) of about 4% and a False Acceptance Rate (FAR) of about 0%, with the same sample text used for all individuals. Two methods are used to implement the proposed approach: method one (a hardware-based sorter) and method two (a software-based sorter), which achieved execution times on the FPGA board of about 50.653 ns and 9.650 ns, respectively. Method two achieved a lower execution time than some published results; since it combines a small execution time with low area utilization, it is the preferred method for the FPGA implementation.
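The paper's system is an FPGA design, so the Python sketch below is only a conceptual illustration of the timing features keystroke dynamics typically rely on (dwell and flight times); the event format and timestamps are assumptions, and the comparison against a stored template profile is omitted.

```python
# Conceptual sketch of keystroke-dynamics feature extraction. The
# paper's implementation is in FPGA hardware; this only illustrates
# the timing features such a system could match against a template.
def keystroke_features(events):
    """events: list of (key, press_time_ms, release_time_ms), in typing order."""
    dwell = [rel - press for _, press, rel in events]   # how long each key is held
    flight = [events[i + 1][1] - events[i][2]           # release-to-next-press gaps
              for i in range(len(events) - 1)]
    return dwell, flight

# Example: a user typing "abc" with assumed timestamps in milliseconds.
sample = [("a", 0, 90), ("b", 150, 230), ("c", 300, 395)]
print(keystroke_features(sample))  # ([90, 80, 95], [60, 70])
```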