<p class="Abstract"><span lang="EN-GB">In this competitive scenario of the educational system, the higher education institutes use data mining tools and techniques for academic improvement of the student performance and to prevent drop out. The authors collected data from three colleges of Assam, India. The data consists of socio-economic, demographic as well as academic information of three hundred students with twenty-four attributes. Four classification methods, the J48, PART, Random Forest and Bayes Network Classifiers were used. The data mining tool used was WEKA. The high influential attributes were selected using the tool. The internal assessment attribute in the continuous evaluation process makes the highest impact in the final semester results of the students in our dataset. The results showed that random forest outperforms the other classifiers based on accuracy and classifier errors. Apriori algorithm was also used to find the association rule mining among all the attributes and the best rules were also displayed.<em></em></span></p>
Abstract: Recently, huge amounts of data and their continual growth have increased the importance of information security and data analysis systems for Big Data. An intrusion detection system (IDS) monitors and analyzes data to detect any intrusion in a system or network. The high volume, variety, and velocity of data generated in networks have made the analysis of data to detect attacks by traditional techniques very difficult. Big Data techniques are used in IDS to make the data analysis process accurate and efficient. This paper introduces the Spark-Chi-SVM model for intrusion detection. In this model, ChiSqSelector is used for feature selection, and an intrusion detection model is built with a support vector machine (SVM) classifier on the Apache Spark Big Data platform. The KDD99 dataset was used to train and test the model. The experiment compared the Chi-SVM classifier with a Chi-Logistic Regression classifier. The results showed that the Spark-Chi-SVM model achieves high performance, reduces training time, and is efficient for Big Data.
Introduction: Big Data is data that is difficult to store, manage, and analyze using traditional database and software techniques. Big Data involves high volume, high velocity, and a variety of data, which call for new techniques to deal with it. An intrusion detection system (IDS) is a hardware or software monitor that analyzes data to detect any attack on a system or network. Traditional intrusion detection techniques make the system more complex and less efficient when dealing with Big Data, because the analysis process is complex and takes a long time. The long analysis time leaves the system exposed to harm for some period before any alert is raised [1, 2]. Therefore, using Big Data tools and techniques to analyze and store data in an intrusion detection system can reduce computation and training time. An IDS has three methods for detecting attacks: signature-based detection, anomaly-based detection, and hybrid detection. Signature-based detection is designed to detect known attacks by using the signatures of those attacks. It is an effective method of detecting known attacks that are preloaded in the IDS database, and it is therefore often considered much more accurate at identifying known intrusion attempts.
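The feature-selection step can be illustrated with a minimal, framework-free sketch of chi-square feature scoring. In the actual model this is done with Spark MLlib's ChiSqSelector; the pure-Python version and toy data below are illustrative assumptions:

```python
def chi2_score(feature, labels):
    """Chi-square statistic for one categorical feature column against the labels:
    sum over the contingency table of (observed - expected)^2 / expected."""
    cats_f, cats_y = sorted(set(feature)), sorted(set(labels))
    n = len(labels)
    chi2 = 0.0
    for f in cats_f:
        for y in cats_y:
            observed = sum(1 for a, b in zip(feature, labels) if a == f and b == y)
            expected = (feature.count(f) * labels.count(y)) / n
            chi2 += (observed - expected) ** 2 / expected
    return chi2

def select_top_k(X, y, k):
    """Rank feature columns by chi-square score and keep the top-k column indices,
    mimicking what ChiSqSelector does before the SVM is trained."""
    scores = [chi2_score([row[j] for row in X], y) for j in range(len(X[0]))]
    return sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)[:k]

# hypothetical data: column 0 predicts the label perfectly, column 1 is constant
X = [[0, 1], [0, 1], [1, 1], [1, 1]]
y = [0, 0, 1, 1]
top = select_top_k(X, y, k=1)  # keeps the informative column, index 0
```

Uninformative columns score zero and are dropped, which is what shrinks the SVM's training time on the reduced feature set.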
In the study of content authentication and tamper detection of digital text documents, very few techniques are available that authenticate text content using digital watermarking. This paper proposes a novel intelligent text zero-watermarking approach based on probabilistic patterns for content authentication and tamper detection of English text documents. In the proposed approach, a letter-based Markov model of order three, abbreviated LNMZW3, is constructed for text analysis; it exploits the interrelationships between the contents of a given text to generate the watermark. This watermark can later be extracted using extraction and detection algorithms to determine whether a text document is authentic or has been tampered with. The approach was implemented in PHP using the NetBeans IDE 7.0. The effectiveness and feasibility of the LNMZW3 approach were evaluated and compared with other recent approaches in experiments using five datasets of varying lengths and different volumes of attacks. The results show that the proposed approach always detects tampering attacks that occur randomly in the text, whether the tampering volume is low, medium, or high. Comparison with recent approaches shows that the LNMZW3 approach provides added value under random insertion and deletion attacks in terms of performance, watermark robustness, and watermark security; however, it provides the least improvement under reorder attacks.
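The zero-watermarking idea (deriving the watermark from the text's own order-three letter patterns rather than embedding anything in it) can be sketched as follows. This is a simplified illustration, not the paper's LNMZW3 algorithm; in particular, hashing the transition table into a digest is an assumption made for brevity:

```python
import hashlib
from collections import Counter

def markov3_watermark(text):
    """Zero-watermark sketch: count order-3 letter transitions
    (3-letter state -> next letter), then hash the canonical table."""
    letters = [c for c in text.lower() if c.isalpha()]
    transitions = Counter(
        ("".join(letters[i:i + 3]), letters[i + 3])
        for i in range(len(letters) - 3)
    )
    # canonical serialization of the transition table; its digest is the watermark
    canonical = "|".join(f"{s}->{n}:{c}" for (s, n), c in sorted(transitions.items()))
    return hashlib.sha256(canonical.encode()).hexdigest()

def is_authentic(text, stored_watermark):
    """Tamper check: regenerate the watermark and compare with the stored one."""
    return markov3_watermark(text) == stored_watermark
```

Because the watermark is a function of the text itself, any insertion, deletion, or substitution changes the transition counts and the regenerated watermark no longer matches the stored one.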
As a result of rapid changes in information and communication technology (ICT), the world has become a small village in which people from all over the world connect with each other in dialogue and communication via the Internet. Communication has become a daily routine activity under the new globalization, where companies and even universities become global, residing across countries' borders. Translation has therefore become a necessary activity in this connected world. ICT has made it possible for a student in one country to take a course, or even a degree, from a different country, anytime and anywhere. The resulting communication still needs language as a means of helping the receiver understand the content of the sent message. People need automated translation applications because human translators are hard to find at all times, and human translation is very expensive compared to automated translation. Several lines of research describe the electronic process of machine translation. In this paper, the authors study some of this previous research and explore some of the tools needed for machine translation. This research contributes to the machine translation area by giving future researchers a summary of the machine translation research groups and by shedding light on the importance of the translation mechanism. (Alsohybe et al.; CJAST, 23(4): 1-19, 2017; Article no.CJAST.36124; Original Research Article)
ABSTRACT: The amount of textual information available on the web is estimated in terabytes. Constructing a software program to summarize web pages or electronic documents would therefore be a useful technique, as it would speed up reading, information access, and decision making. This paper investigates a graph-based centrality algorithm for the Arabic text summarization (ATS) problem. The graph-based algorithm extracts the most important sentences in a document or a set of documents (a cluster). The algorithm starts by computing the similarity between every pair of sentences and evaluating the centrality of each sentence in the cluster based on the centrality graph. It then extracts the most important sentences in the cluster to include in the summary. The algorithm was implemented and evaluated both by human participants and by automatic metrics. The Arabic NEWSWIRE-a corpus was used as the dataset in the evaluation, and the results were very promising. General Terms: AI Applications, NLP, Text Mining and AI Algorithms
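The centrality step described above can be sketched with a minimal degree-centrality summarizer over a cosine-similarity graph. The whitespace tokenization, similarity threshold, and example sentences below are illustrative assumptions, not the paper's actual configuration:

```python
import math
from collections import Counter

def cosine(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[w] * b[w] for w in set(a) & set(b))
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def centrality_summary(sentences, top_n=1, threshold=0.1):
    """Degree-centrality summarizer: score each sentence by how many other
    sentences it is similar to (an edge in the graph), keep the top_n."""
    vectors = [Counter(s.lower().split()) for s in sentences]
    scores = [
        sum(1 for j, v in enumerate(vectors)
            if j != i and cosine(vectors[i], v) > threshold)
        for i in range(len(vectors))
    ]
    ranked = sorted(range(len(sentences)), key=lambda i: scores[i], reverse=True)
    return [sentences[i] for i in ranked[:top_n]]

sents = [
    "the cat sat on the mat",
    "the cat is on the mat",
    "dogs bark loudly here",
]
summary = centrality_summary(sents, top_n=1)
```

Sentences that resemble many others sit at the center of the similarity graph and are selected first; the outlier sentence about dogs scores zero and is excluded.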
The acceleration of telecommunication needs has led to many lines of research, especially in communication facilitation and machine translation. When people contact others with different languages and cultures, they need instant translation. However, the available instant translators still provide rather poor Arabic-English translations; for instance, when translating books or articles, the meaning is not entirely accurate. Using semantic web techniques to handle homographs and homonyms semantically, this research aims to extend a model for ontology-based Arabic-English machine translation, named NAN, which simulates the human way of translating. The experimental results show that NAN's translations are closer to human translation than those of the other instant translators. The resulting translations help Non-Arabic natives and Non-English natives obtain texts in the target language that are more nearly correct and semantically closer to human translations.