Increasing progress in numerous research fields and information technologies has led to a rise in the number of published research papers. Consequently, researchers spend considerable time finding papers relevant to their field of specialization. In this paper, we propose a document classification approach that clusters the text documents of research papers into meaningful categories, each covering a similar scientific field. The presented approach is based on the essential focus and scope of the target categories, where each category includes many topics. Accordingly, we extract word tokens from the topics related to each specific category separately. The frequency of word tokens in a document affects the document's weight, which is calculated using the term frequency-inverse document frequency (TF-IDF) numerical statistic. The proposed approach uses the title, abstract, and keywords of each paper, in addition to the categories' topics, to perform the classification. Documents are then classified and clustered into the primary categories based on the highest cosine similarity between the category weight vector and the document weight vectors.
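The weight-and-match pipeline described in this abstract can be sketched briefly. This is a minimal illustration, not the paper's implementation: the tokenization, IDF smoothing, and the toy category/document data are all assumptions.

```python
import math
from collections import Counter

def tfidf_vectors(docs):
    """Compute sparse TF-IDF weight vectors for a list of tokenized documents."""
    n = len(docs)
    df = Counter()
    for doc in docs:
        df.update(set(doc))
    # Add 1 to the IDF so terms appearing in every document keep nonzero weight.
    idf = {t: math.log(n / df[t]) + 1.0 for t in df}
    return [{t: (tf / len(doc)) * idf[t] for t, tf in Counter(doc).items()}
            for doc in docs]

def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    nu = math.sqrt(sum(w * w for w in u.values()))
    nv = math.sqrt(sum(w * w for w in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0

def classify(doc_tokens, category_tokens):
    """Assign each document to the category with the highest cosine similarity."""
    vecs = tfidf_vectors(category_tokens + doc_tokens)
    cat_vecs, doc_vecs = vecs[:len(category_tokens)], vecs[len(category_tokens):]
    return [max(range(len(cat_vecs)), key=lambda c: cosine(d, cat_vecs[c]))
            for d in doc_vecs]
```

In practice the category vectors would be built from the curated topic lists and the document vectors from each paper's title, abstract, and keywords.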
Sensitive data may be stored in different forms, and not only legitimate owners but also malicious actors are interested in obtaining it. Exposing valuable data to others has severe consequences: customers, organizations, and companies lose money and reputation due to data breaches. There are many causes of data leakage; internal threats such as human error and external threats such as DDoS attacks are two main causes of data loss. In general, data can be categorized into three kinds: data in use, data at rest, and data in motion. Data Loss Prevention (DLP) tools are effective at identifying important data. DLP can analyze data content and send feedback to administrators, who can then decide to filter, delete, or encrypt the data. DLP tools are not a final solution to data breaches, but they are considered good security tools for mitigating malicious activity and protecting sensitive information. There are many kinds of DLP techniques, and approximate matching is one of them. Mrsh-v2 is one type of approximate matching; it is implemented and evaluated using the TS dataset and a confusion matrix. Finally, mrsh-v2 achieves a high true-positive rate and sensitivity, and a low false-negative rate.
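Approximate-matching tools such as mrsh-v2 emit a similarity score per file pair; an evaluation like the one described thresholds those scores and tallies a confusion matrix, from which sensitivity and the false-negative rate follow. A minimal sketch, where the threshold and the example counts are illustrative assumptions, not the paper's figures:

```python
def evaluate(scores, labels, threshold=50):
    """Threshold similarity scores (0-100) against ground-truth leak labels
    and count confusion-matrix outcomes."""
    tp = fp = tn = fn = 0
    for score, leaked in zip(scores, labels):
        flagged = score >= threshold
        if flagged and leaked:
            tp += 1
        elif flagged and not leaked:
            fp += 1
        elif not flagged and leaked:
            fn += 1
        else:
            tn += 1
    return tp, fp, tn, fn

def confusion_metrics(tp, fp, tn, fn):
    """Derive the standard rates from confusion-matrix counts."""
    return {
        "sensitivity": tp / (tp + fn),          # true-positive rate (recall)
        "specificity": tn / (tn + fp),
        "precision": tp / (tp + fp),
        "false_negative_rate": fn / (fn + tp),
    }
```

A DLP deployment would feed real matcher scores into `evaluate` and tune the threshold to trade sensitivity against false positives.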
The demand for e-learning services increased during the rapid spread of the COVID-19 virus and the World Health Organization (WHO) recommendation that social distancing be required. The rapid transition to the e-learning environment led to the neglect of some security aspects, which in turn increased cyber attacks targeting cloud computing accounts, one of the most important pillars of e-learning. In this paper, the attacks that target the cloud computing services most important to e-learning are studied and classified according to the victim, using an inductive methodology based on global statistics on cyber attacks and recent research. Appropriate solutions are then suggested to prevent such attacks in the near future and to raise the level of protection of these computing clouds.
Breast cancer is becoming a global epidemic, affecting predominantly women, and the number of people diagnosed with it increases every day. It is therefore critical to have early detection methods in place that can help patients recognize the condition at an early stage, so that they can begin treatment before the disease becomes fatal. Various prediction approaches for the early diagnosis of such diseases have been developed in machine learning. These algorithms employ a variety of computational classifiers and report satisfactory results in certain areas. However, no study has established which computational approach is most effective at detecting breast cancer, so the most effective strategy must be selected from the available options. This paper contributes a performance evaluation of 12 alternative classification strategies on breast cancer datasets and investigates the reasons behind the dominant classifiers' performance.
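Comparisons like the one described rest on measuring each classifier's accuracy on held-out data. As a minimal, library-free sketch of such a harness (the two simple classifiers and the toy points below are illustrative stand-ins, not the paper's twelve strategies or its datasets):

```python
import math

def nearest_centroid(train, labels, x):
    """Predict the class whose training mean is closest to x."""
    centroids = {}
    for c in set(labels):
        pts = [p for p, l in zip(train, labels) if l == c]
        centroids[c] = tuple(sum(col) / len(pts) for col in zip(*pts))
    return min(centroids, key=lambda c: math.dist(centroids[c], x))

def one_nn(train, labels, x):
    """Predict the label of the single closest training point."""
    i = min(range(len(train)), key=lambda j: math.dist(train[j], x))
    return labels[i]

def accuracy(predict, train, labels, test_pts, test_labels):
    """Fraction of held-out points a classifier predicts correctly."""
    hits = sum(predict(train, labels, x) == y
               for x, y in zip(test_pts, test_labels))
    return hits / len(test_pts)
```

Running `accuracy` for each candidate classifier over the same train/test split yields the side-by-side comparison the evaluation calls for.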
The technology world has evolved greatly over the past decades, leading to inflated data volumes. This digital progress has generated scattered texts across millions of web pages, and such unstructured texts contain a vast amount of textual data. Discovering useful and interesting relations in unstructured text requires further processing by computers; therefore, text mining and information extraction have become an exciting research field for obtaining structured, valuable information. This paper focuses on text pre-processing in the automotive advertisement domain to build a structured database. The database is created by extracting information from unstructured automotive advertisements, an area of natural language processing. Information extraction here deals with finding factual information in text using regular expressions. We manually craft rule-based, domain-specific approaches to extract structured information from unstructured web pages. The structured information is then served by a user-friendly search engine designed for topic-specific knowledge. Consequently, the information extracted from these advertisements is used to perform structured searches over attributes of interest. The resulting tuples are assigned a probability and indexed to support efficient extraction and exploration via user queries.
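Rule-based extraction of this kind pairs each target attribute with a hand-crafted regular expression and turns every advertisement into one structured record. A minimal sketch, where the fields and patterns (year, price, mileage) are illustrative assumptions rather than the paper's actual rule set:

```python
import re

# Hand-crafted patterns for a few illustrative ad attributes; real
# domain rules would be more extensive and more defensive.
PATTERNS = {
    "year": re.compile(r"\b(?:19|20)\d{2}\b"),
    "price": re.compile(r"\$\s?([\d,]+)"),
    "mileage": re.compile(r"([\d,]+)\s*(?:miles|km)\b", re.IGNORECASE),
}

def extract(ad_text):
    """Turn one unstructured advertisement into a structured record."""
    record = {}
    for field, pattern in PATTERNS.items():
        m = pattern.search(ad_text)
        if m:
            value = m.group(1) if m.groups() else m.group(0)
            record[field] = value.replace(",", "")
    return record
```

The extracted records would then be scored, indexed, and queried through the topic-specific search engine the paper describes.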
Increased advancement in a variety of study subjects and information technologies has increased the number of published research articles. As a result, researchers face difficulties and devote a significant amount of time to locating scientific publications relevant to their domain of expertise. In this article, a document classification approach is presented that clusters the text documents of research articles into expressive groups covering a similar scientific field. The main focus and scope of the target groups were adopted in designing the proposed method; each group includes several topics. Word tokens were extracted separately from the topics related to each group. The repeated appearance of word tokens in a document affects the document's weight, which is computed using the term frequency-inverse document frequency (TF-IDF) numerical statistic. To perform the categorization, the proposed approach employs each paper's title, abstract, and keywords, as well as the categories' topics. We exploit the K-means clustering algorithm to classify and cluster the documents into primary categories, using the category weights to initialize the cluster centers (centroids). Experimental results show that the suggested technique outperforms the k-nearest neighbors algorithm in information retrieval accuracy.
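The distinctive step here is seeding K-means with the category weight vectors, so each resulting cluster corresponds to a known category from the start. A minimal sketch of Lloyd's algorithm with that initialization, assuming documents and categories have already been converted to dense TF-IDF vectors of equal length (the data below is illustrative):

```python
def kmeans(docs, init_centroids, iters=10):
    """Lloyd's k-means with centroids seeded from category weight vectors.
    Returns the cluster (category) index assigned to each document."""
    centroids = [list(c) for c in init_centroids]
    assign = [0] * len(docs)
    for _ in range(iters):
        # Assignment step: nearest centroid by squared Euclidean distance.
        for i, d in enumerate(docs):
            assign[i] = min(
                range(len(centroids)),
                key=lambda k: sum((a - b) ** 2 for a, b in zip(d, centroids[k])),
            )
        # Update step: move each centroid to the mean of its members.
        for k in range(len(centroids)):
            members = [docs[i] for i in range(len(docs)) if assign[i] == k]
            if members:
                centroids[k] = [sum(col) / len(members) for col in zip(*members)]
    return assign
```

Because the centroids start at the category weights rather than random points, the cluster labels are directly interpretable as the primary categories.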