Text classification has played a key role in various fields, such as news classification, spam detection, and sentiment analysis. However, the classification of crime news continues to pose challenges, including low efficiency, low precision, and the scarcity of large-scale, high-quality annotated data. Using pre-trained language models, such as Bidirectional Encoder Representations from Transformers (BERT), has reduced the need for extensive amounts of labelled data in the categorization process. BERT offers strong contextual representations and excels in text classification tasks, particularly when labelled data is limited. A BERT-based pre-trained language model was applied to categorize crimes using information gathered from Malaysian online newspapers, overcoming the shortage of high-quality, large-scale crime-related labelled data. The crime-related labelled dataset used for training this model was compiled from BERNAMA (the Malaysian National News Agency) and manually labelled by crime investigation experts into 12 categories, including a non-crime class. The experimental results showed that the BERT-based model outperformed previous models, achieving the highest performance with an accuracy of 99.45%. This highlights the efficacy of BERT in classifying crime news, even with a small dataset.
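The approach described above, fine-tuning a pre-trained BERT encoder with a 12-way classification head, can be sketched with the Hugging Face Transformers library as below. This is a minimal illustration, not the authors' implementation: the abstract does not name the 12 categories (only that a non-crime class is included), so the label names here are hypothetical placeholders, and the checkpoint name is an assumption.

```python
# Hedged sketch: setting up a BERT model for 12-way crime news
# classification. The category names below are hypothetical placeholders;
# the paper only states that there are 12 classes, one of them non-crime.

LABELS = [
    "non-crime", "theft", "robbery", "assault", "murder", "fraud",
    "drugs", "kidnapping", "sexual-crime", "vandalism", "arson", "cybercrime",
]
label2id = {name: i for i, name in enumerate(LABELS)}
id2label = {i: name for name, i in label2id.items()}


def build_classifier(model_name: str = "bert-base-cased"):
    """Load a pre-trained BERT encoder with a fresh 12-way classification
    head; the head's weights are randomly initialised and would be trained
    on the labelled crime news dataset."""
    # Imported lazily so the label bookkeeping above works without the
    # (large) transformers dependency installed.
    from transformers import AutoModelForSequenceClassification, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForSequenceClassification.from_pretrained(
        model_name,
        num_labels=len(LABELS),
        id2label=id2label,
        label2id=label2id,
    )
    return tokenizer, model
```

The model returned by `build_classifier` would then be fine-tuned on the expert-labelled BERNAMA articles with a standard cross-entropy objective, which is the usual recipe for BERT-based text classification with limited labelled data.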