The rapid growth of web-based content is pushing businesses to adopt new approaches and resources that support better navigation, processing, and handling of high-dimensional data. On the Internet, 90% of the data is unstructured, and several approaches can translate it into useful, structured data; classification is one of them. Organizing knowledge into a well-defined set of groups is both significant and necessary. As machine-readable documents proliferate, automatic text classification is badly needed to organize them. Text classification, a supervised learning technique, assigns unlabeled documents to predefined classes learned from labeled documents. This paper reviews existing approaches for classifying online news articles and discusses a framework for their automatic classification. To achieve high accuracy, different classifiers were tried. Our experimental method achieved 93% accuracy using a Bayesian classifier, and the results are presented in terms of a confusion matrix.
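The abstract does not say which Bayesian variant, features, or dataset were used. As a rough illustration only, the sketch below assumes a multinomial Naive Bayes classifier with Laplace (add-one) smoothing over bag-of-words counts, and reports its predictions as a confusion matrix. The four training headlines, the two test headlines, and all names (`tokenize`, `NaiveBayes`, `confusion_matrix`) are hypothetical and are not taken from the paper.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and split on anything that is not a letter."""
    return "".join(c.lower() if c.isalpha() else " " for c in text).split()


class NaiveBayes:
    """Multinomial Naive Bayes with add-one smoothing (an assumed variant)."""

    def fit(self, docs, labels):
        self.class_counts = Counter(labels)        # documents per class
        self.word_counts = defaultdict(Counter)    # token counts per class
        for doc, label in zip(docs, labels):
            self.word_counts[label].update(tokenize(doc))
        self.vocab = {w for counts in self.word_counts.values() for w in counts}
        return self

    def predict(self, doc):
        total_docs = sum(self.class_counts.values())
        tokens = [t for t in tokenize(doc) if t in self.vocab]  # skip unseen words
        best_label, best_score = None, -math.inf
        for label in sorted(self.class_counts):
            counts = self.word_counts[label]
            denom = sum(counts.values()) + len(self.vocab)
            # log prior + sum of smoothed log likelihoods
            score = math.log(self.class_counts[label] / total_docs)
            score += sum(math.log((counts[t] + 1) / denom) for t in tokens)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


def confusion_matrix(true_labels, predicted, classes):
    """Rows are true classes, columns are predicted classes."""
    index = {c: i for i, c in enumerate(classes)}
    matrix = [[0] * len(classes) for _ in classes]
    for t, p in zip(true_labels, predicted):
        matrix[index[t]][index[p]] += 1
    return matrix


# Hypothetical toy data, invented for this sketch.
train_docs = [
    "team wins the championship match",
    "striker scores late goal in final",
    "stocks fall as markets react to rates",
    "bank reports quarterly profit growth",
]
train_labels = ["sports", "sports", "business", "business"]
test_docs = ["goal wins the match", "markets and bank profit"]
test_labels = ["sports", "business"]

model = NaiveBayes().fit(train_docs, train_labels)
predictions = [model.predict(d) for d in test_docs]
classes = ["business", "sports"]
matrix = confusion_matrix(test_labels, predictions, classes)
accuracy = sum(matrix[i][i] for i in range(len(classes))) / len(test_docs)
print(predictions, matrix, accuracy)
```

The diagonal of the matrix counts correct predictions, so accuracy is the diagonal sum divided by the number of test documents. A result on a toy set this small says nothing about the 93% figure reported above.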
In recent years, vast and complex amounts of data have been created, making it difficult for traditional data processing applications to manage them. The advent of the Internet prompted a massive spike in the volume of information created and made accessible. The World Wide Web Consortium (W3C), the international standardization body of the web, promotes the Semantic Web, an extended form of the current web that provides an easier way to search, reuse, combine, and share information. In the last few years, major business corporations have shown interest in incorporating Semantic Web technology with Big Data for added value. This incorporation has several benefits: it increases end users' ability to self-manage data from various sources; it focuses on changing business environments and varying user needs; and it handles concepts and relationships and manages terminology while connecting data from varied sources. For Social Network Analysis (SNA), new methods that combine Big Data and Semantic Web technologies are needed as a way to utilize and add capacities to existing frameworks. Moreover, fast-changing business requirements and the current industry culture of Agile development need a robust yet flexible solution for Business Intelligence, and Data Warehousing can be incorporated by using distributed enterprise-level ontologies. This paper focuses on the effects of incorporating Big Data with the Semantic Web, how the Semantic Web makes Big Data smarter, the challenges and opportunities of Big Data and the Semantic Web, and the relationship between them; it concludes with future directions for this integration.