The volume of information available as online news needs to be matched by readers' ability to distinguish subjective from objective news. A dedicated system for classifying the objectivity of online news can therefore help readers tell the two apart. This research proposes a machine learning technique that classifies news objectivity automatically from the content of the news. The proposed algorithm is K-Nearest Neighbor (KNN). The news samples, obtained by scraping kompas.com, suffer from class imbalance: the numbers of objective and subjective news items are not balanced, which can affect the performance of the classification algorithm. One technique for overcoming class imbalance is the Synthetic Minority Over-sampling Technique (SMOTE), which generates synthetic minority-class data until the minority class is as large as the majority class. This study compares the performance of the KNN algorithm without SMOTE and with SMOTE. Using neighbor counts k of 1, 3, 5, 7, and 9, the study found that applying SMOTE improved the accuracy of KNN at k = 1 and k = 3, with an average increase of 3.36. At k = 5, 7, and 9, accuracy decreased by an average of 6.67.
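The oversampling and classification steps described in this abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the two-dimensional points, the labels "majority" and "minority", and the function names `smote` and `knn_predict` are all hypothetical, and real news text would first have to be converted into numeric feature vectors.

```python
import math
import random
from collections import Counter


def smote(minority, n_synthetic, k=5, seed=0):
    """Create n_synthetic points by interpolating between minority samples."""
    if len(minority) < 2:
        raise ValueError("need at least two minority samples to interpolate")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)  # cannot use more neighbours than exist
    synthetic = []
    for _ in range(n_synthetic):
        base = rng.choice(minority)
        # k nearest minority neighbours of the base sample (excluding itself)
        others = sorted(
            (p for p in minority if p is not base),
            key=lambda p: math.dist(base, p),
        )
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append(
            tuple(b + gap * (n - b) for b, n in zip(base, neighbour))
        )
    return synthetic


def knn_predict(points, labels, query, k=3):
    """Return the majority label among the k training points nearest to query."""
    ranked = sorted(zip(points, labels), key=lambda pl: math.dist(pl[0], query))
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Hypothetical toy data: six majority-class points, three minority-class points.
majority = [(0.0, 0.0), (0.2, 0.1), (0.1, 0.3), (0.3, 0.2), (0.2, 0.4), (0.4, 0.1)]
minority = [(1.0, 1.0), (1.2, 0.9), (0.9, 1.1)]

# Oversample the minority class until both classes have the same size.
synthetic = smote(minority, len(majority) - len(minority), k=2)
points = majority + minority + synthetic
labels = ["majority"] * len(majority) + ["minority"] * (len(minority) + len(synthetic))

prediction = knn_predict(points, labels, (1.05, 1.0), k=3)
```

Because every synthetic point lies on a segment between two existing minority samples, oversampling changes the neighborhoods that KNN votes over, which is one plausible reason the effect of SMOTE could differ across values of k.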
People are now trying to make the most of Virtual Learning Environments. A Virtual Learning Environment is no longer only a tool that supports the learning system; it has become a place of learning itself. With this change in the learning system, however, teachers now have difficulty monitoring student activity and the learning material. Nevertheless, some data are considered suitable as a benchmark for students and their interaction with Virtual Learning Activity. This paper builds a prediction using the Naïve Bayes and C4.5 algorithms on students' web history data and the total number of their webpage interactions in the Virtual Learning Environment.
CNN originates from image processing and is not commonly known as a forecasting technique in time-series analysis, where results depend on the quality of the input data. One way to improve that quality is to smooth the data. This study introduces a novel hybrid of exponential smoothing and CNN called Smoothed-CNN (S-CNN). Combined methods of this kind outperform the majority of individual solutions in forecasting. S-CNN was compared with the original CNN method and with other forecasting methods such as Multilayer Perceptron (MLP) and Long Short-Term Memory (LSTM). The dataset is a one-year time series of daily website visitors. Since there is no specific rule for choosing the number of hidden layers, Lucas numbers were used. The results show that S-CNN is better than MLP and LSTM, with the best MSE of 0.012147693 using 76 hidden layers at an 80%:20% data composition.
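Two ingredients named in this abstract, exponential smoothing and Lucas numbers, can be illustrated with a short sketch. It assumes simple (single) exponential smoothing with a hypothetical smoothing factor `alpha`; the abstract does not say which smoothing variant or parameter S-CNN uses, and the visitor counts below are made up.

```python
def exponential_smoothing(series, alpha):
    """Simple exponential smoothing: s_t = alpha * x_t + (1 - alpha) * s_(t-1)."""
    if not 0.0 < alpha <= 1.0:
        raise ValueError("alpha must be in (0, 1]")
    if not series:
        return []
    smoothed = [series[0]]  # the first smoothed value is the first observation
    for x in series[1:]:
        smoothed.append(alpha * x + (1 - alpha) * smoothed[-1])
    return smoothed


def lucas_numbers(count):
    """Return the first `count` Lucas numbers: 2, 1, 3, 4, 7, 11, ..."""
    a, b = 2, 1
    result = []
    for _ in range(count):
        result.append(a)
        a, b = b, a + b
    return result


# Hypothetical daily visitor counts, smoothed before being fed to a model.
visitors = [10, 20, 30]
smoothed_visitors = exponential_smoothing(visitors, alpha=0.5)

# Candidate layer counts drawn from the Lucas sequence; 76 is the tenth term.
candidate_layer_counts = lucas_numbers(10)
```

Drawing candidates from a fixed integer sequence gives a small, reproducible search grid, which is consistent with the abstract's remark that no specific rule exists for choosing the number of hidden layers.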
Abstract: The liver is one of the vital organs of the human body; it detoxifies or neutralizes toxins from everything that enters the body, keeping the body healthy. The liver can be attacked by diseases that impair its function, and once liver disease sets in, toxins spread throughout the body and make it unhealthy. Liver disease is a disease of the liver caused by viruses, alcohol, lifestyle, and other factors. According to data from the WHO (World Health Organization), nearly 1.2 million people per year, particularly in Southeast Asia and Africa, die from liver disease. People often do not realize, or learn too late, that they have liver disease, so that by the time they are examined the disease is already severe; it would be better to begin treatment earlier by recognizing the symptoms. Data mining can make liver disease diagnosis easier, in particular by helping doctors determine whether a patient whose symptoms closely resemble liver disease actually has it. Diagnosis is performed as a classification task whose output is whether or not the patient has liver disease. This study uses four data mining algorithms: Naïve Bayes, K-Nearest Neighbor (KNN), Decision Tree, and Neural Network. The dataset is the Indian Liver Patient Dataset (ILPD) from the UCI Machine Learning Repository website. The four algorithms are compared to determine which has the best accuracy for diagnosing liver disease. The results show accuracies of 55.42% for Naïve Bayes, 66.03% for K-Nearest Neighbor, 72.74% for Decision Tree, and 69.64% for Neural Network.
These accuracies are relatively low because the classes, or labels, of liver disease patients and patients without liver disease are imbalanced: the liver disease class is larger than the class of patients without liver disease, so many records are classified as liver disease patients.
E-learning management systems are a form of technological progress in education and have generated large collections of educational data, one of which is data on students' learning activity within the E-learning management system. The large amount of educational data that has not yet been well explored can be exploited using data mining techniques. This study compares three different data models: the original data without preprocessing, and data preprocessed with feature selection using correlation-based feature selection and Information Gain. The data used are student learning activity data from an E-learning management system. The data are then tested using 10-fold cross-validation with the C4.5 method and evaluated using a confusion matrix. Testing with the C4.5 algorithm combined with correlation-based feature selection yields the higher accuracy, 76.92%. Meanwhile, the original data without feature selection and the data with Information Gain feature selection yield the same accuracy, 76.19%. This is because C4.5 on the unpreprocessed data and on the data preprocessed with Information Gain both compute gain values to build the decision tree model, and they produce the same decision tree. The testing process therefore gives the same accuracy for both.
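The gain value that both the feature selector and the tree builder rely on can be sketched as below. This is an illustrative computation of plain information gain for a categorical feature, not the study's pipeline; the feature values and labels are hypothetical, and C4.5 itself normalizes this quantity into a gain ratio when choosing splits.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum(
        (count / total) * math.log2(count / total)
        for count in Counter(labels).values()
    )


def information_gain(feature_values, labels):
    """Entropy of the labels minus the weighted entropy after splitting on the feature."""
    total = len(labels)
    groups = {}
    for value, label in zip(feature_values, labels):
        groups.setdefault(value, []).append(label)
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


# Hypothetical student records: one informative feature, one uninformative feature.
outcome = ["pass", "pass", "fail", "fail"]
activity_level = ["high", "high", "low", "low"]   # separates the classes perfectly
browser = ["a", "b", "a", "b"]                    # carries no information

gain_activity = information_gain(activity_level, outcome)
gain_browser = information_gain(browser, outcome)
```

A feature ranked low by this measure is also unlikely to be picked as a split by a gain-based tree learner, which fits the abstract's explanation of why Information Gain selection left the resulting tree unchanged.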
In Kediri City there is a very popular woven fabric shop called Medali Mas. Its high volume of sales transactions produces a large accumulation of purchase data. This data is examined for consumer purchasing patterns using the data mining technique of association rules and the FP-Growth algorithm. The FP-Growth algorithm uses the concept of tree construction to search for frequent itemsets. The data consist of 26 types of woven fabric items and 200 transactions, with the condition that each transaction contains 2 or 3 types of items. The minimum support is set to 20 percent and the minimum confidence to 10 percent. Chi-Square testing is also used to determine the degree of correlation between variables in the resulting frequent itemsets. The final consumer purchasing pattern obtained (m to no) is that when buying Semi sutra Lusi = grey, Pakan = Blue Flowers, the consumer might also buy Sarong Lusi = black, Pakan = green Lurik and Cotton Lusi = yellow, Pakan = Tosca Bamboo, with a correlation between variables of 19.1397274913.
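The support and confidence thresholds mentioned in this abstract can be made concrete with a small sketch. It counts itemsets by brute force rather than building an FP-tree, so it is not FP-Growth itself; the item names and the five transactions are hypothetical, and only the 20 percent and 10 percent thresholds come from the abstract.

```python
def support(transactions, itemset):
    """Fraction of transactions that contain every item in itemset."""
    itemset = set(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)


def confidence(transactions, antecedent, consequent):
    """Support of antecedent and consequent together, divided by support of antecedent."""
    base = support(transactions, antecedent)
    if base == 0:
        return 0.0  # the rule never applies, so report zero confidence
    return support(transactions, set(antecedent) | set(consequent)) / base


# Hypothetical transactions, each with 2 or 3 item types as in the abstract.
transactions = [
    {"semi_sutra", "sarong"},
    {"semi_sutra", "sarong", "cotton"},
    {"semi_sutra", "cotton"},
    {"sarong", "cotton"},
    {"semi_sutra", "sarong"},
]

MIN_SUPPORT = 0.20      # 20 percent, as stated in the abstract
MIN_CONFIDENCE = 0.10   # 10 percent, as stated in the abstract

rule_support = support(transactions, {"semi_sutra", "sarong"})
rule_confidence = confidence(transactions, {"semi_sutra"}, {"sarong"})
rule_is_kept = rule_support >= MIN_SUPPORT and rule_confidence >= MIN_CONFIDENCE
```

FP-Growth arrives at the same frequent itemsets without enumerating candidates, which matters more as the number of items and transactions grows.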