As many as 3 million news articles are produced every day, giving readers a wide range of choices across their preferred topics and categories. A recommendation system can make it easier for users to choose which news to read. One method for generating recommendations from the behavior of similar users is collaborative filtering. Neural collaborative filtering (NCF), which combines collaborative filtering with neural networks, is commonly used in recommendation systems. However, this method is weak at taking the similarity of news content, such as news titles and bodies, into account when recommending news to users. This research develops neural collaborative filtering with Sentence-BERT. Sentence-BERT is applied to news titles and news content, converting them into sentence embeddings. These sentence embeddings are used in neural collaborative filtering together with the item ID, user ID, and news category. We use a Microsoft news dataset of 50,000 users and 51,282 news articles, with 5,475,542 user-news interactions. The evaluation uses precision, recall, and ROC curves for predicting whether a user clicks a news item. A second evaluation uses the hit ratio with the leave-one-out method. The evaluation yields a precision of 99.14%, a recall of 92.48%, an F1-score of 95.69%, and an ROC score of 98%. Measured by hit ratio@10, neural collaborative filtering with Sentence-BERT reaches a hit ratio of 74% at the fiftieth epoch, which is better than plain NCF and NCF with news category.
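The hit ratio@K evaluation with leave-one-out mentioned above can be sketched as follows. This is a minimal illustration, not the authors' evaluation code: it assumes that one clicked news item per user is held out, that the model has already scored a candidate set containing that item, and that a hit means the held-out item appears among the K highest-scoring candidates. The function name `hit_ratio_at_k` and the toy scores are hypothetical.

```python
from typing import Dict, Hashable


def hit_ratio_at_k(
    scores: Dict[Hashable, Dict[Hashable, float]],
    held_out: Dict[Hashable, Hashable],
    k: int = 10,
) -> float:
    """Fraction of users whose held-out item is ranked in the top k.

    scores[user][item] is the model score for a candidate item
    (assumed to include the held-out item for that user).
    held_out[user] is the single item left out for that user.
    """
    if not held_out:
        raise ValueError("no users to evaluate")
    hits = 0
    for user, target in held_out.items():
        candidates = scores[user]
        # Rank candidate items by score, highest first, and keep the top k.
        top_k = sorted(candidates, key=candidates.get, reverse=True)[:k]
        hits += target in top_k
    return hits / len(held_out)


# Toy data (hypothetical): two users, three candidate news items each.
example_scores = {
    "u1": {"n1": 0.9, "n2": 0.2, "n3": 0.5},
    "u2": {"n1": 0.1, "n2": 0.8, "n3": 0.7},
}
example_held_out = {"u1": "n1", "u2": "n1"}

print(hit_ratio_at_k(example_scores, example_held_out, k=1))
```

With the toy data, only user `u1` has the held-out item ranked first, so half of the users count as hits at K=1; at K=3 every candidate is included and both users count as hits.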
Abstract: HIV is a virus that attacks the immune system, which in turn weakens the body's ability to fight infection and disease. The HIV virus behind AIDS is said to have originated in Kinshasa, Democratic Republic of the Congo. At that time, experts believed that HIV originated from a species of chimpanzee and was transmitted to humans. In chimpanzees, the virus is called Simian Immunodeficiency Virus, or SIV. Before it led to HIV transmission among humans, the chimpanzee virus may have spread through the hunting of chimpanzees for their meat, with hunters coming into contact with the blood of infected animals. Studies by the Centers for Disease Control and Prevention (CDC) indicate that HIV may have been transmitted from chimpanzees to humans as early as the late 1800s. Kinshasa is the largest city in the Congo and its fastest-growing city, with a transportation network that reaches the whole country. One report describes the history behind the spread of HIV/AIDS from the Congo to the rest of the world. A booming sex trade, population growth, and unsterile needles in clinics are suspected to have driven the fairly rapid spread of HIV at that time. History also records that AIDS then spread widely in America, Europe, and then the rest of the world. To examine the available data, we used the Data Miner application to make the examination easier. Many patients are still exposed to HIV, which means that many people remain unaware of the danger of a virus that, if not treated promptly, enters a very dangerous final phase known as AIDS. Keywords: HIV, machine learning, rapid miner
Breast cancer is one of the most prevalent malignant tumors. It develops in the breast tissue when uncontrollably growing cells overtake the surrounding healthy tissue. Several features or patient conditions can be used in a machine learning approach to predict breast cancer. In this setting, machine learning is used to determine whether a tumor is malignant or benign. The dataset used in this research is the Wisconsin Breast Cancer (Diagnostic) Data Set, which contains 32 attributes and 569 records. Feature selection in this study first eliminates outliers using the upper and lower quartiles of each feature, and is then applied to features that have a high variance inflation factor. The machine learning methods used in this research are Logistic Regression, Random Forest, KNN, SVC, XGBoost, Gradient Boosting, and Ridge Classifier. These methods were selected because the prediction target has two labels, benign and malignant. The results show that feature selection using the variance inflation factor increases the accuracy of the Logistic Regression and Random Forest methods from 98.25% to 99.12%. Logistic Regression and Random Forest achieve the highest accuracy, at 99.12%. Future research will try other optimization techniques for hyperparameter tuning.
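The quartile-based outlier elimination described above could look roughly like the sketch below. The abstract only says that the upper and lower quartiles of each feature are used; the conventional 1.5 × IQR fence, the `inclusive` quantile method, and the helper names `iqr_bounds` and `drop_outlier_rows` are assumptions made for illustration, and the variance inflation factor step is not shown.

```python
from statistics import quantiles
from typing import List, Sequence, Tuple


def iqr_bounds(values: Sequence[float], whisker: float = 1.5) -> Tuple[float, float]:
    """Return (lower, upper) fences from the lower and upper quartiles.

    The 1.5 * IQR multiplier is an assumed convention, not taken from the paper.
    """
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    return q1 - whisker * iqr, q3 + whisker * iqr


def drop_outlier_rows(
    rows: Sequence[Sequence[float]], whisker: float = 1.5
) -> List[List[float]]:
    """Keep only rows whose every feature lies within that feature's fences."""
    columns = list(zip(*rows))  # transpose: one tuple per feature
    bounds = [iqr_bounds(col, whisker) for col in columns]
    return [
        list(row)
        for row in rows
        if all(lo <= value <= hi for value, (lo, hi) in zip(row, bounds))
    ]


# Toy data (hypothetical): the last row has an extreme first feature.
sample_rows = [
    [1.0, 10.0],
    [2.0, 11.0],
    [3.0, 12.0],
    [4.0, 13.0],
    [100.0, 14.0],
]
kept_rows = drop_outlier_rows(sample_rows)
print(len(kept_rows), "rows kept out of", len(sample_rows))
```

Dropping a whole row when any single feature falls outside its fences is one possible design; clipping values to the fences instead would retain more of the 569 records, and the abstract does not say which variant was used.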