TF-IDF Method and Vector Space Model Regarding the Covid-19 Vaccine on Online News

Zen, Bita Parga; Susanto, Irwan; Finaliamartha, Dian

doi:10.33395/sinkron.v6i1.11179

Cited by 6 publications

(3 citation statements)

References 6 publications

“…The relevance of a document to a query is measured by the similarity between the document vector and the query vector. TF-IDF weighting and the VSM are used to represent documents as numeric values, which makes it possible to compute the closeness between documents (Zen et al., 2021). The closer two vectors are in the VSM, the more similar the two documents they represent.…”
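The vector comparison described in this statement can be illustrated with a minimal sketch. It assumes cosine similarity as the closeness measure and raw term counts as the vector weights (the quoted passage names neither); the function names and the toy documents are hypothetical.

```python
import math
from collections import Counter


def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine of the angle between two sparse term-weight vectors."""
    # Only terms present in both vectors contribute to the dot product.
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty vector is treated as similar to nothing
    return dot / (norm_a * norm_b)


def rank(query: str, docs: list[str]) -> list[tuple[int, float]]:
    """Return (document index, score) pairs, most similar first."""
    q = Counter(query.lower().split())
    scores = [
        (i, cosine_similarity(q, Counter(d.lower().split())))
        for i, d in enumerate(docs)
    ]
    return sorted(scores, key=lambda s: s[1], reverse=True)


# Toy corpus (hypothetical): whitespace tokenization, raw counts as weights.
docs = ["covid vaccine news", "freelance job vacancy", "vaccine schedule update"]
ranking = rank("covid vaccine", docs)
```

Here the first document shares both query terms and ranks highest, the third shares one, and the second shares none, so its score is zero. Replacing the raw counts with TF-IDF weights would change the scores but not the comparison itself.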

Section: Data Visualization Using Vector Space (unclassified)

Metode Vector Space Model Untuk Web Scraping Pada Website Freelance

Nurkholis, Fernando, Ans (2023). j. inti nm

Abstract: In the era of digitalization, the internet is at the center of all lines of community activity, including the world of work. Many platforms now list job vacancies, especially for freelancers. To find suitable vacancies, users usually have to open several websites. Web scraping offers a solution to this problem. Based on prior research, the BeautifulSoup and Selenium libraries are used to collect the data. For search, the vector space model method is used to measure the similarity between the query and each document. In the data search tests, the average recall is near-perfect at 98%, while the average precision is 56%. This is because the search uses three parameters, so irrelevant data is more likely to be retrieved when a document contains a word from the user's query even though the context does not match. The Streamlit framework in Python displays the data processing results and helps users navigate the web scraping process, data processing, and data search. This study aims to implement web scraping to retrieve data from the freelance websites Freelance, Project, and Sribulancer. By applying the vector space model method, users can search data from several websites without opening each freelance website one by one. With data visualization in the form of a web application built on the Streamlit framework, the web scraping results can also be processed and presented in a more helpful form that saves the user's time.
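The gap between recall and precision reported in this abstract follows from how the two measures are defined. The sketch below uses the standard set-based definitions; the job identifiers are hypothetical and are not taken from the study's data.

```python
def precision_recall(retrieved: set[str], relevant: set[str]) -> tuple[float, float]:
    """Standard set-based precision and recall for one query."""
    hits = len(retrieved & relevant)  # relevant items that were actually returned
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical query result: every relevant posting is found (recall 1.0),
# but half of the returned postings only share a word with the query.
retrieved = {"job1", "job2", "job3", "job4"}
relevant = {"job1", "job2"}
p, r = precision_recall(retrieved, relevant)
```

A keyword match that ignores context tends to return a superset of the relevant items, which keeps recall high while lowering precision, the pattern the abstract describes.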

“…The data produced can therefore be entered as a dataset in the form of a list [20]. The tokens are then stored in a variable and converted into a Tensor of numbers that is processed by an algorithm [21]. The word-weight calculation begins by computing the TF value, with each word having a weight of 1, and the IDF is computed as $\mathrm{TF\text{-}IDF}(t_k, d_j) = \mathrm{TF}(t_k, d_j) \cdot \mathrm{IDF}(t_k, d_j)$…”
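The product in this statement can be worked through on a small example. The sketch assumes TF is the raw count of a term in a document and IDF is log(N / df), where N is the number of documents and df the number of documents containing the term; the quoted passage does not define IDF, so this variant is an assumption, and the tokenized corpus is hypothetical.

```python
import math
from collections import Counter


def tf_idf(docs: list[list[str]]) -> list[dict[str, float]]:
    """Per-document weights: raw term count times log(N / document frequency)."""
    n_docs = len(docs)
    # Document frequency: each document counts a term at most once.
    df = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        tf = Counter(doc)
        weights.append({t: tf[t] * math.log(n_docs / df[t]) for t in tf})
    return weights


# Hypothetical pre-tokenized corpus of three short documents.
corpus = [["vaksin", "covid", "vaksin"], ["covid", "berita"], ["berita", "online"]]
w = tf_idf(corpus)
```

In the first document, "vaksin" occurs twice and appears in no other document, so it receives 2 · ln 3, while "covid" occurs once and appears in two of the three documents, so it receives ln 1.5. Library implementations often add smoothing or normalization, so their numbers may differ from this sketch.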

Section: G Tf-idf Tokenizing (unclassified)

Analisis Sentimen Masyarakat Terhadap Vaksinasi Covid-19 di Twitter Menggunakan Metode Random Forest Classifier (Studi Kasus: Vaksin Sinovac)

Aldean, Paradise, Nugraha (2022). INISTA

In 2019 a disaster struck many countries around the world, including Indonesia: the rapid and widespread transmission of the Covid-19 virus. The world's first Covid-19 case was detected in Wuhan, Hubei Province, China. The disease is caused by severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2). Indonesia has carried out vaccination against the Covid-19 virus using several types of vaccine, one of which is the Sinovac vaccine. The vaccination program in Indonesia has drawn many pros and cons, particularly from the public. Many people express their opinions through text-based social media, and one frequently used platform is Twitter. Public sentiment on social media can therefore serve as a benchmark for whether the information people receive through social media is positive or negative, so that it can be evaluated jointly. In this study, a machine learning method was built to analyze public sentiment toward the vaccination program using the Sinovac vaccine. The study uses 1,500 tweets divided into two categories, positive and negative. Data processing uses the TF-IDF algorithm, with data balancing by SMOTE. The model is trained with the Random Forest Classifier algorithm and validated using K-fold Cross Validation and a Confusion Matrix. The result is that public sentiment toward Sinovac vaccination is positive, and the model can predict the sentiment of a tweet with an accuracy of up to 79%, Precision of 85%, Recall of 90%, and F1 Score of 88%.

“…The Bag of Words model, which employs Term Frequency-Inverse Document Frequency (TF-IDF), is a feature extraction technique that measures the significance of a word in a document by considering its relationship with other words in the document and assigning a weight to each word [19], [20]. In text mining, TF-IDF is a weighting factor that reflects the importance of a word [21]. The value of TF-IDF increases as the word frequency in the document increases, but it is reduced by the frequency of words in the entire corpus.…”

Section: Introduction (mentioning)

confidence: 99%

Implementation of n-gram Methodology to Analyze Sentiment Reviews for Indonesian Chips Purchases in Shopee E-Marketplace

Purbaya, Rakhmadani, Arum et al. (2023). J. RESTI (Rekayasa Sist. Teknol. Inf.)

Chips are a well-known product among Small and Medium Enterprises (SMEs). To improve the quality of chips as an SME product, sentiment analysis is a crucial step. In this research, sentiment analysis of chip purchases on the Shopee E-marketplace was conducted using Natural Language Processing (NLP), with the N-Gram Model and Term Frequency-Inverse Document Frequency (TF-IDF) as feature extraction techniques and the Support Vector Machine (SVM) algorithm for sentiment classification. The objective of this research is to identify the most suitable feature extraction model and the optimal SVM kernel type among the Linear, Polynomial, Gaussian RBF, and Sigmoid kernels. The experiments indicate that the TF-IDF and unigram feature extraction techniques offer the best performance for SVM classification with the Linear kernel. Labeling the dataset with a lexicon-based approach showed that 84.31% of the total reviews were positive. In the unigram model, the words "price", "cheap" and "quality" have the highest weights, above 0.040. With the unigram model and the linear kernel, accuracy and precision are 88.4% and 87.3%, and recall is 88.4%. The F1-Score was 86.9% for Unigram, 78.5% for Bigram, and 77.4% for Trigram. Ultimately, the unigram model combined with a linear kernel in the SVM algorithm demonstrates strong potential for application in the development of various systems focused on detecting user reviews in the Indonesian language on the Shopee E-Marketplace.
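The unigram, bigram, and trigram features compared in this abstract differ only in window size. A minimal sketch of the extraction step, assuming whitespace tokenization; the function name and the sample review are hypothetical and not taken from the study's dataset.

```python
def ngrams(tokens: list[str], n: int) -> list[tuple[str, ...]]:
    """Slide a window of n consecutive tokens over the sequence."""
    if n < 1:
        raise ValueError("n must be at least 1")
    # A sequence shorter than n yields no n-grams (empty range).
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


# Hypothetical review text, already lowercased.
review = "cheap price good quality".split()
unigrams = ngrams(review, 1)
bigrams = ngrams(review, 2)
trigrams = ngrams(review, 3)
```

Each n-gram would then be treated as one vocabulary entry when computing TF-IDF weights. Larger n produces fewer and sparser features per review, which is one possible reason for the lower bigram and trigram scores, though the abstract does not state a cause.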
