Information is needed by many people, and as its importance grows, so does the need for access to relevant documents and literature. The content of these documents is then processed to make its meaning easier to grasp. One such process is stemming, which is widely used to find the root form of a word. Separating out words that carry no meaning can make the information clearer. The stemming algorithm must be chosen to suit the language being processed. Many stemming algorithms are available for finding root words, among them the Tala and the Nazief & Adriani algorithms. The two differ in how they work: the Tala algorithm adapts the rule-based Porter algorithm, while the Nazief & Adriani algorithm relies on a dictionary. Each has its own advantages in terms of accuracy and speed. This study therefore compares the performance of the two algorithms in stemming Indonesian-language text. The trials use several different data sources to measure the speed and accuracy of each algorithm: abstracts from the theses or final projects of 30 students, and 200 online news articles. The test results show that the Tala stemming algorithm is less accurate than Nazief & Adriani, with an average accuracy of 65.29% compared with 78.47%. In terms of speed, the Tala algorithm is faster, taking 32.19 seconds compared with 65.2 seconds for Nazief & Adriani.
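The contrast between rule-based and dictionary-based stemming can be illustrated with a minimal sketch. This is not the actual Tala or Nazief & Adriani algorithm: the affix lists, the tiny root dictionary, the test words, and the function names `strip_affixes`, `stem_with_dictionary`, and `accuracy` are all assumptions made for illustration. It only shows why stripping affixes by pattern alone can over-stem a word that a dictionary lookup would leave intact, and how accuracy can be computed as the share of words stemmed to the expected root.

```python
# Toy illustration only. The affix lists and dictionary below are assumptions,
# not the real rule sets of the Tala or Nazief & Adriani algorithms.
PREFIXES = ("meng", "mem", "men", "me", "ber", "di", "ter")
SUFFIXES = ("kan", "an", "i")
ROOT_WORDS = {"makan", "baca", "ajar", "main"}  # hypothetical tiny dictionary


def strip_affixes(word, min_len=3):
    """Rule-based sketch: remove one prefix and one suffix by pattern alone."""
    for prefix in PREFIXES:
        if word.startswith(prefix) and len(word) - len(prefix) >= min_len:
            word = word[len(prefix):]
            break
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= min_len:
            word = word[:-len(suffix)]
            break
    return word


def stem_with_dictionary(word):
    """Dictionary-based sketch: accept a candidate only if it is a known root."""
    if word in ROOT_WORDS:
        return word
    prefix_options = [""] + [p for p in PREFIXES if word.startswith(p)]
    suffix_options = [""] + [s for s in SUFFIXES if word.endswith(s)]
    for prefix in prefix_options:
        for suffix in suffix_options:
            candidate = word[len(prefix):len(word) - len(suffix)]
            if candidate in ROOT_WORDS:
                return candidate
    return word  # no known root found, so leave the word unchanged


def accuracy(stemmer, labeled_words):
    """Percentage of words whose stem matches the expected root."""
    correct = sum(stemmer(word) == root for word, root in labeled_words)
    return 100.0 * correct / len(labeled_words)


# Hypothetical labeled sample: (inflected word, expected root).
SAMPLE = [("makan", "makan"), ("membaca", "baca"),
          ("diajarkan", "ajar"), ("bermain", "main")]

print(strip_affixes("makan"))         # mak (over-stemmed: "an" is not a suffix here)
print(stem_with_dictionary("makan"))  # makan
print(accuracy(strip_affixes, SAMPLE))         # 75.0
print(accuracy(stem_with_dictionary, SAMPLE))  # 100.0
```

On this toy sample the dictionary lookup avoids the over-stemming error at the cost of extra candidate checks per word, which is in line with the accuracy and speed trade-off reported above, although the figures here say nothing about the real algorithms.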