Stemming is the process of stripping affixes from a word to recover its base form, and it is commonly used as a preprocessing step in text-based applications. Research on Indonesian stemming falls into two groups: stemming without a dictionary and stemming with a dictionary. Dictionary-free stemmers sometimes remove affixes incorrectly, which leads to over-stemming or under-stemming, while dictionary-based stemmers are relatively slow and cannot remove affixes from compound words. This study proposes a new dictionary-free stemming algorithm that detects legal and illegal affixes in Indonesian using finite-state automata. The technique is a rule-based stemmer built on Indonesian morphology and implemented with regular expressions. The algorithm was tested on 118 news documents containing 15792 words. In the first test, the proposed stemmer produced the correct base word for 10449 of the words processed, an average accuracy of 66%. In the second test, it achieved an average processing time of 0.0051 seconds. The third test showed that it can remove affixes from compound words.
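The rule-based, dictionary-free approach described above can be illustrated with a minimal sketch. It is not the study's algorithm: the affix lists, the table of illegal prefix and suffix pairs, the minimum stem length, and the names `stem_word` and `stem` are all assumptions chosen for illustration, and the sketch ignores prefix sound changes such as `men-` + `tulis`.

```python
import re

# Illustrative affix inventories (assumed, not taken from the study).
PARTICLE = re.compile(r"(lah|kah|tah|pun)$")
POSSESSIVE = re.compile(r"(ku|mu|nya)$")
SUFFIX = re.compile(r"(kan|an|i)$")
PREFIX = re.compile(r"^(meng|meny|mem|men|me|ber|ter|per|di|ke|se|pe)")

# Prefix/suffix pairs treated as illegal in this sketch (assumed table).
ILLEGAL_PAIRS = {
    ("ber", "i"), ("di", "an"), ("ke", "i"), ("ke", "kan"),
    ("me", "an"), ("se", "i"), ("se", "kan"), ("ter", "an"),
}

MIN_STEM = 3  # assumed guard: never strip below this many characters


def _strip(pattern, word):
    """Remove one affix matched by pattern; return (new_word, affix)."""
    m = pattern.search(word)
    if not m:
        return word, ""
    rest = word[:m.start()] + word[m.end():]
    if len(rest) < MIN_STEM:
        return word, ""
    return rest, m.group(1)


def stem_word(word):
    """Stem a single word with ordered regex rules and no dictionary."""
    word = word.lower()
    word, _ = _strip(PARTICLE, word)
    word, _ = _strip(POSSESSIVE, word)
    candidate, suffix = _strip(SUFFIX, word)
    m = PREFIX.match(candidate)
    family = ""
    if m:
        # Collapse the me- allomorphs (mem-, men-, meng-, meny-) into one family.
        family = "me" if m.group(1).startswith("me") else m.group(1)
    if suffix and (family, suffix) in ILLEGAL_PAIRS:
        # Illegal pair: assume the "suffix" is really part of the base word.
        candidate = word
    result, _ = _strip(PREFIX, candidate)
    return result


def stem(token):
    """Stem each part of a hyphenated compound or reduplicated word."""
    return "-".join(stem_word(part) for part in token.split("-"))


if __name__ == "__main__":
    for w in ["membacakan", "bukunya", "berlari", "bermain-main", "sekolah"]:
        print(w, "->", stem(w))
```

With these assumed rules, `membacakan` reduces to `baca`, `berlari` to `lari` (the pair `ber-` and `-i` is rejected, so the final `i` is kept), and `bermain-main` to `main-main`. The sketch also shows the over-stemming risk mentioned above: `sekolah` is wrongly cut to `seko` because its ending looks like the particle `-lah`.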
Indonesia has long been known as an agricultural country, and tobacco is one of its best agricultural products. The quality of tobacco can be judged from its leaves, and the many diseases that attack tobacco show up as changes in the leaves from the seeding period through the planting period. Tobacco growth is divided into two major periods, seeding and planting, so tobacco diseases are likewise divided into those that attack during seeding and those that attack during planting. This research is limited to diseases that attack tobacco during the planting period, because during seeding the tobacco has not yet produced leaves. Tobacco leaves begin to form once the plant enters the planting period. Good care is needed at this stage, including fertilization, nutrition, vitamins, and pest control, to keep the tobacco healthy and less susceptible to disease. Tobacco that lacks nutritional intake is susceptible to diseases caused by fungi, bacteria, and viruses. Each disease has its own characteristic appearance on the tobacco leaves. Early detection is very important so that disease control can be precise and the spread of the disease can be stopped before it becomes endemic. In this research, an early detection system for tobacco leaf disease based on image processing is designed. Image normalization, grayscale conversion, and edge detection are applied to each image in turn, the entropy, energy, and inertia values of the image are then obtained using statistical measures, and finally a decision tree classifier labels the leaf as uninfected or infected.
In this study, feature extraction from images of tobacco leaves not infected with the virus, using grayscale conversion followed by edge detection, produced average statistical measures with entropy (h) between 2,341 and 2,676, energy (e) between 6,112 and 6,665, and inertia (i) between 3,322 and 3,576. For leaves infected with the virus, the average entropy (h) was between 4,543 and 5,576, the average energy (e) between 12,212 and 13,455, and the average inertia (i) between 5,343 and 6,597.
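The feature-extraction steps described above (grayscale conversion, edge detection, then entropy, energy, and inertia) can be sketched as follows. The study does not state how its statistical measures are defined, so this sketch assumes the common gray-level co-occurrence matrix (GLCM) definitions, a simple forward-difference edge detector, and 8 quantization levels; the function names are hypothetical, and the resulting values are not expected to be on the same scale as the ranges reported above.

```python
import math


def to_gray(rgb):
    """Convert rows of (R, G, B) tuples to 0-255 gray values (BT.601 weights)."""
    return [[round(0.299 * r + 0.587 * g + 0.114 * b) for r, g, b in row]
            for row in rgb]


def edge_map(gray):
    """Forward-difference gradient magnitude, clipped to 0-255.

    A stand-in for the study's unspecified edge detector (assumption).
    The last row and last column are left at 0.
    """
    h, w = len(gray), len(gray[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h - 1):
        for x in range(w - 1):
            gx = gray[y][x + 1] - gray[y][x]
            gy = gray[y + 1][x] - gray[y][x]
            out[y][x] = min(255, abs(gx) + abs(gy))
    return out


def glcm(img, levels=8):
    """Normalized co-occurrence matrix of horizontally adjacent pixels.

    Assumes the image is at least two pixels wide.
    """
    quantized = [[min(levels - 1, v * levels // 256) for v in row] for row in img]
    counts = [[0] * levels for _ in range(levels)]
    total = 0
    for row in quantized:
        for a, b in zip(row, row[1:]):
            counts[a][b] += 1
            total += 1
    return [[c / total for c in row] for row in counts]


def texture_features(p):
    """Return (entropy, energy, inertia) of a normalized GLCM."""
    entropy = -sum(v * math.log2(v) for row in p for v in row if v > 0)
    energy = sum(v * v for row in p for v in row)
    inertia = sum((i - j) ** 2 * v
                  for i, row in enumerate(p) for j, v in enumerate(row))
    return entropy, energy, inertia


if __name__ == "__main__":
    # Synthetic 4x4 image: left half black, right half white.
    black, white = (0, 0, 0), (255, 255, 255)
    image = [[black, black, white, white] for _ in range(4)]
    h, e, i = texture_features(glcm(edge_map(to_gray(image))))
    print(h, e, i)  # 1.5 0.375 24.5
```

The three values per image could then be used as the feature vector for a decision tree classifier, for example scikit-learn's `DecisionTreeClassifier`, trained on leaves labeled as infected or uninfected.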