Textual information is typically available as unstructured data, which requires processing before data mining algorithms can handle it; this processing is known as the pre-processing step of the overall text mining process. This paper analyzes the strong impact that the pre-processing step has on most mining tasks. To that end, we propose a methodology for varying distinct combinations of pre-processing steps and analyzing which combination yields high precision. To compare different combinations of pre-processing methods, we performed experiments that combine stemming, term weighting, term elimination based on a low-frequency cut, and stop-word elimination. These combinations were applied to text and opinion mining tasks, and correct classification rates were computed to highlight the strong impact of the pre-processing combinations. Additionally, we provide graphical representations of each pre-processing combination to show how visual approaches help reveal the effects of processing on document similarities and group formation (i.e., cohesion and separation).
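As a rough illustration of what varying pre-processing combinations can look like, the sketch below enumerates on/off settings for three of the steps named above and records the resulting vocabulary size. It is not the authors' methodology: the stop-word list, the `crude_stem` helper, the toy documents, and all parameter names are hypothetical, term weighting is left out for brevity, and vocabulary size stands in for the classification rates used in the paper.

```python
from collections import Counter
from itertools import product

# Hypothetical toy stop-word list; a real study would use a full list.
STOP_WORDS = {"the", "a", "is", "of", "and", "to", "in"}


def crude_stem(token):
    """Very rough suffix stripping, standing in for a real stemmer."""
    for suffix in ("ing", "ed", "s"):
        if token.endswith(suffix) and len(token) > len(suffix) + 2:
            return token[: -len(suffix)]
    return token


def preprocess(docs, stem, remove_stop_words, min_doc_freq):
    """Tokenize each document and apply the selected steps."""
    tokenized = []
    for doc in docs:
        tokens = doc.lower().split()
        if remove_stop_words:
            tokens = [t for t in tokens if t not in STOP_WORDS]
        if stem:
            tokens = [crude_stem(t) for t in tokens]
        tokenized.append(tokens)
    # Low-frequency cut: drop terms found in fewer than min_doc_freq documents.
    doc_freq = Counter(t for tokens in tokenized for t in set(tokens))
    return [[t for t in tokens if doc_freq[t] >= min_doc_freq]
            for tokens in tokenized]


def all_combinations():
    """Yield every combination of the three illustrative settings."""
    for stem, stop, cut in product([False, True], [False, True], [1, 2]):
        yield {"stem": stem, "remove_stop_words": stop, "min_doc_freq": cut}


docs = ["The cats are running", "A cat is jumping", "Dogs jumped in the park"]
vocab_sizes = {}
for combo in all_combinations():
    processed = preprocess(docs, **combo)
    vocab = {t for tokens in processed for t in tokens}
    # Key order: (stem, remove_stop_words, min_doc_freq).
    vocab_sizes[tuple(combo.values())] = len(vocab)
```

In a fuller experiment, each combination's output would feed a classifier so that classification rates, rather than vocabulary sizes, could be compared across combinations.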
Although theoretical models often treat fiber-reinforced composite materials as periodic, the distribution of fibers in real materials is random. This random distribution is closely related to the transverse failure behavior of the material. This paper proposes the use of statistical functions that describe random point patterns to quantify the dispersion of the transverse failure properties of several carbon fiber-reinforced polymers (CFRP). The analysis of the K function is shown to be the most meaningful for this purpose.
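To make the idea of a point-pattern statistic concrete, here is a minimal sketch that assumes the K function mentioned above is Ripley's K function, estimated naively as K(r) = A / (n(n-1)) times the number of ordered point pairs within distance r. It applies no edge correction, and the grid of fiber centers, the function name `ripley_k`, and its parameters are hypothetical, not taken from the paper.

```python
import math


def ripley_k(points, radius, area):
    """Naive estimate of Ripley's K at one radius, with no edge correction."""
    n = len(points)
    if n < 2:
        raise ValueError("need at least two points")
    pairs_within = 0
    for i, (xi, yi) in enumerate(points):
        for j, (xj, yj) in enumerate(points):
            # Count ordered pairs of distinct points within the radius.
            if i != j and math.hypot(xi - xj, yi - yj) <= radius:
                pairs_within += 1
    return area * pairs_within / (n * (n - 1))


# Hypothetical fiber centers on a regular 2 x 2 grid in a unit square.
grid = [(0.25 + 0.5 * i, 0.25 + 0.5 * j) for i in range(2) for j in range(2)]
k_small = ripley_k(grid, 0.4, area=1.0)  # radius below the grid spacing
k_large = ripley_k(grid, 0.6, area=1.0)  # radius reaching nearest neighbours
```

For a completely random (Poisson) pattern the expected value is K(r) = πr², so departures from that curve indicate clustering or regularity in the fiber arrangement; the periodic grid here gives a step-like K instead.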
Among the diseases affecting commercial citrus production, citrus black spot (CBS) is considered to cause substantial losses. Analysis of particles suspended in the orchards and collected on a disc has been applied as a preventive measure to identify the presence of fungus spores before symptoms appear. In this paper, we present the results of several shape analysis methods applied to the fungus, a first step toward the intended computer-aided vision system capable of assisting the identification process. Experiments and comparative results among the methods are presented, showing that the curvature method produced better results.