Grzegorz Baron scite author profile

Discretisation is typically performed as part of initial data pre-processing, before knowledge mining is employed. As a result, conclusions and observations are based on reduced data, since discretisation usually discards some information. The paper presents a different approach that takes advantage of discretisation executed after data mining. In the described study, decision rules were first induced from real-valued features. Second, the data sets were discretised. In the third step, using the categories found for the attributes, the conditions included in the inferred rules were translated into the discrete domain. The properties and performance of the rule classifiers were tested in the domain of stylometric analysis of texts, where writing styles were defined through quantitative attributes of a continuous nature. The experiments show that the proposed processing leads to rule sets of significantly reduced size while maintaining the quality of predictions, and allows many data discretisation methods to be tested at acceptable computational cost.
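The three steps in the abstract can be illustrated with a minimal sketch. The abstract does not state which discretisation methods were used or how conditions were mapped onto categories, so everything here is an assumption: equal-width binning stands in for the discretiser, a condition is translated to every interval it overlaps, and the names equal_width_cuts, bin_index and translate_condition are hypothetical.

```python
from bisect import bisect_left, bisect_right


def equal_width_cuts(values, n_bins):
    """Step 2 (assumed method): cut points for equal-width binning."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins
    return [lo + i * width for i in range(1, n_bins)]


def bin_index(value, cut_points):
    """Index of the interval holding value.

    Interval 0 is (-inf, c0], interval i is (c[i-1], c[i]],
    and the last interval is (c[-1], +inf).
    """
    return bisect_left(cut_points, value)


def translate_condition(op, threshold, cut_points):
    """Step 3 (assumed rule): map 'attribute op threshold' to the set of
    interval indices that overlap the region the condition describes."""
    n_bins = len(cut_points) + 1
    if op == "<=":
        # Every interval up to and including the one holding the threshold.
        return set(range(0, bin_index(threshold, cut_points) + 1))
    if op == ">":
        # Skip the threshold's own interval when the threshold is a cut point,
        # because that interval then ends exactly at the threshold.
        return set(range(bisect_right(cut_points, threshold), n_bins))
    raise ValueError(f"unsupported operator: {op}")


# Step 1 would induce rules on real-valued data; one condition is hard-coded here.
cuts = [0.2, 0.5, 0.8]  # illustrative cut points, not taken from the paper
low_side = translate_condition("<=", 0.35, cuts)
high_side = translate_condition(">", 0.5, cuts)
```

With these cut points, the condition "attribute <= 0.35" covers the first two intervals, and "attribute > 0.5" covers the last two. Several real-valued thresholds can fall into the same interval, which is one plausible way distinct conditions could collapse into one after translation.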

Standard vs. non-standard cross-validation: evaluation of performance in a space with structured distribution of datapoints

Baron, Stańczyk. 2021. Procedia Computer Science.

Analysis of Multiple Classifiers Performance for Discretized Data in Authorship Attribution

Baron. 2017.

Grzegorz Baron

Influence of Data Discretization on Efficiency of Bayesian Classifier for Authorship Attribution

On Approaches to Discretization of Datasets Used for Evaluation of Decision Systems

Discretisation of conditions in decision rules induced for continuous data

