Abstract: This study reports the results of a series of experiments in the techniques of automatic document classification. Two different classification schedules are compared along with two methods of automatically classifying documents into categories. It is concluded that, while there is no significant difference in the predictive efficiency between the Bayesian and the Factor Score methods, automatic document classification is enhanced by the use of a factor-analytically-derived classification schedule. Approx…
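To make the "Bayesian method" mentioned in the abstract concrete, the following is a minimal sketch of multinomial naive Bayes document classification in modern Python. It is an illustration of the general technique only, not a reconstruction of the 1963 implementation; the category names and training documents are invented for the example.

```python
import math
from collections import Counter, defaultdict

def train(docs):
    """docs: list of (token_list, category) pairs.
    Returns log-priors, Laplace-smoothed log-likelihoods, and the vocabulary."""
    cat_counts = Counter(cat for _, cat in docs)
    word_counts = defaultdict(Counter)
    vocab = set()
    for tokens, cat in docs:
        word_counts[cat].update(tokens)
        vocab.update(tokens)
    log_prior = {c: math.log(n / len(docs)) for c, n in cat_counts.items()}
    log_lik = {}
    for c in cat_counts:
        total = sum(word_counts[c].values())
        # Add-one (Laplace) smoothing so unseen words get nonzero probability.
        log_lik[c] = {w: math.log((word_counts[c][w] + 1) / (total + len(vocab)))
                      for w in vocab}
    return log_prior, log_lik, vocab

def classify(tokens, log_prior, log_lik, vocab):
    """Assign the category with the highest posterior log-probability."""
    scores = {c: log_prior[c] + sum(log_lik[c][w] for w in tokens if w in vocab)
              for c in log_prior}
    return max(scores, key=scores.get)
```

For example, a classifier trained on a few "physics" and "biology" token lists will assign `["plasma", "energy"]` to the physics category. The Factor Score method compared in the study instead scores documents against factor-analytically derived category loadings rather than word likelihoods.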
“…Comparison between categorization methods would be aided by the use of common testsets, something which has rarely been done. (An exception is [BB64].) Development of standard collections would be an important first step to better understanding of text categorization.…”
While certain standard procedures are widely used for evaluating text retrieval systems and algorithms, the same is not true for text categorization. Omission of important data from reports is common and methods of measuring effectiveness vary widely. This has made judging the relative merits of techniques for text categorization difficult and has disguised important research issues. In this paper I discuss a variety of ways of evaluating the effectiveness of text categorization systems, drawing both on reported categorization experiments and on methods used in evaluating query-driven retrieval. I also consider the extent to which the same evaluation methods may be used with systems for text extraction, a more complex task. In evaluating either kind of system, the purpose for which the output is to be used is crucial in choosing appropriate evaluation methods.
“…For example, research in information retrieval as early as 1963 used Factor Analysis (FA) on text documents to extract topics and automatically classify documents [5,6]. Whilst this work received a lot of attention as an unsupervised approach to document classification, it has rarely been cited as an example of topic identification.…”
“…It can be deduced from the mathematical notation and diagrammatic representation of Automatic Text Classification (ATC) that the definition by Borko and Bernick (1963) [6] extends the first definition, the definition by Merkl (1998) [7] extends that of Borko and Bernick (1963) [6], and the definition by Manning and Schütze (1999) [8] is the union of the definitions by Merkl (1998) [7] and Borko and Bernick (1963) [6].…”
Section: Discussion (mentioning, confidence: 99%)
“…Automatic Text Classification (ATC) can be defined as automatic identification of such a set of categories "definition by Borko and Bernick (1963)" [6].…”
As digitization of text continues to increase enormously, the need to organize, categorize, and classify text has become indispensable: disorganized and poorly categorized text slows text and information retrieval. It is therefore important to organize, categorize, and classify texts and digitized documents according to the definitions proposed by text-mining experts and computer scientists. Work has been done on Text Mining, Text Categorization, and Automatic Text Classification by computer and information scientists, but considerable space for novel research remains in this domain. In this paper we propose mathematical notation and graphical models for Text Mining, Text Categorization, and Automatic Text Classification to give an in-depth understanding of these techniques and concepts. These mathematical and graphical models can shorten the response time of text and information retrieval, and the performance of web search engines can also be improved by employing them.