2004
DOI: 10.1007/978-3-540-30116-5_19

Document Classification Through Interactive Supervision of Document and Term Labels

Abstract: Effective incorporation of human expertise, while exerting a low cognitive load, is a critical aspect of real-life text classification applications that is not adequately addressed by batch-supervised high-accuracy learners. Standard text classifiers are supervised in only one way: assigning labels to whole documents. They are thus deprived of the enormous wisdom that humans carry about the significance of words and phrases in context. We present HIClass, an interactive and exploratory labeling package that act…

Cited by 38 publications (29 citation statements)
References 8 publications
“…By the creation of one-term pseudo-documents, Godbole et al [60] coerce the notion of feature label uncertainty into a more traditional instance uncertainty framework for text classification tasks. By incorporating the label information on each feature value with unlabeled examples, Druck et al…”
Section: Feature-based Learning and Active Dual Supervision; citation type: mentioning
confidence: 99%
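The pseudo-document idea in the statement above can be illustrated with a minimal Python sketch. It assumes a toy linear (logistic) classifier over term counts; the names `predict_proba`, `term_uncertainty`, the weights, and the vocabulary are all hypothetical and are not taken from Godbole et al. Each term is wrapped in a document containing only that term, so the same uncertainty score used for documents can be applied to terms.

```python
import math

def predict_proba(weights, bias, doc):
    """Positive-class probability of a bag-of-words doc under a toy logistic model."""
    score = bias + sum(weights.get(term, 0.0) * count for term, count in doc.items())
    return 1.0 / (1.0 + math.exp(-score))

def term_uncertainty(weights, bias, vocabulary):
    """Score each term by classifying a one-term pseudo-document.

    Uncertainty is 1.0 when the prediction is exactly 0.5 and falls
    toward 0.0 as the prediction approaches 0 or 1.
    """
    scores = {}
    for term in vocabulary:
        pseudo_doc = {term: 1}  # a document consisting of this single term
        p = predict_proba(weights, bias, pseudo_doc)
        scores[term] = 1.0 - 2.0 * abs(p - 0.5)
    return scores

# Hypothetical weights: "season" barely influences the decision.
weights = {"goal": 2.0, "election": -2.0, "season": 0.1}
scores = term_uncertainty(weights, 0.0, ["goal", "election", "season"])
most_uncertain_term = max(scores, key=scores.get)
```

Under these assumed weights, the term whose pseudo-document lands closest to the decision boundary would be the one offered to the user for labeling.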
“…Godbole et al [16] present a document annotation method through interactive supervision of document and term labels. Actually their method is based on active learning of SVM algorithm, which actively collects user opinion on feature representations as well as whole-document labels to minimize the user's annotation efforts.…”
Section: Hierarchy Generation; citation type: mentioning
confidence: 99%
“…Among them, the most commonly used method is active learning [14,34,45,16]. In active learning, the machine prompts the most informative document for the user to label.…”
Section: Introduction; citation type: mentioning
confidence: 99%
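The selection step described in the statement above, where the machine prompts the most informative document, is often realized as uncertainty sampling. The following is a minimal sketch under that assumption, using a toy logistic model; the function names, weights, and document pool are hypothetical and do not come from the cited papers.

```python
import math

def predict_proba(weights, bias, doc):
    """Positive-class probability of a bag-of-words doc under a toy logistic model."""
    score = bias + sum(weights.get(term, 0.0) * count for term, count in doc.items())
    return 1.0 / (1.0 + math.exp(-score))

def most_informative(weights, bias, unlabeled):
    """Return the index of the unlabeled document whose prediction is closest to 0.5."""
    return min(
        range(len(unlabeled)),
        key=lambda i: abs(predict_proba(weights, bias, unlabeled[i]) - 0.5),
    )

# Hypothetical model and pool of unlabeled documents (term -> count).
weights = {"goal": 2.0, "election": -2.0}
pool = [{"goal": 3}, {"election": 2}, {"goal": 1, "election": 1}]
query_index = most_informative(weights, 0.0, pool)
```

In a full loop, the user's label for `pool[query_index]` would be added to the training set and the model retrained before the next query.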
“…HIClass consists of roughly 5000 lines of C++ code for the backend which communicates through XML with 1000 lines of PHP scripts to manage browser-based front-end user interactions [8].…”
Section: Description Of the Demonstration; citation type: mentioning
confidence: 99%
“…We extend active learning to include feature engineering and multi-labeled document labeling conversations. HIClass is an interactive multi-class multi-labeled text classification system that combines the cognitive power of humans with the power of automated learners to make statistically sound classification decisions (details appear in [8]). …”
Section: Introduction; citation type: mentioning
confidence: 99%