2007
DOI: 10.1007/s10032-007-0060-2

Genre as noise: noise in genre

Abstract: Given a specific information need, documents of the wrong genre can be considered as noise. From this perspective, genre classification helps to separate relevant documents from noise. Orthographic errors represent a second, finer notion of noise. Since specific genres often include documents with many errors, an interesting question is whether this "micro-noise" can help to classify genre. In this paper we consider both problems. After introducing a comprehensive hierarchy of genres, we present an intuitive m…

Cited by 15 publications (12 citation statements)
References: 11 publications
“…In addition to these dynamic elements of the genre palette itself, available user data should be exploited to improve classification performance. We proposed a learning algorithm that employs user behavior in a feedback loop to improve classification performance [3]. Several levels of cooperativeness were distinguished and led to different perspectives in the utilization of available user data.…”
Section: Classifiers (mentioning)
confidence: 99%
“…. 2.5 Topic-neutral features to represent genres (Finn and Kushmerick, 2006) (Stubbe, Ringlstetter, and Schulz, 2007) 2.7 Popular science sub-genres description (Lieungnapar, Todd, and Trakulkasemsuk, 2017) Text mining roughly concerns knowledge discovery in texts, i.e. the process where Information Retrieval (IR), Natural Language Processing (NLP), and Machine Learning (ML) methods are used for extracting high-level information from texts.…”
Section: Scholarships Acknowledgements (mentioning)
confidence: 99%
“…The second option is to adopt the open-set classification setting where it is possible for some web pages not to be classified into any of the predefined genre categories (Stubbe, Ringlstetter, and Schulz, 2007; Pritsos and Stamatatos, 2013). This setup avoids the problem of class imbalance caused by numerous noisy pages and also avoids the problem of handling a diverse and highly heterogeneous class.…”
Section: Closed-set Vs Open-set Classification (mentioning)
confidence: 99%