Proceedings of the 41st Annual Hawaii International Conference on System Sciences (HICSS 2008) 2008
DOI: 10.1109/hicss.2008.157

Examining Variations of Prominent Features in Genre Classification

Abstract: This paper investigates the correlation between features of three types (visual, stylistic and topical

Cited by 18 publications (20 citation statements)
References 18 publications
“…Levering, et al (2008) used four related genres (store home pages, store product lists, store product descriptions, and other store pages). Kim and Ross (2008) focused on PDF web documents and a collection of six genres (academic monograph, business report, book of fiction, minutes, periodicals, and thesis). Such focused genre palettes provide a well-defined testing ground for genre identification but in a limited scope.…”
Section: Previous Work (mentioning)
confidence: 99%
“…Vidulin et al [27] [15] used image, style, and textual features to classify PDF documents by genre. The image features were extracted from the visual layout of the first page of the PDF document.…”
Section: Previous Work On Genre Classification Of Web Pages (mentioning)
confidence: 99%
“…This dataset was built between 2005 and 2008 by Kim and Ross [15]. It consists of 6494 PDF documents labeled independently by two kinds of people (students and secretaries).…”
Section: ) Krys-i (mentioning)
confidence: 99%
“…In [13], a collection of 20 genre labels: Adult, Blog, Children's, Commercial, Community, Content delivery, Entertainment, Error message, FAQ, Gateway, Index, Informative, Journalistic, Official, Personal, Poetry, Prose fiction, Scientific, Shopping and User input were allowing multiple labels to be assigned to one webpage. Although there are certain resemblances in genre corpora (e.g.…”
Section: Related Work (mentioning)
confidence: 99%
“…Home pages and non-home pages are distinguished in [4] and classified as personal home pages, corporate home pages and organizational home pages genres. [13] Focused on PDF web documents and a collection of six genres: academic monograph, business report, book of fiction, minutes, periodicals, and thesis. Issues related to corpus, features and classification algorithms discussed in [14] drawn conclusions from experiments on a focused genre palette may be misleading since features found to be useful in some genres are not equally effective in other genres.…”
Section: Related Work (mentioning)
confidence: 99%