2008 20th IEEE International Conference on Tools With Artificial Intelligence 2008
DOI: 10.1109/ictai.2008.119

Profile-Based Focused Crawler for Social Media-Sharing Websites

Cited by 9 publications (11 citation statements)
References 11 publications
“…Web 2.0. The advent of the user-generated content philosophy and the participatory culture that was brought by Web 2.0 sites such as blogs, forums and social media, formed a new generation of specialized crawlers that focused on forum [29][30][31][32][33][34], blog/microblog [35,36], and social media [37][38][39][40] spidering. The need for specialized crawlers for these websites emerged from the quality and creation rate of content usually found in forums/blogs, the well-defined structure that is inherent in forums/blogs that makes it possible to even develop frameworks for creating blog crawlers [41], and the implementation particularities that make other types of crawlers inappropriate or inefficient for the task.…”
Section: Usage Typology (mentioning)
confidence: 99%
“…To this end, we proposed a DOM path string-based method for page classification that was reported elsewhere [12]. This paper is organized as follows.…”
Section: Introduction (mentioning)
confidence: 99%
“…Knowing this structure enables a crawler to prioritize some types of pages (e.g., recent user-generated content) over others, or to spread its effort evenly to obtain a representative sample [4,6,15], or to avoid downloading the same page via different URLs [26]. Information about how the site is organized is provided manually, recognized by heuristics, or learned by recognizing consistent patterns in the site [4,13,15,25,28].…”
Section: Introduction (mentioning)
confidence: 99%