Proceedings of the 2008 C3S2E Conference (C3S2E '08), 2008
DOI: 10.1145/1370256.1370278

An enhanced web robot for the CINDI system

Abstract: Focused web crawlers are receiving increasing attention as an effective approach to digital library construction. CINDI Robot is a focused web crawler that discovers and collects online academic and scientific documents in the computer science and software engineering fields for the CINDI system. In this paper, we present the basic design of CINDI Robot and describe the interactions among the CINDI components.

Cited by 5 publications (4 citation statements)
References 4 publications
“…Concordia Indexing and DIscovering system (CINDI) uses revised context graph and multilevel inspection scheme to discover relevant webpages. It explores relevant resources that are many links away from seed URLs.…”
Section: Current Status Of Web Crawler
confidence: 99%
“…Although the page content, hierarchy patterns and anchor texts are satisfactory leads, a focused crawler inevitably needs a multi-level inspection infrastructure to compensate their drawbacks. Unfortunately the current papers overlook the power of such comprehensiveness [23]. Considering these shortcomings, our proposed Treasure-Crawler utilized a significant approach in crawling and indexing Web pages that complied with its predefined topic of interest.…”
Section: Discussion
confidence: 99%
“…The idea of using the context of a given topic to guide the crawling process could significantly increase both precision and recall. Tunneling (2001) is the phenomenon where a crawler reaches some relevant regions (or pages) while traversing a path which does not solely consist of relevant pages [10,11]. The major task of focused Web crawlers is to unveil as many bridges among relevant regions as possible.…”
Section: Introduction
confidence: 99%
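
The tunneling behaviour described in the statement above can be illustrated with a short sketch. The following Python example is an illustrative assumption, not the CINDI Robot's actual algorithm: the relevance test, the tiny in-memory link graph, and the MAX_TUNNEL_DEPTH bound are hypothetical stand-ins for whatever classifier and frontier policy a real focused crawler would use.

from collections import deque

# Hypothetical bound on how many consecutive off-topic pages the crawler
# may "tunnel" through before the path is pruned.
MAX_TUNNEL_DEPTH = 3

# Tiny in-memory "web" of (page text, outgoing links), used only so the
# sketch runs end to end without network access.
WEB = {
    "seed":   ("department course list", ["deptA", "deptB"]),
    "deptA":  ("news and events", ["paper1"]),            # off-topic bridge page
    "deptB":  ("software engineering group", ["paper2"]),
    "paper1": ("software engineering thesis", []),
    "paper2": ("software engineering report", []),
}

def fetch(url):
    # Stand-in for an HTTP fetch and parse step.
    return WEB.get(url, ("", []))

def is_relevant(text):
    # Stand-in for a topic classifier (e.g. a context-graph or keyword model).
    return "software engineering" in text

def focused_crawl(seeds):
    frontier = deque((url, 0) for url in seeds)   # (url, consecutive off-topic hops)
    seen, harvested = set(seeds), []
    while frontier:
        url, streak = frontier.popleft()
        text, links = fetch(url)
        if is_relevant(text):
            harvested.append(url)
            streak = 0                            # back in a relevant region
        else:
            streak += 1                           # tunneling through an off-topic page
            if streak > MAX_TUNNEL_DEPTH:
                continue                          # prune paths that stay off topic
        for link in links:
            if link not in seen:
                seen.add(link)
                frontier.append((link, streak))
    return harvested

print(focused_crawl(["seed"]))   # -> ['deptB', 'paper1', 'paper2']

In this toy graph the crawler reaches "paper1" only by passing through the off-topic "deptA" page, which is exactly the bridge-building effect the quoted statement attributes to tunneling.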