Using HTML Tags to Improve Parallel Resources Extraction

Feng, Yujian; Yan, Hong; Tang, Wei; Yao, Jianmin; Zhu, Qiaoming

doi:10.1109/ialp.2011.23

Cited by 5 publications

(1 citation statement)

References 12 publications

“…A crawler is a program for which we specify a seed URL, and keep going based on that URL retrieving connected pages. A page is parsed for additional URLs by URL Normalization, where these URLs are saved in storage for crawling [27], [28], and are used to retrieve the more available Web pages from a Web server. The process of crawling may be divided among multiple distributed crawlers.…”

Section: Methods (mentioning)

confidence: 99%

Topical search engine for Internet of Things

Faqeeh, Al‐Ayyoub, Wardat et al. 2014

2014 IEEE/ACS 11th International Conference on Computer Systems and Applications (AICCSA)

The Internet of Things (IoT) has become a common buzzword on the Web. However, there is currently no search tool for discovering and learning about the different types of IoT elements. Hence, this paper presents a topical search engine for IoT. The motivation for a topical search engine comes from the relatively poor performance of general-purpose search engines, which depend on the results of generic Web crawlers. The topical search engine is a system that learns the specialization from examples, and then explores the Web, guided by a relevance and popularity rating mechanism. The results show that the proposed topical search engine outperforms other general search engines.
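
The crawl loop described in the citation statement (start from a seed URL, parse each page for links, normalize them, and store them for later retrieval) can be sketched roughly as follows. This is only an illustrative sketch, not the method of either paper: the names `LinkParser`, `normalize`, and `crawl`, the `fetch` callback, and the particular normalization rules (resolve relative links, drop fragments, lowercase the host) are all assumptions chosen for the example. Pages are supplied by a caller-provided function, so no network access is needed.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin, urlsplit, urlunsplit


class LinkParser(HTMLParser):
    """Collect the href values of <a> tags (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def normalize(base, href):
    """Resolve href against base, drop the fragment, lowercase scheme and host.

    These rules are an assumption; real crawlers often apply more of them.
    """
    absolute, _fragment = urldefrag(urljoin(base, href))
    parts = urlsplit(absolute)
    path = parts.path or "/"
    return urlunsplit(
        (parts.scheme.lower(), parts.netloc.lower(), path, parts.query, "")
    )


def crawl(seed, fetch, max_pages=100):
    """Breadth-first crawl from seed; fetch(url) returns HTML text or None."""
    start = normalize(seed, "")
    frontier = deque([start])  # URLs saved for later crawling
    seen = {start}             # avoids queueing the same URL twice
    visited = []
    while frontier and len(visited) < max_pages:
        url = frontier.popleft()
        html = fetch(url)
        if html is None:
            continue
        visited.append(url)
        parser = LinkParser()
        parser.feed(html)
        for href in parser.links:
            link = normalize(url, href)
            if urlsplit(link).scheme not in ("http", "https"):
                continue  # skip mailto:, javascript:, and similar links
            if link not in seen:
                seen.add(link)
                frontier.append(link)
    return visited


# Usage with an in-memory stand-in for a Web server (made-up pages).
PAGES = {
    "http://example.test/": '<a href="/a">A</a> <a href="b#top">B</a>',
    "http://example.test/a": '<a href="/">home</a> <a href="/b">B</a>',
    "http://example.test/b": '<a href="HTTP://EXAMPLE.test/a#x">A</a>',
}
print(crawl("http://Example.test", PAGES.get))
```

Dividing the work among multiple distributed crawlers, as the statement mentions, would amount to partitioning the frontier (for example by host) among workers; that part is not shown here.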
