2017
DOI: 10.17706/jsw.12.3.180-188

The Automatic Extraction of Web Information Based on Regular Expression

Abstract: Building on a search engine, this paper constructs a model for Web information retrieval, matching, and structured extraction, and implements an algorithm that locates and automatically extracts Baidu news information across multiple result pages. A standard mathematical expression for the result URLs is derived by analyzing the URLs of the search results, and regular expressions for the key tags are designed by analyzing the DOM tree structure of the web pages. Finally, the method of multi-page location retrieval and structured extraction based on the search engine is r…
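The two steps named in the abstract (a formula for result-page URLs, then regular expressions over key tags) can be illustrated with a minimal sketch. Everything concrete here is an assumption for illustration and not taken from the paper: the endpoint `news.example.com`, the parameter names `word` and `pn`, the page size of 10, and the `c-title` markup are all hypothetical, and the sample HTML is inlined so that no network access is needed.

```python
import re
from urllib.parse import urlencode

# Hypothetical endpoint, parameter names, and page size (not from the paper).
BASE_URL = "https://news.example.com/search"
RESULTS_PER_PAGE = 10


def page_url(keyword: str, page: int) -> str:
    """Build the URL of the page-th result page (0-based) from an offset formula."""
    query = {"word": keyword, "pn": page * RESULTS_PER_PAGE}
    return f"{BASE_URL}?{urlencode(query)}"


# Assumed markup: each result is <h3 class="c-title"><a href="URL">TITLE</a></h3>.
RESULT_RE = re.compile(
    r'<h3[^>]*class="c-title"[^>]*>\s*'
    r'<a[^>]*href="(?P<url>[^"]+)"[^>]*>(?P<title>.*?)</a>',
    re.S,
)
TAG_RE = re.compile(r"<[^>]+>")  # strips inline tags such as <em> from titles


def extract_results(html: str) -> list[dict[str, str]]:
    """Return one {'title', 'url'} record per matched result block."""
    return [
        {"title": TAG_RE.sub("", m.group("title")).strip(), "url": m.group("url")}
        for m in RESULT_RE.finditer(html)
    ]


SAMPLE_HTML = """
<div class="result"><h3 class="c-title"><a href="https://example.com/a">First <em>news</em> item</a></h3></div>
<div class="result"><h3 class="c-title"><a href="https://example.com/b">Second item</a></h3></div>
"""

if __name__ == "__main__":
    print(page_url("web", 2))
    for record in extract_results(SAMPLE_HTML):
        print(record)
```

Regular expressions of this kind are tied to one page layout and break when the markup changes, which is why the sketch keeps the pattern in a single named constant.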

Cited by 6 publications (1 citation statement)
References 3 publications
“…Numerous modifications of the keyword-matching method have been applied, including regular expressions [51], string matching algorithms [52], or pattern matching [53,54], all of which can be used to carry out the matching process. The algorithm analyzes each keyword in the text to determine whether it is used alone or as a component of a longer phrase.…”
Section: Baselines (mentioning)
confidence: 99%
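The distinction drawn in the citation statement, between a keyword used alone and a keyword that is part of a longer phrase, can be sketched with word-boundary regular expressions. The citing paper's actual algorithm is not shown here, so this is only one plausible reading: the function names and the idea of supplying a list of known longer phrases are assumptions made for illustration.

```python
import re


def find_spans(text: str, term: str) -> list[tuple[int, int]]:
    """Return (start, end) spans of whole-word, case-insensitive matches of term."""
    pattern = re.compile(r"\b" + re.escape(term) + r"\b", re.IGNORECASE)
    return [m.span() for m in pattern.finditer(text)]


def classify_keyword(text: str, keyword: str, phrases: list[str]) -> list[str]:
    """Label each keyword occurrence 'phrase' if it lies inside a known phrase, else 'alone'."""
    phrase_spans = [span for p in phrases for span in find_spans(text, p)]
    labels = []
    for start, end in find_spans(text, keyword):
        inside = any(ps <= start and end <= pe for ps, pe in phrase_spans)
        labels.append("phrase" if inside else "alone")
    return labels


if __name__ == "__main__":
    sample = "Regular expression matching is common; one expression can cover many URLs."
    print(classify_keyword(sample, "expression", ["regular expression"]))
```

Because of the `\b` anchors, inflected forms such as "expressions" are not counted as occurrences of the keyword in this sketch.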