2010
DOI: 10.1007/978-3-642-15364-8_1

Vi-DIFF: Understanding Web Pages Changes

Abstract: Nowadays, many applications are interested in detecting and discovering changes on the web to help users to understand page updates and more generally, the web dynamics. Web archiving is one of these fields where detecting changes on web pages is important. Archiving institutes are collecting and preserving different web site versions for future generation. A major problem encountered by archiving systems is to understand what happened between two versions of web pages. In this paper, we address this…

Citations: Cited by 9 publications (10 citation statements)
References: 13 publications
“…We also extract some features from the difference tree returned by the VI-DIFF algorithm [4] that detects some operations between the VIPS structures of versions, e.g. insertions, deletions or updates of VIPS blocks, or even a boolean value returning whether two versions have the same VIPS structure.…”
Section: Structural Descriptors (mentioning)
confidence: 99%
“…Most archivists only take into account the Web page source code (code string, DOM tree...) [2] and not the visual rendering [3,4,1]. However, the code may not be sufficient to describe the content of Web pages, e.g.…”
Section: Introduction (mentioning)
confidence: 99%
“…Changes between two page versions are detected by the Vi-DIFF algorithm [14]. First, Vi-DIFF extends a visual segmentation algorithm to partition the web page into multiple blocks.…”
Section: Importance Of a Version (mentioning)
confidence: 99%
“…The larger the number of significant changes occurred inside important blocks is, the higher the estimated importance of changes. For more details about the algorithm Vi-DIFF used to detect changes between two versions of pages, please refer to [14]. The estimator of the importance of changes is detailed in [2].…”
Section: Importance Of a Version (mentioning)
confidence: 99%
“…The first step of the analyzer consists on segmenting each captured pages into blocks that describe the hierarchical structure of the page. Then, successive versions of a same page are compared to detect structural 1 and content 2 changes by using Vi-DIFF algorithm [12]. Afterwards, the importance of changes between two successive versions is evaluated based on the estimator proposed in [1].…”
Section: Pattern-based Archiving (mentioning)
confidence: 99%
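
The citation statements above describe Vi-DIFF as comparing the block structures of two page versions and reporting insertions, deletions, and updates of blocks, plus a boolean for whether the two structures match. The following is only a rough sketch of that kind of output, not the Vi-DIFF algorithm itself: it assumes each block carries a stable identifier, whereas the real algorithm works on VIPS visual segmentation. The names `Block`, `flatten`, `shape`, and `diff_blocks` are hypothetical.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Block:
    """Hypothetical stand-in for a segmented page block (not the VIPS data model)."""
    block_id: str  # assumed stable across versions; an illustrative simplification
    text: str = ""
    children: list[Block] = field(default_factory=list)


def flatten(block, out=None):
    """Map every block id in the tree to its block."""
    out = {} if out is None else out
    out[block.block_id] = block
    for child in block.children:
        flatten(child, out)
    return out


def shape(block):
    """Nested tuple of ids, used to compare structure while ignoring text."""
    return (block.block_id, tuple(shape(c) for c in block.children))


def diff_blocks(old, new):
    """Return inserted, deleted, and updated block ids plus a same-structure flag."""
    old_map, new_map = flatten(old), flatten(new)
    return {
        "inserted": sorted(new_map.keys() - old_map.keys()),
        "deleted": sorted(old_map.keys() - new_map.keys()),
        "updated": sorted(
            k for k in old_map.keys() & new_map.keys()
            if old_map[k].text != new_map[k].text
        ),
        "same_structure": shape(old) == shape(new),
    }


# Two toy versions of a page: the body text changes and a footer block appears.
v1 = Block("page", children=[Block("header", "News"), Block("body", "Old story")])
v2 = Block("page", children=[
    Block("header", "News"),
    Block("body", "New story"),
    Block("footer", "Contact"),
])
result = diff_blocks(v1, v2)
print(result)
```

Matching blocks by identifier keeps the sketch short; a closer approximation would match blocks by position and visual or textual similarity, which is the harder part that the cited paper addresses.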