2017 IEEE Symposium on Computers and Communications (ISCC) 2017
DOI: 10.1109/iscc.2017.8024632

Spark-based Streamlined Metablocking

Cited by 5 publications
(8 citation statements)
References 5 publications
“…For WNP and CNP, special care is taken to avoid redundant comparisons in the restructured blocks, while Reciprocal WNP and CNP [115] apply an aggressive pruning that retains edges satisfying the pruning criteria in both adjacent node neighborhoods. WNP and WEP are combined through the weighted sum of their thresholds in [9]. Note that the notion of Meta-blocking covers established methods that were previously considered as Block Building ones.…”
Section: Comparison Cleaning (mentioning)
confidence: 99%
“…This strategy maximizes the efficiency of WEP and CEP, requiring just 2 and 3 jobs, respectively. Its adaptation to Apache Spark is presented in [9]. (iii) The entity-based strategy is independent of the blocking graph.…”
Section: Comparison Cleaning (mentioning)
confidence: 99%
“…Araújo et al. [59] have proposed a novel schema-agnostic pruning strategy called Global Weighted Node Pruning (GWNP) that combines a local threshold with a global one. The local threshold is computed for each profile as for the WNP, while the global one is computed as the average of all the edge weights.…”
Section: Related Work (mentioning)
confidence: 99%
“…This strategy aims to discard the edges with a low weight that connect only profiles with a very low local threshold. Compared to traditional WNP, GWNP improves precision by 0.01%, while achieving the same recall, on the DBpedia dataset [59]. Araújo et al. also discuss a Spark implementation for their strategy, which is based on the MapReduce parallel meta-blocking proposed in [13], and suffers from the same limitations (see Section 4.2.2).…”
Section: Related Work (mentioning)
confidence: 99%
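
The citation statements above describe the pruning idea only in outline: a per-profile (local) threshold as in WNP, a global threshold equal to the average of all edge weights, and a weighted sum of the two. The following Python sketch illustrates that outline on a toy blocking graph. It is not the authors' implementation: the mixing weight `alpha`, the rule that an edge survives if it passes the combined threshold of at least one endpoint, and all function and variable names are assumptions made for illustration, and the local threshold is taken to be the mean weight of a profile's incident edges.

```python
from collections import defaultdict


def average(values):
    """Arithmetic mean of an iterable; 0.0 for an empty input."""
    values = list(values)
    return sum(values) / len(values) if values else 0.0


def local_thresholds(edges):
    """WNP-style local threshold: mean weight of the edges incident to each profile."""
    incident = defaultdict(list)
    for (a, b), weight in edges.items():
        incident[a].append(weight)
        incident[b].append(weight)
    return {profile: average(weights) for profile, weights in incident.items()}


def global_threshold(edges):
    """Global threshold: mean weight over all edges of the blocking graph."""
    return average(edges.values())


def prune(edges, alpha=0.5):
    """Keep an edge if its weight reaches the combined threshold of either endpoint.

    ASSUMPTION: the combined threshold is alpha * local + (1 - alpha) * global,
    and passing for one endpoint is enough. alpha = 1.0 reduces to a plain
    local-threshold (WNP-like) rule.
    """
    glob = global_threshold(edges)
    local = local_thresholds(edges)
    kept = {}
    for (a, b), weight in edges.items():
        cut_a = alpha * local[a] + (1 - alpha) * glob
        cut_b = alpha * local[b] + (1 - alpha) * glob
        if weight >= cut_a or weight >= cut_b:
            kept[(a, b)] = weight
    return kept


# Toy blocking graph: keys are profile pairs, values are edge weights.
EDGES = {
    ("p1", "p2"): 0.8,
    ("p1", "p3"): 0.2,
    ("p2", "p3"): 0.4,
    ("p3", "p4"): 0.1,
}

if __name__ == "__main__":
    print("local only :", sorted(prune(EDGES, alpha=1.0)))
    print("combined   :", sorted(prune(EDGES, alpha=0.5)))
```

In this toy graph the edge `("p3", "p4")` survives the purely local rule only because `p4` has a very low local threshold; mixing in the global threshold removes it, which is the effect the second statement attributes to the strategy.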
“…For this reason, the development of efficient incremental blocking techniques able to handle streaming data appears as an open problem [13,15]. To improve efficiency and provide resources for incremental blocking techniques, parallel processing can be applied [16]. Parallel processing distributes the computational costs (i.e., to block entities) among the various resources (e.g., computers or virtual machines) of a computational infrastructure to reduce the overall execution time of blocking techniques.…”
Section: Introduction (mentioning)
confidence: 99%