2022 52nd Annual IEEE/IFIP International Conference on Dependable Systems and Networks - Supplemental Volume (DSN-S) 2022
DOI: 10.1109/dsn-s54099.2022.00012

An industrial perspective on web scraping characteristics and open issues

Abstract: An ongoing battle has been running for more than a decade between e-commerce website owners and web scrapers. Whenever one party finds a new technique to prevail, the other one comes up with a solution to defeat it. Based on our industrial experience, we know this problem is far from being solved. New solutions are needed to address automated threats. In this work, we will describe the actors taking part in the battle, the weapons at their disposal, and their allies on either side. We will present a real-worl…

Cited by 4 publications (4 citation statements)
References 8 publications
“…In [27], Li et al. show that more than half of the bot IP addresses they collected with their honeypots belong to residential networks. As explained in [16], current detection techniques struggle to block bots using such IPs.…”
Section: The Contributions Of This Paper Are Twofold (mentioning)
confidence: 99%
“…Nowadays, websites in different domains, such as e-commerce, ticketing, and social media, are engaged in a persistent fight against subtle but damaging actors: scraping bots. They generate a significant amount of traffic towards these websites, causing large financial losses, as explained in recent works [23,16].…”
Section: Introduction (mentioning)
confidence: 99%
“…In [1], Chiapponi et al. present the impact of residential IP proxies on web scraping campaigns. In [5], the same authors use the semantics of the received queries to group searches issued by different IP addresses, leading to the conclusion that …”
Section: Motivation (mentioning)
confidence: 99%
“…Scraping bots are a plague for online companies. They continuously query websites, increasing the costs for their owners without generating any revenue [1]. Commercial antibot solutions exist to counter this threat.…”
Section: Introduction (mentioning)
confidence: 99%