2012
DOI: 10.1177/0165551511435969

Web robot detection based on pattern-matching technique

Abstract: In web robot detection it is important to find features that are common characteristics of diverse robots, in order to differentiate between them and humans. Existing approaches employ fairly simple features (e.g. empty referrer field, interval between successive requests), which often fail to reflect web robots' behaviour accurately. False alarms may therefore occur unacceptably often. In this paper we propose a fresh approach that expresses the behaviour of interactive users and various web robots in term…
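
The two simple features that the abstract attributes to existing approaches can be illustrated with a short sketch. This is not the paper's pattern-matching method; the `Request` record, its field names, and the sample session are assumptions made purely for illustration.

```python
from dataclasses import dataclass
from statistics import mean
from typing import List, Optional


@dataclass
class Request:
    """One logged request (hypothetical record; field names are assumptions)."""
    timestamp: float          # seconds since the start of the log
    referrer: Optional[str]   # None, "" or "-" when the referrer field is empty


def empty_referrer_ratio(session: List[Request]) -> float:
    """Fraction of requests in the session that carry no referrer."""
    if not session:
        return 0.0
    empty = sum(1 for r in session if r.referrer in (None, "", "-"))
    return empty / len(session)


def mean_request_interval(session: List[Request]) -> float:
    """Mean gap in seconds between successive requests (0.0 if fewer than two)."""
    times = sorted(r.timestamp for r in session)
    gaps = [b - a for a, b in zip(times, times[1:])]
    return mean(gaps) if gaps else 0.0


# Made-up session: three of four requests have an empty referrer,
# and requests arrive exactly one second apart.
session = [
    Request(0.0, None),
    Request(1.0, "-"),
    Request(2.0, "http://example.com/"),
    Request(3.0, None),
]
print(empty_referrer_ratio(session))   # 0.75
print(mean_request_interval(session))  # 1.0
```

As the abstract notes, features this simple often fail to reflect robot behaviour accurately, so a fixed threshold on either value should be expected to produce false alarms.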

citations
Cited by 22 publications
(9 citation statements)
references
References 10 publications
“…Early detection techniques are based on syntactical log analysis (Kabe & Miyazaki, ; Community, ) and continue to serve as a useful way for identifying web robots (Huntington et al, ) that are known and have recognizable ip addresses and user‐agent strings. Traffic pattern analysis: Traffic‐based analysis techniques search for statistical contrasts between the characteristics of robot and human traffic. The methods find contrasts according to fixed expectations about robot and human behaviors (Jansen, Spink, & Saracevic, ; Guo et al, ; Geens et al, ; Lin, Quan, & Wu, ; Duskin & Feitelson, ; Hayati, Potdar, Talevski, & Smyth, ; Kwon, Kim, & Cha, ; Kwon et al, ; Bai, Xiong, Zhao, & He, ). For example, a traffic analysis technique may check how similar a session's navigational pattern is to a depth‐first or breadth‐first search of the hyperlinks of a site ‐ a pattern that an analyst may assume robot sessions would exhibit. Analytical learning techniques: Analytical learning techniques exploit the observed characteristics of the logged sessions to estimate the likelihood that a given session was generated by a robot with a machine learning algorithm (Doran & Gokhale, ).…”
Section: Perspective On Web Robot Detection
Citation type: mentioning
confidence: 99%
“…• Traffic pattern analysis: Traffic-based analysis techniques search for statistical contrasts between the characteristics of robot and human traffic. The methods find contrasts according to fixed expectations about robot and human behaviors (Jansen, Spink, & Saracevic, 2000; Guo et al, 2005; Geens et al, 2006; Lin, Quan, & Wu, 2008; Duskin & Feitelson, 2009; Hayati, Potdar, Talevski, & Smyth, 2010; Kwon, Kim, & Cha, 2012a; Kwon et al, 2012b; Bai, Xiong, Zhao, & He, 2014). For example, a traffic analysis technique may check how similar a session's navigational pattern is to a depth-first or breadth-first search of the hyperlinks of a site - a pattern that an analyst may assume robot sessions would exhibit.…”
Section: Perspective On Web Robot Detection
Citation type: mentioning
confidence: 99%
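
The breadth-first comparison mentioned in the statement above could be sketched roughly as follows. The scoring rule (longest common subsequence between the session's first-visit order and a breadth-first order, divided by the number of distinct pages) is an assumption chosen for illustration and is not taken from any of the cited papers; the `site` link graph and both sessions are made up.

```python
from collections import deque
from typing import Dict, List


def bfs_order(links: Dict[str, List[str]], start: str) -> List[str]:
    """Pages in the order a breadth-first crawl from `start` first reaches them."""
    seen = {start}
    order = [start]
    queue = deque([start])
    while queue:
        page = queue.popleft()
        for target in links.get(page, []):
            if target not in seen:
                seen.add(target)
                order.append(target)
                queue.append(target)
    return order


def lcs_length(a: List[str], b: List[str]) -> int:
    """Length of the longest common subsequence of a and b (dynamic programming)."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            curr.append(prev[j - 1] + 1 if x == y else max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def bfs_similarity(links: Dict[str, List[str]], session_pages: List[str]) -> float:
    """Share of the session's distinct pages that follow breadth-first order.

    Hypothetical score: 1.0 means the first visits match a breadth-first crawl
    started at the session's first page.
    """
    visited = list(dict.fromkeys(session_pages))  # drop revisits, keep first-seen order
    if not visited:
        return 0.0
    expected = bfs_order(links, visited[0])
    return lcs_length(visited, expected) / len(visited)


# Made-up hyperlink graph: page -> pages it links to.
site = {"/": ["/a", "/b"], "/a": ["/a1", "/a2"], "/b": ["/b1"]}

crawler = ["/", "/a", "/b", "/a1", "/a2", "/b1"]  # level-by-level traversal
visitor = ["/", "/a", "/a1", "/", "/b"]           # follows one branch, then returns

print(bfs_similarity(site, crawler))  # 1.0
print(bfs_similarity(site, visitor))  # 0.75
```

On a graph this small the two scores are close, which mirrors the caveat in the quoted text: the contrast depends on a fixed expectation that robot sessions traverse a site breadth-first or depth-first.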
“…Hidden links methods can easily capture web robots, but for some of the more intelligent robots or robots only for specific file such as video, picture [3rd International Conference on Material, Mechanical and Manufacturing Engineering (IC3ME 2015)] are often not achieve very good detection results. c) access feature analysis methods [3,4,5]. These methods identify robot access by analyzing the characteristics of Web access to find the different of person and program [6].…”
Section: Introduction
Citation type: mentioning
confidence: 99%
“…[9] They send requests to web servers to procure resources.[10] Some robots are developed with malicious intent and are designed to download entire websites for the purpose of copying the site,[11] for autonomous logins to send spam,[12] or for autonomous logins to steal confidential or copyright protected material.[13] Web robots specifically designed for the illegal procurement of copyright protected content are obviously of particular concern for libraries.…”
Section: Introduction
Citation type: mentioning
confidence: 99%