2022
DOI: 10.48550/arxiv.2201.12416
Preprint
Discovering Exfiltration Paths Using Reinforcement Learning with Attack Graphs

Abstract: Reinforcement learning (RL), in conjunction with attack graphs and cyber terrain, is used to develop the reward and state spaces for determining optimal data-exfiltration paths in enterprise networks. This work builds on previous crown jewels (CJ) identification, which focused on computing optimal paths that adversaries may traverse toward compromising CJs or hosts in their proximity. This work inverts the previous CJ approach based on the assumption that data has been stolen…
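The core idea in the abstract, an RL agent learning an optimal path through an attack graph toward an egress point, can be sketched with tabular Q-learning on a toy graph. The topology, node names, rewards, and hyperparameters below are illustrative assumptions, not the paper's actual model:

```python
# Hypothetical sketch: tabular Q-learning over a toy attack graph to find an
# exfiltration path from a compromised host to an external egress node.
import random

random.seed(0)

# Toy attack graph: node -> reachable neighbors (all names are invented).
graph = {
    "workstation": ["file_server", "dmz_proxy"],
    "file_server": ["dmz_proxy", "db_server"],
    "db_server": ["dmz_proxy"],
    "dmz_proxy": ["internet"],
    "internet": [],  # terminal: data has left the enterprise network
}
GOAL = "internet"

Q = {(s, a): 0.0 for s in graph for a in graph[s]}
alpha, gamma, eps = 0.5, 0.9, 0.2  # learning rate, discount, exploration

def step(state, action):
    # Reward shaping: small per-hop cost, large bonus on reaching egress.
    reward = 10.0 if action == GOAL else -1.0
    return action, reward

for _ in range(500):
    s = "workstation"
    while s != GOAL:
        acts = graph[s]
        # Epsilon-greedy action selection over outgoing edges.
        a = random.choice(acts) if random.random() < eps else max(acts, key=lambda x: Q[(s, x)])
        s2, r = step(s, a)
        future = max((Q[(s2, x)] for x in graph[s2]), default=0.0)
        Q[(s, a)] += alpha * (r + gamma * future - Q[(s, a)])
        s = s2

# Greedy rollout recovers the learned exfiltration path.
path, s = ["workstation"], "workstation"
while s != GOAL:
    s = max(graph[s], key=lambda x: Q[(s, x)])
    path.append(s)
print(path)  # shortest route to egress: workstation -> dmz_proxy -> internet
```

The per-hop penalty makes shorter paths more valuable, so the greedy policy converges on the direct two-hop route rather than detouring through the file server.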

Cited by 1 publication (5 citation statements) | References 18 publications
“…Another line of research focuses on developing more specific penetration testing tasks. A number of authors define more specific tasks by reward engineering and other modifications to the MDP including formulations of capture the flag [22], crown jewel analysis [16], and discovering exfiltration paths [17]. This paper extends this line of research with a methodology for exposing SDR.…”
Section: Related Work
confidence: 97%
“…Hu et al extend the use of the CVSS by proposing to weight rewards with exploitability scores [14]. Gangupantulu et al [15], [16] and Cody et al [17] explicitly extend the methods of Hu et al with concepts of terrain. Gangupantulu et al advocate defining models of terrain in terms of the rewards and transition probabilities of MDPs, first in the case of firewalls as obstacles [15], then in the case of lateral pivots near key terrain [16].…”
Section: Related Work
confidence: 99%
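The two ideas in the statement above, CVSS exploitability scores weighting rewards and firewall "terrain" lowering transition probabilities, can be sketched as follows. The CVE labels, scores, topology, and penalty constants are invented for illustration and do not come from the cited papers:

```python
# Hypothetical sketch of the cited ideas: CVSS exploitability sub-scores
# weight per-edge rewards, and firewall terrain reduces the MDP's
# transition probabilities. All concrete values here are assumptions.

# Assumed CVSS v3 exploitability sub-scores (0 to 3.9 scale) per exploit.
exploitability = {"CVE-A": 3.9, "CVE-B": 1.2}

# Edge in the attack graph: (src, dst, exploit, crosses_firewall).
edges = [
    ("host1", "host2", "CVE-A", False),
    ("host1", "host3", "CVE-B", True),
]

def edge_reward(exploit, base=-1.0):
    # Easier exploits (higher exploitability) are penalized less,
    # in the spirit of Hu et al.'s CVSS-weighted rewards.
    return base + exploitability[exploit] / 3.9

def transition_prob(crosses_firewall, p=0.9, firewall_penalty=0.5):
    # Terrain as obstacle: crossing a firewall halves the assumed
    # success probability of the transition.
    return p * firewall_penalty if crosses_firewall else p
```

Encoding terrain in the transition function rather than only in rewards means the planner accounts for the *chance* an action fails at an obstacle, not just its cost.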