2021
DOI: 10.1007/s42979-021-00507-w

Real-Time Detection of Dictionary DGA Network Traffic Using Deep Learning

Abstract: Botnets and malware continue to avoid detection by static rule engines when using domain generation algorithms (DGAs) for callouts to unique, dynamically generated web addresses. Common DGA detection techniques fail to reliably detect DGA variants that combine random dictionary words to create domain names that closely mirror legitimate domains. To combat this, we created a novel hybrid neural network, Bilbo the “bagging” model, that analyses domains and scores the likelihood they are generated by such algorit…

Cited by 44 publications (13 citation statements)
References 30 publications
“…Researches in this regard have turned to leveraging Machine Learning techniques to detect bot activity at different stages of bot infection, i.e., propagation, rallying and post-infection behavior. Highnam et al. [14] targeted identification of bot malwares that used Domain-Generation-Algorithms (DGA) based domain names for finding its respective C&C. Such malware creates anomalous DNS traffic during the rallying phase. They leveraged the deterministic nature of such algorithms and trained a deep neural network composed of LSTM, CNN and ANN in order to identify whether a particular host was making DNS calls for domains that were DGA generated.…”
Section: Bots and Botnets
Citation type: mentioning
confidence: 99%
“…The system ran over a single CPU of 2.00 GHz, and it only needed 3.3% of its capacity per device to be monitored, with a false positive rate of about 0.13%. Highnam et al. [22] developed Bilbo the "bagging" model, which combined two neural networks, a CNN and an LSTM, to determine whether a URL is legitimate or generated with a DGA. In their experimentation on four hours of real traffic, Bilbo discovered five potential botnets that commercial tools did not warn about.…”
Section: State of the Art
Citation type: mentioning
confidence: 99%
“…The following subsections provide more details on the ML models in Section 3.1, on the DL models in Section 3.2, on other methods in Section 3.3, and on the datasets used in the reviewed studies in Section 3.4.

Table fragment (study | model | datasets):
[37] | RNN | Alexa/DGArchive (63 DGAs), Bambenek (11 DGAs)
Koh and Rhodes [38] | LSTM | OpenDNS/Bader, Abakumov
Tran et al. [39] | LSTM.MI | Alexa/Bambenek (37 DGAs)
Vinayakumar et al. [40] | LSTM, GRU, IRNN, RNN, CNN, hybrid (CNN-LSTM) | Alexa, OpenDNS/Bambenek, Bader (17 DGAs)
Xu et al. [41] | CNN-based | Alexa/DGArchive (16 DGAs)
Yu et al. [42] | LSTM, BiLSTM, stacked CNN, parallel CNN, hybrid (CNN-LSTM) | Alexa/Bambenek
Akarsh et al. [43] | LSTM | OpenDNS, Alexa/20 public DGAs
Qiao et al. [44] | LSTM | Alexa/Bambenek
Liu et al. [45] | Hybrid (BiLSTM-CNN) | Alexa/Netlab (50 DGAs), Bambenek (30 DGAs)
Ren et al. [46] | CNN, LSTM, CNN-BiLSTM, ATT-CNN-BiLSTM, SVM | Alexa/Bambenek, Netlab (19 DGAs)
Sivaguru et al. [31] | hybrid (RF-LSTM.MI) | Alexa, private/DGArchive
Vij et al. [47] | LSTM | Alexa/11 DGAs
Cucchiarelli et al. [34] | BiLSTM, LSTM.MI, hybrid (CNN-BiLSTM) | Alexa/Netlab (25 DGAs)
Highnam et al. [48] | hybrid (CNN-LSTM-ANN) | Alexa/DGArchive (3 DGAs)
Namgung et al. [49] | CNN, LSTM, BiLSTM, hybrid (CNN-BiLSTM) | Alexa/Bambenek
Yilmaz et al. [50] | LSTM | Majestic/DGArchive (68 DGAs)

Table fragment (study | year | datasets):
[53] | 2020 | Alexa/various
Yan et al. [54] | 2020 | Passive DNS data/public blacklists
Yin et al. [55] | 2020 | Alexa/Bader (19 DGAs)…”
Section: Literature Review
Citation type: mentioning
confidence: 99%
“…Highnam et al. [48] use a hybrid CNN-LSTM-ANN model. The output of the embedding layer is passed to separate LSTM and CNN models in parallel.…”
Section: Hybrid CNN-RNN Models
Citation type: mentioning
confidence: 99%