2022
DOI: 10.48550/arxiv.2204.06885
Preprint
Shedding New Light on the Language of the Dark Web

Abstract: The hidden nature and limited accessibility of the Dark Web, combined with the lack of public datasets in this domain, make it difficult to study its inherent characteristics, such as its linguistic properties. Previous work on text classification in the Dark Web domain has suggested that deep neural models may be ineffective, potentially due to the linguistic differences between the Dark and Surface Webs. However, little work has been done to uncover the linguistic characteristics of the Dark Web. T…

Cited by 1 publication (2 citation statements)
References 17 publications (32 reference statements)
“…With the advances in NLP, there has been considerable research in the field of AA demonstrating the success of TF-IDF-based clustering and classification techniques (Agarwal et al, 2019; İzzet Bozkurt et al, 2007), CNNs (Rhodes, 2015; Shrestha et al, 2017), RNNs (Zhao et al, 2018; Jafariakinabad et al, 2019; Gupta et al, 2019), and transformer architectures (Fabien et al, 2020; Ordoñez et al, 2020; Uchendu et al, 2020a). Moreover, researchers have observed a significant difference in language structure between Darknet and Surface Web sites (Choshen et al, 2019; Jin et al, 2022). Therefore, exploring the application of authorship tasks to Darknet language is crucial.…”
Section: Related Research
confidence: 99%
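The TF-IDF-based attribution techniques cited above can be illustrated with a minimal sketch. This is not the setup of any cited paper: the toy corpus, whitespace tokenizer, smoothed IDF, and 1-nearest-neighbour cosine rule are all illustrative assumptions.

```python
# Minimal TF-IDF authorship-attribution sketch (standard library only).
# Corpus, labels, and tokenization are illustrative assumptions.
import math
from collections import Counter

def tokenize(text):
    return text.lower().split()

def tfidf_vectors(docs):
    """Return one {term: tf-idf weight} dict per document."""
    tokenized = [tokenize(d) for d in docs]
    n = len(tokenized)
    df = Counter()                       # document frequency per term
    for toks in tokenized:
        df.update(set(toks))
    vectors = []
    for toks in tokenized:
        tf = Counter(toks)
        vec = {t: (c / len(toks)) * math.log((1 + n) / (1 + df[t]))
               for t, c in tf.items()}   # smoothed IDF keeps weights finite
        vectors.append(vec)
    return vectors

def cosine(a, b):
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    na = math.sqrt(sum(w * w for w in a.values()))
    nb = math.sqrt(sum(w * w for w in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def attribute(train_docs, train_authors, query):
    """1-nearest-neighbour attribution by cosine similarity of TF-IDF vectors."""
    vecs = tfidf_vectors(train_docs + [query])
    qvec = vecs[-1]
    scores = [cosine(qvec, v) for v in vecs[:-1]]
    return train_authors[scores.index(max(scores))]
```

In practice the cited works use richer features and learned classifiers; this sketch only shows why term-weighting alone can separate authors whose vocabularies differ.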
“…To aid LEA, we perform supervised pre-training by conducting multiclass classification in a closed-set environment (Zhou et al, 2021a) to analyze the different writing styles in text ads and classify vendor migrants into unique vendor accounts across three Darknet markets. Moreover, researchers have observed a significant difference in language structure between Darknet and Surface Web sites (Choshen et al, 2019; Jin et al, 2022). Since most contextualized models are trained on Surface Web data, the supervised pre-training step allows our model to adapt to Darknet market domain knowledge.…”
Section: Introduction
confidence: 99%
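The closed-set multiclass setup described above — every ad must be assigned to one of a fixed, known set of vendor accounts — can be sketched in miniature. The cited work uses supervised pre-training of a contextualized model; here, as a loudly labeled substitute, character n-gram profiles with a nearest-centroid rule stand in for the learned style representation, and the ads and vendor names are invented.

```python
# Hedged sketch of closed-set vendor classification by writing style.
# Character n-gram centroids replace the learned model of the cited work;
# all data and vendor names are illustrative assumptions.
from collections import Counter, defaultdict

def char_ngrams(text, n=3):
    text = text.lower()
    return Counter(text[i:i + n] for i in range(len(text) - n + 1))

class ClosedSetVendorClassifier:
    """Assigns each ad to one of a fixed, known set of vendor accounts."""

    def __init__(self):
        self.centroids = {}

    def fit(self, ads, vendors):
        grouped = defaultdict(Counter)
        for ad, vendor in zip(ads, vendors):
            grouped[vendor].update(char_ngrams(ad))
        # Normalize each vendor profile to relative n-gram frequencies.
        for vendor, counts in grouped.items():
            total = sum(counts.values())
            self.centroids[vendor] = {g: c / total for g, c in counts.items()}
        return self

    def predict(self, ad):
        profile = char_ngrams(ad)
        total = sum(profile.values())
        q = {g: c / total for g, c in profile.items()}

        def overlap(centroid):
            # Histogram intersection between query and vendor profile.
            return sum(min(q.get(g, 0.0), w) for g, w in centroid.items())

        return max(self.centroids, key=lambda v: overlap(self.centroids[v]))
```

The closed-set assumption is what makes this pre-training step well-posed: the label space is fixed in advance, so a standard multiclass objective applies, and the resulting style representation can later be reused for harder open-set vendor-linking tasks.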