2022
DOI: 10.48550/arxiv.2204.06885
Preprint
Shedding New Light on the Language of the Dark Web

Abstract: The hidden nature and limited accessibility of the Dark Web, combined with the lack of public datasets in this domain, make it difficult to study its inherent characteristics, such as its linguistic properties. Previous work on text classification in the Dark Web domain has suggested that deep neural models may be ineffective, potentially due to the linguistic differences between the Dark and Surface Webs. However, little work has been done to uncover the linguistic characteristics of the Dark Web. T…

Cited by 1 publication (2 citation statements)
References 17 publications (32 reference statements)
“…With the advances in NLP, there has been considerable research in the field of AA demonstrating the success of TF-IDF-based clustering and classification techniques (Agarwal et al, 2019; İzzet Bozkurt et al, 2007), CNNs (Rhodes, 2015; Shrestha et al, 2017), RNNs (Zhao et al, 2018; Jafariakinabad et al, 2019; Gupta et al, 2019), and transformer architectures (Fabien et al, 2020; Ordoñez et al, 2020; Uchendu et al, 2020a). Moreover, researchers have observed a significant difference in language structure between Darknet and Surface Web sites (Choshen et al, 2019; Jin et al, 2022). Therefore, exploring the application of authorship tasks to Darknet language is crucial.…”
Section: Related Research
confidence: 99%
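The TF-IDF-based attribution techniques cited above can be illustrated with a minimal sketch. This is not the setup of any cited paper: the toy corpus, whitespace tokenizer, smoothed IDF, and 1-nearest-neighbour cosine rule are all illustrative assumptions.

```python
# Minimal TF-IDF authorship-attribution sketch (standard library only).
# Corpus, labels, and tokenization are illustrative assumptions.
import math
from collections import Counter

def tokenize(text):
    return text.lower().split()

def tfidf_vectors(docs):
    """Return one {term: tf-idf weight} dict per document."""
    tokenized = [tokenize(d) for d in docs]
    n = len(tokenized)
    df = Counter()                       # document frequency per term
    for toks in tokenized:
        df.update(set(toks))
    vectors = []
    for toks in tokenized:
        tf = Counter(toks)
        vec = {t: (c / len(toks)) * math.log((1 + n) / (1 + df[t]))
               for t, c in tf.items()}   # smoothed IDF keeps weights finite
        vectors.append(vec)
    return vectors

def cosine(a, b):
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    na = math.sqrt(sum(w * w for w in a.values()))
    nb = math.sqrt(sum(w * w for w in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def attribute(train_docs, train_authors, query):
    """1-nearest-neighbour attribution by cosine similarity of TF-IDF vectors."""
    vecs = tfidf_vectors(train_docs + [query])
    qvec = vecs[-1]
    scores = [cosine(qvec, v) for v in vecs[:-1]]
    return train_authors[scores.index(max(scores))]
```

In practice the cited works use richer features and learned classifiers; this sketch only shows why term-weighting alone can separate authors whose vocabularies differ.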
“…To aid LEA, we perform supervised pre-training by conducting multiclass classification in a closed-set environment (Zhou et al, 2021a) to analyze the different writing styles in text ads and classify vendor migrants into unique vendor accounts across three Darknet markets. Moreover, researchers have observed a significant difference in language structure between Darknet and Surface Web sites (Choshen et al, 2019; Jin et al, 2022). Since most contextualized models are trained on Surface Web data, the supervised pre-training step allows our model to adapt to Darknet market domain knowledge.…”
Section: Introduction
confidence: 99%
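The closed-set multiclass setup described above — every ad must be assigned to one of a fixed, known set of vendor accounts — can be sketched in miniature. The cited work uses supervised pre-training of a contextualized model; here, as a loudly labeled substitute, character n-gram profiles with a nearest-centroid rule stand in for the learned style representation, and the ads and vendor names are invented.

```python
# Hedged sketch of closed-set vendor classification by writing style.
# Character n-gram centroids replace the learned model of the cited work;
# all data and vendor names are illustrative assumptions.
from collections import Counter, defaultdict

def char_ngrams(text, n=3):
    text = text.lower()
    return Counter(text[i:i + n] for i in range(len(text) - n + 1))

class ClosedSetVendorClassifier:
    """Assigns each ad to one of a fixed, known set of vendor accounts."""

    def __init__(self):
        self.centroids = {}

    def fit(self, ads, vendors):
        grouped = defaultdict(Counter)
        for ad, vendor in zip(ads, vendors):
            grouped[vendor].update(char_ngrams(ad))
        # Normalize each vendor profile to relative n-gram frequencies.
        for vendor, counts in grouped.items():
            total = sum(counts.values())
            self.centroids[vendor] = {g: c / total for g, c in counts.items()}
        return self

    def predict(self, ad):
        profile = char_ngrams(ad)
        total = sum(profile.values())
        q = {g: c / total for g, c in profile.items()}

        def overlap(centroid):
            # Histogram intersection between query and vendor profile.
            return sum(min(q.get(g, 0.0), w) for g, w in centroid.items())

        return max(self.centroids, key=lambda v: overlap(self.centroids[v]))
```

The closed-set assumption is what makes this pre-training step well-posed: the label space is fixed in advance, so a standard multiclass objective applies, and the resulting style representation can later be reused for harder open-set vendor-linking tasks.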