“…Several heuristics exist for change address identification. The most straightforward approach is to consider the sole new address among the two output addresses of a transaction as the change address [48].…”
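The heuristic quoted above can be sketched in a few lines. This is a minimal illustration, not the cited work's implementation: the function name `find_change_address`, the plain-string address representation, and the `seen_addresses` set (addresses that appeared on-chain before this transaction) are all assumptions made for the example.

```python
from typing import Optional, Sequence, Set


def find_change_address(
    output_addresses: Sequence[str],
    seen_addresses: Set[str],
) -> Optional[str]:
    """Return the likely change address, or None if the heuristic does not apply.

    Hypothetical helper: applies only to transactions with exactly two
    outputs, of which exactly one address has never been seen before.
    """
    if len(output_addresses) != 2:
        return None
    # Addresses that have not appeared in any earlier transaction.
    new_addresses = [a for a in output_addresses if a not in seen_addresses]
    if len(new_addresses) == 1:
        return new_addresses[0]
    # Both outputs are new, or both are known: the heuristic is inconclusive.
    return None


# Usage with made-up address labels.
seen = {"addr_A", "addr_B"}
candidate = find_change_address(["addr_B", "addr_C"], seen)
```

Returning `None` when both or neither output is new keeps the sketch conservative: it reports a change address only when the condition in the quoted sentence holds exactly.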
Blockchain technology is relatively young but has the potential to disrupt several industries. Since the emergence of Bitcoin, also known as Blockchain 1.0, there has been significant interest in this technology. The introduction of Ethereum, or Blockchain 2.0, has expanded the types of data that can be stored on blockchain networks. The increasing popularity of blockchain technology has given rise to new challenges, such as user privacy and illicit financial activities, but has also facilitated technical advancements. Blockchain technology uses cryptographic hashes of user input to record transactions. The public availability of blockchain data gives academics an unprecedented opportunity to analyze it and gain a better understanding of the challenges in blockchain communications. It is therefore crucial to highlight the research problems, accomplishments, and potential trends and challenges in blockchain network data analysis and communications. This article examines and summarizes the field of blockchain data analysis and communications. The review encompasses the fundamental data types, analytical techniques, architecture, and operations related to blockchain networks. Seven research challenges are addressed: entity recognition, privacy, risk analysis, network visualization, network structure, market impact, and transaction pattern recognition. The latter half of this section discusses future research directions, opportunities, and challenges based on previous research limitations.
Deanonymization is one of the major research challenges in the Bitcoin blockchain, as entities are pseudonymous and cannot be identified from the on-chain data. Various approaches exist to identify multiple addresses of the same entity, i.e., address clustering. But it is known that these approaches tend to find several clusters for the same actor. In this work, we propose to assign a fingerprint to entities based on the dynamic graph of the taint flow of money originating from them, with the idea that we could identify multiple clusters of addresses belonging to the same entity as having similar fingerprints. We experiment with different configurations to generate substructure patterns from taint flows before embedding them using representation learning models. To evaluate our method, we train classification models to identify entities from their fingerprints. Experiments show that our approach can accurately classify entities on three datasets. We compare different fingerprint strategies and show that including the temporality of transactions improves classification accuracy and that following the flow for too long impairs performance. Our work demonstrates that out-flow fingerprinting is a valid approach for recognizing multiple clusters of the same entity.
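To make the idea of following a taint flow for a bounded distance concrete, the sketch below collects the transfers reachable from a source entity within a fixed number of hops and orders them by time. It is a simplified illustration under stated assumptions, not the authors' method: the `(sender, receiver, timestamp)` edge format, the function name `taint_flow_edges`, and the `max_hops` cutoff are hypothetical, and the sketch ignores transferred amounts and does not check that each transfer happens after the funds were received.

```python
from collections import deque
from typing import Dict, List, Tuple

# Hypothetical edge format: (sender, receiver, timestamp).
Edge = Tuple[str, str, int]


def taint_flow_edges(edges: List[Edge], source: str, max_hops: int) -> List[Edge]:
    """Collect transfers reachable from `source` within `max_hops` hops.

    Breadth-first traversal over outgoing transfers; the result is
    sorted by timestamp so that temporal order is preserved.
    """
    outgoing: Dict[str, List[Edge]] = {}
    for edge in edges:
        outgoing.setdefault(edge[0], []).append(edge)

    hop_distance = {source: 0}  # node -> number of hops from the source
    queue = deque([source])
    flow: List[Edge] = []
    while queue:
        node = queue.popleft()
        hops = hop_distance[node]
        if hops >= max_hops:
            continue  # stop following the flow beyond the cutoff
        for edge in outgoing.get(node, []):
            flow.append(edge)
            receiver = edge[1]
            if receiver not in hop_distance:
                hop_distance[receiver] = hops + 1
                queue.append(receiver)
    return sorted(flow, key=lambda e: e[2])


# Usage with made-up entities and timestamps.
example_edges = [("a", "b", 1), ("b", "c", 2), ("c", "d", 3), ("x", "y", 0)]
two_hop_flow = taint_flow_edges(example_edges, "a", max_hops=2)
```

The hop cutoff mirrors the abstract's observation that following the flow for too long impairs performance; the resulting time-ordered edge list is the kind of input from which substructure patterns could then be extracted and embedded.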