Kunze Wang scite author profile

Kunze Wang

5 Publications

31 Citation Statements Received

64 Citation Statements Given

Affiliations

University of Sydney

Publications

Detect All Abuse! Toward Universal Abusive Language Detection Models

Wang, Han, et al. 2020

Online abusive language detection (ALD) has become a societal issue of increasing importance in recent years. Several previous works in online ALD focused on solving a single abusive language problem in a single domain, like Twitter, and have not been successfully transferable to the general ALD task or domain. In this paper, we introduce a new generic ALD framework, MACAS, which is capable of addressing several types of ALD tasks across different domains. Our generic framework covers multi-aspect abusive language embeddings that represent the target and content aspects of abusive language and applies a textual graph embedding that analyses the user's linguistic behaviour. Then, we propose and use the cross-attention gate flow mechanism to embrace multiple aspects of abusive language. Quantitative and qualitative evaluation results show that our ALD algorithm rivals or exceeds the six state-of-the-art ALD algorithms across seven ALD datasets covering multiple aspects of abusive language and different online community domains. The code can be downloaded from https://github.com/usydnlp/MACAS.

InducT-GCN: Inductive Graph Convolutional Networks for Text Classification

Wang, Han, Poon 2022

CONDA: a CONtextual Dual-Annotated dataset for in-game toxicity understanding and detection

Weld, Huang, Lee, et al. 2021

Traditional toxicity detection models have focused on the single utterance level without deeper understanding of context. We introduce CONDA, a new dataset for in-game toxic language detection enabling joint intent classification and slot filling analysis, which is the core task of Natural Language Understanding (NLU). The dataset consists of 45K utterances from 12K conversations from the chat logs of 1.9K completed Dota 2 matches. We propose a robust dual semantic-level toxicity framework, which handles utterance and token-level patterns, and rich contextual chatting history. Accompanying the dataset is a thorough in-game toxicity analysis, which provides comprehensive understanding of context at utterance, token, and dual levels. Inspired by NLU, we also apply its metrics to the toxicity detection tasks for assessing toxicity and game-specific aspects. We evaluate strong NLU models on CONDA, providing fine-grained results for different intent classes and slot classes. Furthermore, we examine the coverage of toxicity nature in our dataset by comparing it with other toxicity datasets.

ME-GCN: Multi-dimensional Edge-Embedded Graph Convolutional Networks for Semi-supervised Text Classification

Wang, Han, Long, et al. 2022

Preprint

Compared to sequential learning models, graph-based neural networks exhibit excellent ability in capturing global information and have been used for semi-supervised learning tasks. Most Graph Convolutional Networks are designed with the single-dimensional edge feature and failed to utilise the rich edge information about graphs. This paper introduces the ME-GCN (Multi-dimensional Edge-enhanced Graph Convolutional Networks) for semi-supervised text classification. A text graph for an entire corpus is firstly constructed to describe the undirected and multi-dimensional relationship of word-to-word, document-document, and word-to-document. The graph is initialised with corpus-trained multi-dimensional word and document node representation, and the relations are represented according to the distance of those words/documents nodes. Then, the generated graph is trained with ME-GCN, which considers the edge features as multi-stream signals, and each stream performs a separate graph convolutional operation. Our ME-GCN can integrate a rich source of graph edge information of the entire text corpus. The results have demonstrated that our proposed model has significantly outperformed the state-of-the-art methods across eight benchmark datasets. The code is available on: https://github.com/usydnlp/ME GCN

Understanding Graph Convolutional Networks for Text Classification

Han, Yuan, Wang, et al. 2022

Preprint

scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations: citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.

Contact Info

hi@scite.ai

10624 S. Eastern Ave., Ste. A-614

Henderson, NV 89052, USA

Part of the Research Solutions Family.