Xianyang Song scite author profile

Xianyang Song

3 Publications

4 Citation Statements Received

47 Citation Statements Given

Affiliations

Northeast Forestry University

Publications

Building Dialogue Understanding Models for Low-resource Language Indonesian from Scratch

Di, Song, Zhang et al. 2023

ACM Trans. Asian Low-Resour. Lang. Inf. Process.

Using off-the-shelf resources from resource-rich languages to transfer knowledge to low-resource languages has received a lot of attention. However, the requirements for a model to reach reliable performance, including the scale of annotated data required and an effective framework, are not well understood. To address the first question, we empirically investigate the cost-effectiveness of several methods for training intent classification and slot-filling models from scratch in Indonesian (ID) using English data. To address the second, we propose a Bi-Confidence-Frequency Cross-Lingual transfer framework (BiCF), which consists of “BiCF Mixing”, “Latent Space Refinement”, and “Joint Decoder”, to overcome the lack of dialogue data in the low-resource language. BiCF Mixing, based on a word-level alignment strategy, generates code-mixed data using importance-frequency and translating-confidence. Latent Space Refinement then trains a new dialogue understanding model using the code-mixed data and word embedding models. The Joint Decoder, based on a Bidirectional LSTM (BiLSTM) and a Conditional Random Field (CRF), produces the intent classification and slot-filling results. We also release a large-scale, finely labeled Indonesian dialogue dataset (ID-WOZ) and ID-BERT for experiments. BiCF achieves F1 scores of 93.56% and 85.17% on intent classification and slot filling, respectively. Extensive experiments demonstrate that our framework performs reliably and cost-efficiently on different scales of manually annotated Indonesian data.
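As a rough illustration of what word-level code-mixing can look like, the sketch below replaces an English token with an Indonesian translation only when both a translation-confidence score and an importance score clear a threshold. This is not the BiCF Mixing procedure from the paper: the function name `code_mix`, the threshold rule, the parameter names, and the toy scores are all assumptions made for illustration.

```python
def code_mix(tokens, lexicon, importance, min_conf=0.8, min_importance=0.5):
    """Return a code-mixed copy of `tokens`.

    lexicon maps a source word to (translation, confidence).
    importance maps a source word to a frequency-derived importance score.
    Both tables and both thresholds are hypothetical inputs.
    """
    mixed = []
    for tok in tokens:
        entry = lexicon.get(tok)
        confident = entry is not None and entry[1] >= min_conf
        important = importance.get(tok, 0.0) >= min_importance
        # Swap in the translation only when both criteria hold;
        # otherwise keep the original source-language token.
        mixed.append(entry[0] if confident and important else tok)
    return mixed


# Toy values, invented for the example.
lexicon = {"book": ("pesan", 0.9), "cheap": ("murah", 0.6)}
importance = {"book": 0.9, "cheap": 0.9}
mixed_sentence = code_mix(["book", "a", "cheap", "hotel"], lexicon, importance)
```

Here "book" is replaced because both scores pass, while "cheap" is kept because its confidence of 0.6 falls below the assumed 0.8 threshold.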

geoGAT: Graph Model Based on Attention Mechanism for Geographic Text Classification

Jing, Song et al. 2021

ACM Trans. Asian Low-Resour. Lang. Inf. Process.

In geographic information processing, there is little research on geographic text classification, and applications of this task to Chinese are rarer still. In this work, we implement a method to extract text containing geographical entities from a large volume of online text. The geographic information in these texts is of great practical significance to transportation, urban and rural planning, disaster relief, and other fields. We use a graph convolutional neural network with an attention mechanism to do this. The graph attention network (GAT) is an improvement on the graph convolutional network (GCN): compared with GCN, GAT uses an attention mechanism to weight the sum of the features of adjacent vertices. In addition, we construct a Chinese dataset with geographic classification labels from multiple Chinese text classification datasets. geoGAT reaches a Macro-F score of 95% on the new Chinese dataset.
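The attention-weighted sum over adjacent vertices mentioned in the abstract can be sketched for a single node as follows. This follows the standard single-head GAT formulation (a shared linear map, a LeakyReLU score on concatenated features, and a softmax over the neighbors) and is not taken from the geoGAT code; the function names, the weight matrix `W`, and the attention vector `a` are placeholders chosen for the example.

```python
import math


def leaky_relu(x, slope=0.2):
    return x if x > 0 else slope * x


def matvec(W, h):
    """Multiply matrix W (list of rows) by vector h."""
    return [sum(w * x for w, x in zip(row, h)) for row in W]


def gat_aggregate(i, neighbors, feats, W, a):
    """Return (attention weights, new feature vector) for node i.

    neighbors lists the vertices attended to (usually including i itself).
    feats maps a vertex id to its feature vector.
    """
    # Shared linear transform of every vertex involved.
    z = {j: matvec(W, feats[j]) for j in set(neighbors) | {i}}
    # Unnormalized score e_ij = LeakyReLU(a . [z_i || z_j]).
    scores = [
        leaky_relu(sum(p * q for p, q in zip(a, z[i] + z[j])))
        for j in neighbors
    ]
    # Softmax over the neighborhood (shifted by the max for stability).
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    alphas = [e / total for e in exps]
    # Attention-weighted sum of the transformed neighbor features.
    dim = len(z[i])
    out = [sum(al * z[j][k] for al, j in zip(alphas, neighbors)) for k in range(dim)]
    return alphas, out


# Toy graph: node 0 attends to itself and to nodes 1 and 2.
feats = {0: [1.0, 0.0], 1: [0.0, 1.0], 2: [1.0, 1.0]}
identity = [[1.0, 0.0], [0.0, 1.0]]
alphas, out = gat_aggregate(0, [0, 1, 2], feats, identity, [0.0, 0.0, 0.0, 0.0])
```

With an all-zero attention vector every neighbor gets the same weight, so the result reduces to a plain mean of the neighbor features; a learned `a` is what lets GAT weight neighbors unequally, in contrast to the fixed, structure-based weights in GCN.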

geoGAT: Graph Model Based on Attention Mechanism for Geographic Text Classification

Jing, Song, Di et al. 2021

Preprint
