We carry out an in-depth investigation of a newly proposed Maximum F1-score Criterion (MFC) discriminative training objective for Goodness of Pronunciation (GOP) based automatic mispronunciation detection that uses Gaussian mixture model-hidden Markov model (GMM-HMM) acoustic models. The MFC formulation seeks to optimize F1-score directly by converting the non-differentiable F1-score function into a continuous objective function that is amenable to optimization. We present a model-space training algorithm for MFC that uses extended Baum-Welch style update equations derived with the weak-sense auxiliary function method. We then present MFC-based feature-space discriminative training: we train a matrix that projects Gaussian posteriors onto a feature space of conventional dimensionality and add the projected features to traditional spectral features. Mispronunciation detection experiments show that MFC-based model-space and feature-space training are both effective in improving F1-score and other commonly used evaluation metrics. We also show that MFC training in both the feature space and the model space outperforms either model-space or feature-space training alone, and is about 11.6% better than the maximum likelihood (ML) trained baseline in terms of F1-score. Further, we review and compare mispronunciation detection results obtained with MFC and with traditional training criteria that minimize word error rate in speech recognition. The experimental analysis and comparison provide useful insight into the correlation between F1-score maximization and the optimization of these training criteria.
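The abstract states that MFC converts the non-differentiable F1-score into a continuous objective but does not reproduce the formulation. The general idea can be illustrated with a "soft" F1 surrogate in which hard detection decisions are replaced by sigmoid-smoothed probabilities; this is an illustrative sketch, not the authors' derivation:

```python
import numpy as np

def soft_f1(scores, labels):
    """Continuous F1 surrogate: hard 0/1 detection decisions are
    replaced by sigmoid-smoothed probabilities, making the score
    differentiable with respect to the detector's outputs."""
    p = 1.0 / (1.0 + np.exp(-scores))   # smoothed "mispronunciation detected" probability
    tp = np.sum(p * labels)             # soft true positives
    fp = np.sum(p * (1.0 - labels))     # soft false positives
    fn = np.sum((1.0 - p) * labels)     # soft false negatives
    return 2.0 * tp / (2.0 * tp + fp + fn + 1e-12)
```

As the detector's scores grow more confident, the surrogate approaches the ordinary F1-score computed from hard decisions, which is what makes gradient-based optimization of an F1-like quantity possible.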
Road traffic accidents are a concrete manifestation of road traffic safety levels, yet current traffic accident prediction methods suffer from low accuracy. More accurate forecasts would give traffic management departments better data for scientific decision-making. This paper establishes a traffic accident prediction model based on LSTM-GBRT (long short-term memory, gradient boosted regression trees) and predicts traffic accident safety level indicators by training on traffic accident-related data. Compared with various regression models and neural network models, the experimental results show that the LSTM-GBRT model achieves a good fit and strong robustness. The LSTM-GBRT model can accurately predict the safety level of traffic accidents, allowing traffic management departments to better grasp the state of traffic safety.
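The abstract names the LSTM-GBRT combination without detailing how the two models interface. One plausible arrangement is to use the LSTM as a sequence feature extractor and the GBRT as the final regressor. The toy sketch below uses a randomly initialised, untrained LSTM cell and scikit-learn's GradientBoostingRegressor; it is hypothetical and for illustration only:

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

def lstm_features(seq, hidden=8, seed=0):
    """Final hidden state of a single (randomly initialised, untrained)
    LSTM cell run over a 1-D series; stands in for a trained LSTM
    feature extractor in this toy sketch."""
    rng = np.random.default_rng(seed)           # fixed seed: same weights every call
    Wx = rng.standard_normal((4 * hidden, 1)) * 0.1
    Wh = rng.standard_normal((4 * hidden, hidden)) * 0.1
    h = np.zeros(hidden)
    c = np.zeros(hidden)
    sig = lambda z: 1.0 / (1.0 + np.exp(-z))
    for x in seq:
        z = Wx @ np.array([x]) + Wh @ h
        i, f, o, g = np.split(z, 4)             # input/forget/output gates + candidate
        c = sig(f) * c + sig(i) * np.tanh(g)
        h = sig(o) * np.tanh(c)
    return h

# Toy "accident index" series; each 12-step window is summarised by the
# LSTM, and GBRT regresses the value one step ahead.
series = np.sin(np.arange(130) / 6.0)
X = np.array([lstm_features(series[i:i + 12]) for i in range(110)])
y = series[12:122]
gbrt = GradientBoostingRegressor(n_estimators=100, random_state=0).fit(X[:90], y[:90])
pred = gbrt.predict(X[90:])
```

In a real system the LSTM would be trained end-to-end on historical accident data before its hidden states are handed to the GBRT stage; here the untrained cell merely demonstrates the data flow.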
Social media has had a revolutionary impact because it provides an ideal platform for sharing information; however, it also enables the publication and spread of rumors. Existing rumor detection methods rely on cues drawn only from user-generated content, user profiles, or propagation structures, and previous work has ignored the organic combination of propagation structure and text semantics. To this end, we propose KZWANG, a rumor detection framework that provides sufficient domain knowledge to classify rumors accurately and that symmetrically fuses semantic information with a propagation heterogeneous graph. We utilize an attention mechanism to learn a semantic representation of the text and introduce a graph convolutional network (GCN) to capture the global and local relationships among all the source microblogs, reposts, and users. The organic combination of text semantics and the propagation heterogeneous graph is then used to train a rumor detection classifier. Experiments on the Sina Weibo, Twitter15, and Twitter16 rumor detection datasets demonstrate the proposed model's superiority over baseline methods. We also conduct an ablation study to understand the relative contributions of the various components of the proposed method.
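The GCN over the microblog/repost/user graph is not specified further in the abstract. A minimal sketch of the standard GCN propagation rule, H' = ReLU(D^{-1/2}(A + I)D^{-1/2} H W) (Kipf and Welling), which could serve as one such layer, looks like this; it is illustrative, not the paper's implementation:

```python
import numpy as np

def gcn_layer(A, H, W):
    """One GCN propagation step over the microblog/repost/user graph:
    H' = ReLU(D^{-1/2} (A + I) D^{-1/2} H W)."""
    A_hat = A + np.eye(A.shape[0])              # add self-loops
    d = A_hat.sum(axis=1)
    D_inv_sqrt = np.diag(d ** -0.5)             # symmetric normalisation
    return np.maximum(D_inv_sqrt @ A_hat @ D_inv_sqrt @ H @ W, 0.0)

# Tiny graph: source post (node 0) linked to two reposts (1, 2) and a user (3).
A = np.array([[0, 1, 1, 1],
              [1, 0, 0, 0],
              [1, 0, 0, 0],
              [1, 0, 0, 0]], dtype=float)
H = np.eye(4)                                   # one-hot node features
W = np.full((4, 2), 0.5)                        # toy weight matrix
out = gcn_layer(A, H, W)
```

Stacking such layers lets each node aggregate information from progressively larger neighbourhoods, which is how both local and global propagation relationships can be captured.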
Named entity recognition (NER) is an important task in natural language processing that requires determining entity boundaries and classifying entities into pre-defined categories. For low-resource languages, most state-of-the-art systems require tens of thousands of annotated sentences to obtain high performance, but minimal annotated data is available for Uyghur and Hungarian (UH languages) NER tasks. Each task also has its specificities: differences in words and word order across languages make NER a challenging problem. In this paper, we present an effective solution for providing a meaningful and easy-to-use feature extractor for named entity recognition tasks: fine-tuning a pre-trained language model. We propose a fine-tuning method for low-resource languages that constructs a fine-tuning dataset through data augmentation, adds the dataset of a high-resource language, and finally fine-tunes the cross-lingual pre-trained model on the combined dataset. In addition, we propose an attention-based fine-tuning strategy that uses symmetry to better select relevant semantic and syntactic information from the pre-trained language model and applies these features to named entity recognition. We evaluated our approach on Uyghur and Hungarian datasets, where it showed strong performance compared to several strong baselines. We close with an overview of the available resources for named entity recognition and some open research questions.
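The attention-based strategy for selecting semantic and syntactic information from a pre-trained model is described only at a high level. One common realisation of this idea is a learned scalar mix over the model's layers (lower layers tend to carry more syntactic information, upper layers more semantic); the sketch below is an assumption about the mechanism, not the paper's exact method:

```python
import numpy as np

def layer_attention(layer_reps, scores):
    """Scalar mix over a pre-trained model's layers: softmax the learned
    per-layer scores and form a weighted sum of the layer outputs, so the
    NER head can emphasise syntactic (lower) or semantic (upper) layers."""
    e = np.exp(scores - scores.max())
    w = e / e.sum()                              # attention weight per layer
    return np.tensordot(w, layer_reps, axes=1)   # -> (seq_len, dim)

# Three "layers" of a 5-token, 4-dimensional sentence representation.
rng = np.random.default_rng(0)
layer_reps = rng.standard_normal((3, 5, 4))
mixed = layer_attention(layer_reps, np.array([0.2, 1.5, -0.7]))
```

During fine-tuning the per-layer scores would be trained jointly with the NER head, letting the model learn which layers matter most for the task.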
In recent years, increasing attention has been paid to text sentiment analysis, which has gradually become a research hotspot in information extraction, data mining, natural language processing (NLP), and other fields. With the spread of the Internet, sentiment analysis of Uyghur texts has great research and application value for monitoring online public opinion. For low-resource languages, most state-of-the-art systems require tens of thousands of annotated sentences to achieve high performance, but minimal annotated data is available for Uyghur sentiment analysis tasks. Each task also has its specificities: differences in words and word order across languages make sentiment analysis a challenging problem. In this paper, we present an effective solution for providing a meaningful and easy-to-use feature extractor for sentiment analysis tasks: a pre-trained language model with a BiLSTM layer. First, data augmentation is carried out with AEDA (An Easier Data Augmentation), and the augmented dataset is constructed to improve the performance of text classification. Next, the pre-trained model LaBSE encodes the input data, and a BiLSTM layer learns additional context information. Finally, the validity of the model is verified on a two-class sentiment analysis dataset and a five-class emotion analysis dataset, where our approach showed strong performance compared to several strong baselines. We thus propose a combined deep learning and cross-lingual pre-training model for two low-resource tasks, and we close with an overview of the available resources for sentiment analysis and some open research questions.
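AEDA itself is a deliberately simple procedure: it inserts random punctuation marks at random positions while leaving the original words and their order untouched. A minimal sketch (parameter names and the insertion ratio are illustrative):

```python
import random

PUNCT = [".", ";", "?", ":", "!", ","]

def aeda(sentence, ratio=0.3, seed=None):
    """AEDA augmentation: insert random punctuation marks at random
    positions, leaving the original words and their order untouched."""
    rnd = random.Random(seed)
    words = sentence.split()
    n_insert = max(1, int(ratio * len(words)))
    for _ in range(n_insert):
        pos = rnd.randint(0, len(words))        # any slot, ends included
        words.insert(pos, rnd.choice(PUNCT))
    return " ".join(words)

augmented = aeda("this film was surprisingly good", seed=0)
```

Because no word is deleted or replaced, the label of the augmented sentence is guaranteed to match the original, which makes the method attractive when annotated data is scarce.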