Association rules of the positive polarity and the negative polarity are created by the decision tree. The sentiment of an English document is then classified based on the association rules of the positive and negative polarities. Our English testing data set comprises 25,000 English documents: 12,500 positive reviews and 12,500 negative reviews. We tested our new model on this testing data set and achieved 60.3% accuracy in sentiment classification.
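As a rough illustration of classifying a document against mined association rules of each polarity, the sketch below treats each rule as a word set whose joint presence votes for that polarity. The rule sets and words are invented for illustration; the paper's actual rules come from its decision tree.

```python
# Hypothetical association rules mined per polarity: each rule is a set of
# words whose joint presence in a document implies that polarity.
POSITIVE_RULES = [{"good", "plot"}, {"great", "acting"}]
NEGATIVE_RULES = [{"bad", "plot"}, {"boring"}]

def classify(words):
    """Vote by counting which polarity's rules the document satisfies."""
    words = set(words)
    pos_hits = sum(rule <= words for rule in POSITIVE_RULES)  # rule is subset?
    neg_hits = sum(rule <= words for rule in NEGATIVE_RULES)
    if pos_hits == neg_hits:
        return "neutral"
    return "positive" if pos_hits > neg_hits else "negative"
```

A document matching more positive rules than negative ones is labeled positive, and vice versa; ties fall back to neutral.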
Abstract: Sentiment classification has many crucial applications in everyday life, such as political activities, commodity production, and commercial activities, and many significant approaches to it have been surveyed over the years. We propose a novel model using Latent Semantic Analysis (LSA) and the Dennis Coefficient (DNC) for big-data sentiment classification in English. Many LSA vectors (LSAVs) are transformed using the DNC. Using the DNC and the LSAVs, we classify the 11,000,000 English documents of our testing data set against the 5,000,000 English documents of our training data set. The model uses the sentiment lexicons of our basis English sentiment dictionary (bESD). We tested the proposed model in both a sequential environment and a distributed network system; the results of the sequential system are not as good as those of the parallel environment. We achieved 88.76% accuracy on the testing data set, which is better than the accuracies of many previous semantic-analysis models. We also compared the novel model with previous models, and the experimental results of our proposed model are better than those of the previous models. The results of the novel model can be widely used in many commercial applications and surveys of sentiment classification.
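The LSA step described above can be sketched with a toy term-document matrix: truncated SVD projects documents into a low-dimensional latent space, and a new document is folded into that space and compared against labeled training documents. The matrix, terms, and the use of cosine similarity here are illustrative assumptions; the paper's model uses the Dennis Coefficient as its similarity measure.

```python
import numpy as np

# Toy term-document matrix: rows = terms, columns = documents.
# Columns 0-1 are positive training documents, columns 2-3 negative ones.
X = np.array([
    [2., 1., 0., 0.],   # "good"
    [1., 2., 0., 1.],   # "great"
    [0., 0., 2., 1.],   # "bad"
    [0., 1., 1., 2.],   # "poor"
])

# LSA: truncated SVD keeps only the k strongest latent dimensions.
U, s, Vt = np.linalg.svd(X, full_matrices=False)
k = 2
Uk = U[:, :k]
doc_vecs = Uk.T @ X          # one k-dim LSA vector per training document

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# Fold a new document's term counts into the same latent space, then
# compare it against the positive and negative training documents.
q = np.array([1., 1., 0., 0.])       # new document containing "good", "great"
q_lsa = Uk.T @ q
pos_sim = max(cosine(q_lsa, doc_vecs[:, i]) for i in (0, 1))
neg_sim = max(cosine(q_lsa, doc_vecs[:, i]) for i in (2, 3))
label = "positive" if pos_sim > neg_sim else "negative"
```

The new document shares its vocabulary with the positive training columns, so its latent vector lands nearer to them and it is labeled positive.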
Sentiment classification is used in many different fields because of its significant contributions to everyday life, such as political activities, commodity production, and commercial activities. In this work, we propose a new model for big-data sentiment classification that combines an unsupervised machine-learning algorithm with the Ruzicka Coefficient (RC). A Self-Organizing Map (SOM) algorithm clusters the documents of the testing data set (TES), comprising 7,500,000 English documents (3,750,000 positive and 3,750,000 negative), into either the positive group or the negative group of our training data set (TRA), which contains 3,000,000 English sentences (1,500,000 positive and 1,500,000 negative). In this study, we do not use vector space modeling (VSM), nor any multi-dimensional vectors based on the VSM and sentiment lexicons, nor any one-dimensional vectors based on the VSM. Instead, we use the sentiment lexicons of our basis English sentiment dictionary (bESD), many one-dimensional vectors based on those lexicons, and a similarity coefficient. We achieved 88.64% accuracy on the TES. The execution time of the proposed model in a distributed network environment (DNE) is less than that in a sequential system (SS). The results of the proposed model can be widely used in many commercial applications and surveys of sentiment classification.
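The lexicon-based similarity step can be sketched as follows: each document is turned into a vector of sentiment-lexicon weights, and the Ruzicka coefficient (sum of element-wise minima over sum of element-wise maxima, i.e. weighted Jaccard) compares it to each polarity group. The lexicon entries and weights below are invented for illustration; the full SOM clustering stage is omitted.

```python
# Toy sentiment lexicon: word -> valence weight (values assumed for illustration).
LEXICON = {"good": 0.9, "great": 1.0, "love": 0.8,
           "bad": 0.9, "poor": 0.7, "hate": 1.0}

def valence_vector(words):
    """One-dimensional vector of lexicon weights for the words present."""
    return [LEXICON[w] if w in words else 0.0 for w in sorted(LEXICON)]

def ruzicka(x, y):
    """Ruzicka similarity: sum of minima over sum of maxima (weighted Jaccard)."""
    num = sum(min(a, b) for a, b in zip(x, y))
    den = sum(max(a, b) for a, b in zip(x, y))
    return num / den if den else 0.0

# Representative vectors for each polarity group.
pos_centroid = valence_vector({"good", "great", "love"})
neg_centroid = valence_vector({"bad", "poor", "hate"})

def classify(words):
    v = valence_vector(set(words))
    pos, neg = ruzicka(v, pos_centroid), ruzicka(v, neg_centroid)
    return "positive" if pos >= neg else "negative"
```

A document is assigned to whichever polarity group its lexicon vector overlaps more strongly with under the Ruzicka measure.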
Training a multi-speaker Text-to-Speech (TTS) model requires multiple speakers’ voices to build an average speech model. However, the average speech synthesis model will be distorted or over-averaged, resulting in low quality, if the new speaker’s voice has too little training data. Existing methods require fine-tuning the model; otherwise, the model achieves low adaptive quality, and for the synthesized voice to reach high adaptive quality, at least thousands of fine-tuning steps are required. To address these issues, in this paper we propose a Vietnamese multi-speaker TTS adaptation technique that synthesizes high-quality speech and adapts effectively to new speakers, with two main improvements: (1) an Extracting Mel-Vector (EMV) architecture with three components, the Encoder, Decoder, and Embedding Features, which enables complete learning of speaker features from Mel-spectrogram input for few-shot training; and (2) a continual-learning technique called “data-distributing” that preserves the new speaker’s characteristics after many training epochs. Our proposed model outperformed the baseline multi-speaker synthesis model, achieving a MOS of 3.8/4.6 and a SIM of 2.6/4 with only one minute of the target speaker’s voice.
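The idea of extracting a fixed-size speaker representation from a Mel-spectrogram can be illustrated with a minimal stand-in: mean-pool over time, then project to an embedding and unit-normalise it so that speaker similarity (as in the SIM metric) reduces to a cosine score. The pooling, the random untrained projection, and the dimensions here are all assumptions for illustration, not the paper's EMV architecture.

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy Mel-spectrogram input: 80 mel bins x 120 frames.
mel = rng.standard_normal((80, 120))

# Random, untrained linear projection standing in for a learned encoder.
W = rng.standard_normal((16, 80)) / np.sqrt(80)

def speaker_embedding(mel):
    pooled = mel.mean(axis=1)        # collapse the time axis
    e = W @ pooled                   # project to a 16-dim embedding
    return e / np.linalg.norm(e)     # unit-normalise for cosine similarity

e1 = speaker_embedding(mel)
e2 = speaker_embedding(mel[:, :60])  # shorter clip from the same "speaker"
sim = float(e1 @ e2)                 # cosine similarity in [-1, 1]
```

With unit-normalised embeddings, the dot product of two clips' embeddings directly yields their similarity score.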
Current adaptation-based speech synthesis techniques follow two main streams: (1) fine-tuning the model on small amounts of adaptive data, and (2) conditioning the entire model on a speaker embedding of the target speaker. However, both of these methods require the adaptive data to appear during training, which makes generating new voices quite expensive. In addition, the traditional TTS model uses a simple loss function to reproduce the acoustic features, but this optimization is based on incorrect distribution assumptions, leading to noisy synthesized audio. To solve these problems, we introduce the Adapt-TTS model, which allows high-quality audio synthesis from a small adaptive sample without further training. Key contributions: (1) the Extracting Mel-Vector (EMV) architecture allows a better representation of speaker characteristics and speech style; (2) an improved zero-shot model with a denoising diffusion model (Mel-spectrogram denoiser) component allows new-voice synthesis without training and with better quality (less noise). The evaluation results demonstrate the model's effectiveness: with only a single utterance (1-3 seconds) of the reference speaker, the synthesis system produced high-quality results with high speaker similarity.
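The Mel-spectrogram denoiser is described as a denoising diffusion component. A generic DDPM-style sketch of the underlying mechanism is shown below on a toy spectrogram: the forward process mixes the clean signal with Gaussian noise under a schedule, and a denoiser that predicts that noise can recover the clean signal in closed form. The schedule values and shapes are illustrative assumptions, and the learned denoiser is replaced by an oracle that knows the true noise.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-in for a Mel-spectrogram: 80 mel bins x 20 frames.
x0 = rng.standard_normal((80, 20))

# Linear beta schedule (generic DDPM-style; values are illustrative).
T = 100
betas = np.linspace(1e-4, 2e-2, T)
alpha_bar = np.cumprod(1.0 - betas)   # cumulative signal-retention factor

def forward_noise(x0, t, eps):
    """q(x_t | x_0): scale the signal down and mix in Gaussian noise."""
    return np.sqrt(alpha_bar[t]) * x0 + np.sqrt(1.0 - alpha_bar[t]) * eps

def denoise_with_oracle(xt, t, eps):
    """If a denoiser predicted eps exactly, x_0 is recovered in closed form."""
    return (xt - np.sqrt(1.0 - alpha_bar[t]) * eps) / np.sqrt(alpha_bar[t])

eps = rng.standard_normal(x0.shape)
xt = forward_noise(x0, 50, eps)       # noised spectrogram at step t = 50
x0_hat = denoise_with_oracle(xt, 50, eps)
```

In a trained system, a neural network replaces the oracle by estimating the noise from the noised spectrogram, and the reverse process is applied step by step rather than in one shot.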