2023
DOI: 10.1109/access.2023.3248796

Performance Improvement on Traditional Chinese Task-Oriented Dialogue Systems With Reinforcement Learning and Regularized Dropout Technique

Abstract: The development of conversational voice assistant applications is in full swing around the world. This paper aims to develop Traditional Chinese multi-domain task-oriented dialogue (TOD) systems. Such systems are typically implemented using a pipeline approach, in which submodules are optimized independently, resulting in inconsistencies among them. Instead, this paper implements end-to-end multi-domain TOD models using pre-trained deep neural networks (DNNs). This allows us to integrate all the submodules into one…

Cited by 5 publications (2 citation statements)
References 36 publications (49 reference statements)
“…Applying Reinforcement Learning in the dialog system is the action decision of learning dialogues [9]. The action in the reinforcement learning corresponds to the next steps in the dialogues.…”
Section: Reinforcement Learning Methods
type: mentioning
confidence: 99%
“…Reinforcement learning (RL) has emerged as a powerful approach for learning state-to-action mappings to achieve goals in various domains [1]- [2], such as conversational voice assistant [3], autonomous vehicles [4] and game [5]- [6]. In the field of communication countermeasure, deep reinforcement learning (DRL) methods have demonstrated their prowess by generating jamming or anti-jamming policies directly from multi-dimensional observation data using large-scale neural networks [7]- [11].…”
Section: Introduction
type: mentioning
confidence: 99%