2018
DOI: 10.48550/arxiv.1812.00899
Preprint

Toward Scalable Neural Dialogue State Tracking Model

Cited by 19 publications (28 citation statements)
References 0 publications
Year Published: 2019–2024
“…The performance of NBT is much better than previous DST methods. Inspired by this seminal work, many neural DST approaches based on long short-term memory (LSTM) networks [34, 40-42, 59] and bidirectional gated recurrent unit (BiGRU) networks [22, 31, 35, 39, 55, 57] have been proposed for further improvements. These methods define DST as either a classification problem or a generation problem.…”
Section: Related Work (mentioning)
confidence: 99%
“…We adopt joint goal accuracy [34] as the evaluation metric. Joint goal accuracy is defined as the ratio of dialogue turns for which the value of each slot is correctly predicted.…”
Section: Evaluation Metric (mentioning)
confidence: 99%
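As a concrete illustration of the metric described in the quoted passage, here is a minimal Python sketch of joint goal accuracy. The turn representation (a dict mapping slot names to values for each turn) and the function name are assumptions made for the example, not taken from the cited papers.

```python
# Minimal sketch of joint goal accuracy. Assumes each dialogue turn is a
# dict mapping slot names to gold / predicted values (illustrative only).
def joint_goal_accuracy(gold_turns, pred_turns):
    """Fraction of turns in which every slot value is predicted correctly."""
    assert len(gold_turns) == len(pred_turns)
    correct = sum(
        1 for gold, pred in zip(gold_turns, pred_turns) if gold == pred
    )
    return correct / len(gold_turns) if gold_turns else 0.0

# Example: two turns, the second turn has one wrong slot value,
# so only 1 of 2 turns is jointly correct.
gold = [{"food": "thai", "area": "centre"}, {"food": "thai", "area": "north"}]
pred = [{"food": "thai", "area": "centre"}, {"food": "thai", "area": "south"}]
print(joint_goal_accuracy(gold, pred))  # 0.5
```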
“…
Model                               MultiWOZ 2.1   MultiWOZ 2.0
MDBT (Ramadan et al., 2018) †       -              15.57%
SpanPtr (Vinyals et al., 2015)      -              30.28%
GLAD (Zhong et al., 2018) †         -              35.57%
GCE (Nouri & Hosseini-Asl, 2018) †  -              36.27%
HJST (Eric et al., 2019)            35.55%         38.40%
DST Reader (single)                 36.40%         39.41%
DST Reader (ensemble)               -              42.12%
TSCP (Lei et al., 2018)             37.12%         39.24%
FJST (Eric et al., 2019)            38.00%         40.20%
HyST (ensemble)                     38.10%         44.24%
SUMBT (Lee et al., 2019) †          -              46.65%
TRADE (Wu et al., 2019)             45.60%         48.60%
Ours                                49.04%         50.52%

…processing in both encoding and decoding, they require much higher latency. TRADE shortens the latency by separating the decoding process among (domain, slot) pairs.…”
Section: Model (mentioning)
confidence: 99%
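The quoted passage notes that TRADE reduces latency by decoding each (domain, slot) pair independently. The sketch below is a hypothetical PyTorch illustration of that idea, not TRADE's actual architecture: the pair list, dimensions, and the simplified non-autoregressive unrolling are all assumptions made for the example.

```python
# Hypothetical sketch: decode every (domain, slot) pair independently so all
# pairs can be processed in one batch instead of one long sequential pass.
import torch
import torch.nn as nn

PAIRS = [("hotel", "area"), ("hotel", "price"), ("train", "day")]

class PerPairDecoder(nn.Module):
    def __init__(self, hidden=64, vocab=1000):
        super().__init__()
        self.pair_emb = nn.Embedding(len(PAIRS), hidden)  # one query vector per pair
        self.gru = nn.GRU(hidden, hidden, batch_first=True)
        self.out = nn.Linear(hidden, vocab)

    def forward(self, dialogue_state, max_len=4):
        # dialogue_state: (1, hidden) summary of the dialogue context.
        # Each (domain, slot) pair becomes one "batch" element of the decoder.
        queries = self.pair_emb.weight.unsqueeze(1)                  # (num_pairs, 1, hidden)
        h0 = dialogue_state.expand(1, len(PAIRS), -1).contiguous()   # shared initial state
        # Simplification: the pair query is just repeated max_len times;
        # a real decoder would feed generated tokens back step by step.
        outputs, _ = self.gru(queries.repeat(1, max_len, 1), h0)
        return self.out(outputs)                                     # (num_pairs, max_len, vocab)

decoder = PerPairDecoder()
logits = decoder(torch.zeros(1, 64))
print(logits.shape)  # torch.Size([3, 4, 1000])
```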
“…To demonstrate the robustness of our model, we use similar hyper-parameter settings for both datasets. Following previous work (Ren et al., 2018; Zhong et al., 2018; Nouri and Hosseini-Asl, 2018), we concatenate the pre-trained GloVe embeddings (Pennington et al., 2014) and the character embeddings (Hashimoto et al., 2017) as the final word embeddings and keep them fixed during training. The number of epochs for alternate learning L, the number of epochs for generator learning N, and the number of samples M drawn for each bag are set to 5, 200, and 2, respectively.…”
Section: Implementation Details (mentioning)
confidence: 99%
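The implementation detail quoted above (concatenating pre-trained GloVe and character embeddings and keeping them fixed) can be illustrated with a short PyTorch sketch. The class name, the dimensions, and the random placeholder weights are illustrative assumptions, not the authors' code.

```python
# Illustrative sketch of concatenating pre-trained word embeddings with
# character-level embeddings and keeping both fixed during training.
import torch
import torch.nn as nn

class FixedConcatEmbedding(nn.Module):
    def __init__(self, glove_weights, char_weights):
        super().__init__()
        # freeze=True keeps both embedding tables fixed during training.
        self.glove = nn.Embedding.from_pretrained(glove_weights, freeze=True)
        self.char = nn.Embedding.from_pretrained(char_weights, freeze=True)

    def forward(self, token_ids):
        # Look up both tables and concatenate along the feature dimension.
        return torch.cat([self.glove(token_ids), self.char(token_ids)], dim=-1)

# Toy example with random stand-in weights: a vocabulary of 100 tokens,
# 300-dim "GloVe" vectors and 100-dim character-derived vectors.
emb = FixedConcatEmbedding(torch.randn(100, 300), torch.randn(100, 100))
print(emb(torch.tensor([[1, 2, 3]])).shape)  # torch.Size([1, 3, 400])
```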