David Thulke scite author profile

David Thulke

17 Publications

8 Citation Statements Received

173 Citation Statements Given

Affiliations

RWTH Aachen University, Inform (Germany), FH Aachen

Publications

Cascaded Span Extraction and Response Generation for Document-Grounded Dialog

Nico, Thulke, Dugast et al. 2021

This paper summarizes our entries to both subtasks of the first DialDoc shared task which focuses on the agent response prediction task in goal-oriented document-grounded dialogs. The task is split into two subtasks: predicting a span in a document that grounds an agent turn and generating an agent response based on a dialog and grounding document. In the first subtask, we restrict the set of valid spans to the ones defined in the dataset, use a biaffine classifier to model spans, and finally use an ensemble of different models. For the second subtask, we use a cascaded model which grounds the response prediction on the predicted span instead of the full document. With these approaches, we obtain significant improvements in both subtasks compared to the baseline.

Integration of Private and Carsharing Vehicles into Intermodal Travel Information Systems

Samsel, Beutel, Thulke et al. 2017

Controllable Factuality in Document-Grounded Dialog Systems Using a Noisy Channel Model

Nico, Thulke, Dugast et al. 2022

Mask More and Mask Later: Efficient Pre-training of Masked Language Models by Disentangling the [MASK] Token

Liao, Thulke, Hewavitharana et al. 2022

Cascaded Span Extraction and Response Generation for Document-Grounded Dialog

Nico, Thulke, Dugast et al. 2021. Preprint

Controllable Factuality in Document-Grounded Dialog Systems Using a Noisy Channel Model

Nico, Thulke, Dugast et al. 2022. Preprint

In this work, we present a model for document-grounded response generation in dialog that is decomposed into two components according to Bayes' theorem. One component is a traditional ungrounded response generation model and the other component models the reconstruction of the grounding document based on the dialog context and generated response. We propose different approximate decoding schemes and evaluate our approach on multiple open-domain and task-oriented document-grounded dialog datasets. Our experiments show that the model is more factual in terms of automatic factuality metrics than the baseline model. Furthermore, we outline how introducing scaling factors between the components allows for controlling the tradeoff between factuality and fluency in the model output. Finally, we compare our approach to a recently proposed method to control factuality in grounded dialog, CTRL (Rashkin et al., 2021), and show that both approaches can be combined to achieve additional improvements.
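
The decomposition described in this abstract can be written out as a short worked equation. This is only a sketch: the symbols c (dialog context), d (grounding document), r (response) and the scaling factors \lambda_1, \lambda_2 are notation assumed here for illustration, and the log-linear form of the scaled score is one common way to introduce such factors, not necessarily the exact formulation used in the paper.

```latex
% Assumed notation: c = dialog context, d = grounding document, r = response.
% Bayes' theorem splits the grounded response model into two components:
\begin{align}
  p(r \mid c, d)
    &= \frac{p(d \mid c, r)\, p(r \mid c)}{p(d \mid c)}
     \;\propto\;
     \underbrace{p(d \mid c, r)}_{\text{document reconstruction}}
     \cdot
     \underbrace{p(r \mid c)}_{\text{ungrounded response model}}
\end{align}
% Hypothetical scaled decision rule; \lambda_1 and \lambda_2 are assumed
% scaling factors that weight the two components against each other:
\begin{align}
  \hat{r} = \operatorname*{arg\,max}_{r}
    \bigl[ \lambda_1 \log p(d \mid c, r) + \lambda_2 \log p(r \mid c) \bigr]
\end{align}
```

Under this reading, a larger weight on the reconstruction term favors responses from which the document can be recovered, and a larger weight on the response model favors fluent responses, which matches the tradeoff between factuality and fluency that the abstract mentions.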

Adapting Document-Grounded Dialog Systems to Spoken Conversations using Data Augmentation and a Noisy Channel Model

Thulke, Nico, Dugast et al. 2021. Preprint

This paper summarizes our submission to Task 2 of the second track of the 10th Dialog System Technology Challenge (DSTC10) "Knowledge-grounded Task-oriented Dialogue Modeling on Spoken Conversations". Similar to the previous year's iteration, the task consists of three subtasks: detecting whether a turn is knowledge seeking, selecting the relevant knowledge document and finally generating a grounded response. This year, the focus lies on adapting the system to noisy ASR transcripts. We explore different approaches to make the models more robust to this type of input and to adapt the generated responses to the style of spoken conversations. For the latter, we get the best results with a noisy channel model that additionally reduces the number of short and generic responses. Our best system achieved the 1st rank in the automatic and the 3rd rank in the human evaluation of the challenge.

Task-Oriented Document-Grounded Dialog Systems by HLTPR@RWTH for DSTC9 and DSTC10

Thulke, Nico, Dugast et al. 2024. IEEE/ACM Trans. Audio Speech Lang. Process.

This paper summarizes our contributions to the document-grounded dialog tasks at the 9th and 10th Dialog System Technology Challenges (DSTC9 and DSTC10). In both iterations the task consists of three subtasks: first detect whether the current turn is knowledge seeking, second select a relevant knowledge document, and third generate a response grounded on the selected document. For DSTC9 we proposed different approaches to make the selection task more efficient. The best method, Hierarchical Selection, actually improves the results compared to the original baseline and gives a speedup of 24x. In the DSTC10 iteration of the task, the challenge was to adapt systems trained on written dialogs to perform well on noisy automatic speech recognition transcripts. Therefore, we proposed data augmentation techniques to increase the robustness of the models as well as methods to adapt the style of generated responses to fit well into the preceding dialog. Additionally, we proposed a noisy channel model that allows for increasing the factuality of the generated responses. In addition to summarizing our previous contributions, in this work, we also report on a few small improvements and reconsider the automatic evaluation metrics for the generation task which have shown a low correlation to human judgments.
