Knowledge selection plays an important role in knowledge-grounded dialogue, a challenging task that aims to generate more informative responses by leveraging external knowledge. Recently, latent variable models have been proposed to deal with the diversity of knowledge selection by using both prior and posterior distributions over knowledge, and they achieve promising performance. However, these models suffer from a large gap between prior and posterior knowledge selection. Firstly, the prior selection module may not learn to select knowledge properly because it lacks the necessary posterior information. Secondly, latent variable models suffer from exposure bias: dialogue generation is conditioned on knowledge selected from the posterior distribution during training but from the prior distribution at inference. Here, we address these issues in two respects: (1) we enhance the prior selection module with the necessary posterior information obtained from a specially designed Posterior Information Prediction Module (PIPM); (2) we propose a Knowledge Distillation Based Training Strategy (KDBTS) to train the decoder with knowledge selected from the prior distribution, removing the exposure bias of knowledge selection. Experimental results on two knowledge-grounded dialogue datasets show that both PIPM and KDBTS improve performance over the state-of-the-art latent variable model, and their combination yields further improvement.
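To make the two ideas concrete, here is a minimal PyTorch-style sketch of prior/posterior knowledge selection with a posterior-information predictor and a distillation loss. All module names (the PIPM layer, the teacher/student decoder logits), dimensions, and loss weights are assumptions for illustration, not the authors' released implementation.

```python
# A minimal sketch, assuming simplified vector representations of context,
# response, and knowledge candidates; not the paper's exact architecture.
import torch
import torch.nn as nn
import torch.nn.functional as F

class KnowledgeSelector(nn.Module):
    def __init__(self, dim):
        super().__init__()
        self.pipm = nn.Linear(dim, dim)            # hypothetical PIPM: predicts posterior info from context
        self.prior_proj = nn.Linear(2 * dim, dim)  # prior query uses context + predicted posterior info
        self.post_proj = nn.Linear(2 * dim, dim)   # posterior query uses context + gold response

    def forward(self, context, response, knowledge):
        # context, response: (B, dim); knowledge: (B, K, dim)
        post_info = self.pipm(context)
        prior_q = self.prior_proj(torch.cat([context, post_info], dim=-1))
        post_q = self.post_proj(torch.cat([context, response], dim=-1))
        prior_logits = torch.einsum('bd,bkd->bk', prior_q, knowledge)
        post_logits = torch.einsum('bd,bkd->bk', post_q, knowledge)
        return prior_logits, post_logits, post_info

def selection_losses(prior_logits, post_logits, post_info, response):
    # KL pulls the prior selection toward the posterior selection; an auxiliary
    # MSE supervises the PIPM with the true response representation.
    kl = F.kl_div(F.log_softmax(prior_logits, dim=-1),
                  F.softmax(post_logits, dim=-1), reduction='batchmean')
    aux = F.mse_loss(post_info, response.detach())
    return kl, aux

def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    # KDBTS idea: a student decoder conditioned on prior-selected knowledge
    # matches a teacher decoder conditioned on posterior-selected knowledge,
    # so inference and training see the same (prior) knowledge source.
    t = temperature
    return F.kl_div(F.log_softmax(student_logits / t, dim=-1),
                    F.softmax(teacher_logits / t, dim=-1),
                    reduction='batchmean') * t * t
```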
Aims: We aimed to assess the association between gut bacterial biomarkers during early pregnancy and subsequent risk of gestational diabetes mellitus (GDM) in Chinese pregnant women. Methods: Within the Tongji-Shuangliu Birth Cohort study, we conducted a nested case-control study among 201 incident GDM cases and 201 matched controls. Fecal samples were collected during early pregnancy (at 6-15 weeks), and GDM was diagnosed at 24-28 weeks of pregnancy. Community DNA was isolated from fecal samples, and the V3-V4 region of the 16S rRNA gene was amplified and sequenced. Results: In GDM cases versus controls, Rothia, Actinomyces, Bifidobacterium, Adlercreutzia, Coriobacteriaceae, and Lachnospiraceae spp. were significantly reduced, while Enterobacteriaceae, Ruminococcaceae spp., and Veillonellaceae were over-represented. In addition, the abundance of Staphylococcus relative to Clostridium, Roseburia, and Coriobacteriaceae as reference microorganisms was positively correlated with fasting blood glucose and 1-h and 2-h postprandial glucose levels. Adding microbial taxa to the base GDM prediction model with conventional risk factors increased the C-statistic significantly (P<0.001) from 0.69 to 0.75. Conclusions: Gut microbiota during early pregnancy was associated with subsequent risk of GDM. Several beneficial and commensal gut microorganisms showed inverse relations with incident GDM, while opportunistic pathogenic members were related to a higher risk of incident GDM and positively correlated with glucose levels on the OGTT.
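The reported C-statistic comparison (0.69 base model vs. 0.75 with microbial taxa) can be illustrated with a generic cross-validated logistic-regression workflow. This is not the study's analysis code; the file name, column names, and taxa chosen as features are hypothetical placeholders.

```python
# An illustrative sketch, assuming a flat CSV with one row per participant.
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_predict
from sklearn.metrics import roc_auc_score

df = pd.read_csv("gdm_cohort.csv")                        # hypothetical data file
y = df["gdm"]                                             # 1 = incident GDM, 0 = matched control
base_cols = ["age", "pre_pregnancy_bmi", "family_history_dm"]    # example conventional risk factors
taxa_cols = ["Rothia", "Bifidobacterium", "Enterobacteriaceae"]  # example relative abundances

def cstat(cols):
    # Out-of-fold predicted probabilities give a less optimistic C-statistic (AUC).
    model = LogisticRegression(max_iter=1000)
    probs = cross_val_predict(model, df[cols], y, cv=5, method="predict_proba")[:, 1]
    return roc_auc_score(y, probs)

print("base model C-statistic:  ", round(cstat(base_cols), 2))
print("base + taxa C-statistic: ", round(cstat(base_cols + taxa_cols), 2))
```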
Recently, to incorporate external Knowledge Base (KB) information, one form of world knowledge, several end-to-end task-oriented dialog systems have been proposed. These models, however, tend to confound the dialog history with KB tuples and simply store them in one memory. Inspired by psychological studies on working memory, we propose a working memory model (WMM2Seq) for dialog response generation. Our WMM2Seq adopts a working memory that interacts with two separate long-term memories: an episodic memory for memorizing dialog history and a semantic memory for storing KB tuples. The working memory consists of a central executive that attends to the aforementioned memories and a short-term storage system that stores the "activated" contents from the long-term memories. Furthermore, we introduce a context-sensitive perceptual process for the token representations of the dialog history and then feed them into the episodic memory. Extensive experiments on two task-oriented dialog datasets demonstrate that our WMM2Seq significantly outperforms state-of-the-art results on several evaluation metrics.
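A rough sketch of the working-memory idea is given below: a central executive attends over an episodic memory (dialog history) and a semantic memory (KB tuples), and a gated fusion plays the role of the short-term store. Shapes, the choice of multi-head attention, and the gating are assumptions for illustration only.

```python
# A minimal sketch, assuming pre-encoded memories and a single query step.
import torch
import torch.nn as nn

class WorkingMemory(nn.Module):
    def __init__(self, dim):
        super().__init__()
        self.episodic_attn = nn.MultiheadAttention(dim, num_heads=4, batch_first=True)
        self.semantic_attn = nn.MultiheadAttention(dim, num_heads=4, batch_first=True)
        self.gate = nn.Linear(2 * dim, dim)

    def forward(self, query, episodic_mem, semantic_mem):
        # query: (B, 1, dim) decoder state acting as the central executive's probe
        # episodic_mem: (B, T, dim) encoded dialog history tokens
        # semantic_mem: (B, N, dim) encoded KB tuples
        epi, _ = self.episodic_attn(query, episodic_mem, episodic_mem)
        sem, _ = self.semantic_attn(query, semantic_mem, semantic_mem)
        # short-term store: gated fusion of the "activated" contents
        short_term = torch.tanh(self.gate(torch.cat([epi, sem], dim=-1)))
        return short_term  # fed to the decoder at each generation step
```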
In the past few years, the emergence of pre-training models has brought uni-modal fields such as computer vision (CV) and natural language processing (NLP) into a new era. Substantial work has shown that pre-trained models benefit downstream uni-modal tasks and avoid training a new model from scratch. So can such pre-trained models be applied to multi-modal tasks? Researchers have explored this problem and made significant progress. This paper surveys recent advances and new frontiers in vision-language pre-training (VLP), including image-text and video-text pre-training. To give readers a better overall grasp of VLP, we first review its recent advances in five aspects: feature extraction, model architecture, pre-training objectives, pre-training datasets, and downstream tasks. Then, we summarize the specific VLP models in detail. Finally, we discuss the new frontiers in VLP. To the best of our knowledge, this is the first survey focused on VLP. We hope this survey can shed light on future research in the VLP field.
Visual dialog, which aims to hold a meaningful conversation with humans about a given image, is a challenging task that requires models to reason about the complex dependencies among visual content, dialog history, and the current question. Graph neural networks have recently been applied to model the implicit relations between objects in an image or dialog. However, they neglect the importance of 1) coreference relations among the dialog history and dependency relations between words for the question representation; and 2) the representation of the image based on the fully represented question. Therefore, we propose a novel relation-aware graph-over-graph network (GoG) for visual dialog. Specifically, GoG consists of three sequential graphs: 1) an H-Graph, which aims to capture coreference relations among the dialog history; 2) a history-aware Q-Graph, which aims to fully understand the question by capturing dependency relations between words based on coreference resolution over the dialog history; and 3) a question-aware I-Graph, which aims to capture the relations between objects in an image based on the fully represented question. As an additional feature representation module, we add GoG to an existing visual dialog model. Experimental results show that our model outperforms the strong baseline in both generative and discriminative settings by a significant margin.
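The chaining of the three graphs can be sketched as below, where each stage conditions on a summary of the previous one. The GraphAttention block is a generic GAT-style layer used here only to show the graph-over-graph flow; it is not the paper's exact relation-aware layer, and all shapes are assumptions.

```python
# A schematic sketch, assuming pre-extracted node features and adjacency masks
# that include self-loops (so every row has at least one edge).
import torch
import torch.nn as nn

class GraphAttention(nn.Module):
    def __init__(self, dim):
        super().__init__()
        self.q, self.k, self.v = nn.Linear(dim, dim), nn.Linear(dim, dim), nn.Linear(dim, dim)

    def forward(self, nodes, adj, cond=None):
        # nodes: (B, N, dim); adj: (B, N, N) relation mask; cond: optional (B, 1, dim) conditioning vector
        if cond is not None:
            nodes = nodes + cond  # condition this graph on the output of the previous one
        scores = self.q(nodes) @ self.k(nodes).transpose(-1, -2) / nodes.size(-1) ** 0.5
        scores = scores.masked_fill(adj == 0, float('-inf'))
        return torch.softmax(scores, dim=-1) @ self.v(nodes)

class GoGSketch(nn.Module):
    def __init__(self, dim):
        super().__init__()
        self.h_graph = GraphAttention(dim)  # coreference relations over dialog history
        self.q_graph = GraphAttention(dim)  # dependency relations over question words
        self.i_graph = GraphAttention(dim)  # relations over detected image objects

    def forward(self, hist, hist_adj, ques, ques_adj, objs, obj_adj):
        h = self.h_graph(hist, hist_adj)
        q = self.q_graph(ques, ques_adj, cond=h.mean(1, keepdim=True))
        v = self.i_graph(objs, obj_adj, cond=q.mean(1, keepdim=True))
        return v  # extra feature representation added to an existing visual dialog model
```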
Emotion recognition in textual conversations (ERTC) plays an important role in a wide range of applications, such as opinion mining and recommender systems. ERTC, however, is a challenging task. For one thing, speakers often rely on context and commonsense knowledge to express emotions; for another, most utterances in conversations carry a neutral emotion, and as a result, the confusion between the few non-neutral utterances and the far more numerous neutral ones restrains emotion recognition performance. In this paper, we propose a novel Knowledge Aware Incremental Transformer with Multi-task Learning (KAITML) to address these challenges. Firstly, we devise a dual-level graph attention mechanism to leverage commonsense knowledge, which augments the semantic information of the utterance. Then we apply the Incremental Transformer to encode multi-turn contextual utterances. Moreover, we are the first to introduce multi-task learning to alleviate the aforementioned confusion and thus further improve emotion recognition performance. Extensive experimental results show that our KAITML model outperforms state-of-the-art models across five benchmark datasets.
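One plausible reading of the multi-task component is a main fine-grained emotion classifier plus an auxiliary neutral vs. non-neutral classifier sharing one utterance encoder. The sketch below illustrates that reading; the head design, the loss weight, and the assumption that label 0 denotes "neutral" are all illustrative choices, not the paper's specification.

```python
# A hedged sketch, assuming a shared utterance representation of size `dim`.
import torch
import torch.nn as nn
import torch.nn.functional as F

class MultiTaskHeads(nn.Module):
    def __init__(self, dim, num_emotions):
        super().__init__()
        self.emotion_head = nn.Linear(dim, num_emotions)  # fine-grained emotion labels
        self.neutral_head = nn.Linear(dim, 2)              # auxiliary: neutral vs. non-neutral

    def forward(self, utterance_repr):
        return self.emotion_head(utterance_repr), self.neutral_head(utterance_repr)

def multitask_loss(emotion_logits, neutral_logits, emotion_labels, alpha=0.5):
    # The auxiliary binary task counteracts the dominance of neutral utterances.
    neutral_labels = (emotion_labels != 0).long()  # assumes label 0 is "neutral"
    main = F.cross_entropy(emotion_logits, emotion_labels)
    aux = F.cross_entropy(neutral_logits, neutral_labels)
    return main + alpha * aux
```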
Autism spectrum disorder (ASD) cases have increased rapidly in recent decades, and the disorder is associated with various genetic abnormalities. To provide a better understanding of the genetic factors in ASD, we assessed the global scientific output of the related studies. A total of 2944 studies published between 1997 and 2018 were systematically retrieved from the Web of Science (WoS) database; their scientific landscape was mapped, and the tendencies and research frontiers were explored through bibliometric methods. The United States has been a leading explorer of the field worldwide in recent years. The rapid development of high-throughput technologies and bioinformatics has shifted the research methodology from traditional classical approaches to big data-based pipelines. As a consequence, the focus areas and tendencies of the research have also changed: the contribution of de novo mutations to ASD has been a research hotspot in the past several years and will probably remain one in the near future, which is consistent with current opinions on the major etiology of ASD. Therefore, more attention and financial support should be devoted to deciphering de novo mutations in ASD. Meanwhile, effective cooperation among research centers and scientists from different fields should be advocated in the next steps of this research.