Visual question generation aims to automatically ask questions about an image. Existing work on this topic usually generates a single question for each given image without considering diversity. In this paper, we propose a question-type-driven framework that produces multiple questions with different focuses for a given image. In our framework, each question is constructed in a sequence-to-sequence fashion under the guidance of a sampled question type. To diversify the generated questions, a novel conditional variational auto-encoder is introduced to generate multiple questions of a specific question type. Moreover, we design a strategy that learns the question type distribution of each image to select the final questions. Experimental results on three benchmark datasets show that our framework outperforms state-of-the-art approaches in terms of both relevance and diversity.
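To make the mechanism concrete, below is a minimal PyTorch sketch of a conditional variational auto-encoder whose latent code is conditioned on an image feature and a sampled question type, so that decoding different latent samples yields diverse questions of that type. All module names, dimensions, the mean-pooled question summary, and the GRU decoder are illustrative assumptions, not the paper's actual architecture.

    import torch
    import torch.nn as nn

    class TypeConditionedCVAE(nn.Module):
        def __init__(self, img_dim=2048, n_types=8, type_dim=64,
                     latent_dim=128, vocab_size=10000, emb_dim=256, hid_dim=512):
            super().__init__()
            self.type_emb = nn.Embedding(n_types, type_dim)
            self.word_emb = nn.Embedding(vocab_size, emb_dim)
            cond_dim = img_dim + type_dim
            # recognition network q(z | question, image, type), used during training
            self.recog = nn.Linear(cond_dim + emb_dim, 2 * latent_dim)
            self.init_h = nn.Linear(cond_dim + latent_dim, hid_dim)
            self.decoder = nn.GRU(emb_dim, hid_dim, batch_first=True)
            self.out = nn.Linear(hid_dim, vocab_size)

        def forward(self, img_feat, q_type, tokens):
            # img_feat: (B, img_dim); q_type: (B,) long; tokens: (B, T) long
            cond = torch.cat([img_feat, self.type_emb(q_type)], dim=-1)
            q_summary = self.word_emb(tokens).mean(dim=1)  # crude gold-question summary
            mu, logvar = self.recog(torch.cat([cond, q_summary], dim=-1)).chunk(2, dim=-1)
            z = mu + torch.randn_like(mu) * (0.5 * logvar).exp()  # reparameterization trick
            h0 = torch.tanh(self.init_h(torch.cat([cond, z], dim=-1))).unsqueeze(0)
            dec_out, _ = self.decoder(self.word_emb(tokens), h0)
            return self.out(dec_out), mu, logvar  # train with cross-entropy + KL terms

At inference time, z would be drawn from the standard normal prior for each sampled question type; decoding several z's per type yields multiple distinct questions.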
Targeted delivery is a promising technique for cancer therapy. A molecule, FA-EEYSV-NH2, consisting of the target-recognition site folic acid (FA), a dipeptide linker, and a peptide drug, was designed as a novel anticancer prodrug. The molecules self-assemble into nanoparticles at pH 7.0 and nanofibers at pH 5.0. With the aid of this pH responsiveness, the self-assemblies were purposefully used as targeted vehicles for the self-delivering prodrug. Cytotoxicity and internalization assays showed that the self-assemblies have good selectivity for cancer cells. This selectivity was mainly attributed to the pH-responsive structural transition of the self-assemblies and the active-targeting effect of FA. We hope that our work provides a useful strategy for finely tuning the properties and activities of peptide-based supramolecular nanomaterials, thereby optimizing nanomedicines with enhanced performance.
Commonsense generation aims at generating plausible descriptions of everyday scenarios by reasoning about concept combinations. Mining the relationships among concepts from scratch does not suffice to build a reasonable scene; we therefore argue that editing prototypes retrieved from external knowledge corpora helps to prioritize different concept combinations and to complete the scenario by introducing additional concepts. We propose to retrieve prototypes from two kinds of corpora, serving as out-of-domain and in-domain external knowledge, respectively. To better model the prototypes, we design two attention mechanisms to enhance the knowledge-injection procedure. We conduct experiments on the CommonGen benchmark; the results show that our method significantly improves performance on all metrics.
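As a rough illustration of what knowledge injection via attention can look like, the sketch below lets a decoder state attend over the encoded tokens of a retrieved prototype and fuses the attended context back into the state. The single-head dot-product formulation and all names are assumptions for illustration; the paper describes two such mechanisms, and this shows the shape of one.

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class PrototypeAttention(nn.Module):
        def __init__(self, hid_dim=512):
            super().__init__()
            self.q_proj = nn.Linear(hid_dim, hid_dim)
            self.fuse = nn.Linear(2 * hid_dim, hid_dim)

        def forward(self, dec_state, proto_enc, proto_mask):
            # dec_state: (B, H); proto_enc: (B, L, H); proto_mask: (B, L) bool
            q = self.q_proj(dec_state).unsqueeze(1)                    # (B, 1, H)
            scores = (q * proto_enc).sum(-1) / proto_enc.size(-1) ** 0.5
            scores = scores.masked_fill(~proto_mask, float('-inf'))   # ignore padding
            attn = F.softmax(scores, dim=-1)                          # (B, L)
            ctx = torch.bmm(attn.unsqueeze(1), proto_enc).squeeze(1)  # (B, H)
            return torch.tanh(self.fuse(torch.cat([dec_state, ctx], dim=-1)))

One module instance per corpus (out-of-domain and in-domain) would let the model weight the two knowledge sources independently.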
Question generation aims to automatically produce questions given a piece of text as input. Existing research follows a sequence-to-sequence fashion that constructs a single question from the input. Since each question usually focuses on a specific fragment of the input, especially in the reading comprehension scenario, it is reasonable to identify the corresponding focus before constructing the question. In this paper, we propose to first identify question-worthy phrases and then generate questions with the assistance of these phrases. We introduce a multi-agent communication framework that treats phrase extraction and question generation as two agents and learns the two tasks simultaneously via a message-passing mechanism. Experimental results show the effectiveness of our framework: we can extract question-worthy phrases that improve the performance of question generation. Moreover, our system can extract multiple question-worthy phrases and generate multiple questions accordingly.
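A minimal sketch of one way the two agents could be coupled: a tagging head (the extraction agent) scores each token as question-worthy, and its soft labels gate the shared encoding consumed by the generation agent, so gradients from question generation also flow back into extraction. The gating formulation and all names are illustrative assumptions, not the framework's actual message-passing protocol.

    import torch
    import torch.nn as nn

    class PhraseAwareEncoder(nn.Module):
        def __init__(self, vocab_size=30000, emb_dim=256, hid_dim=512):
            super().__init__()
            self.emb = nn.Embedding(vocab_size, emb_dim)
            self.encoder = nn.LSTM(emb_dim, hid_dim // 2, batch_first=True,
                                   bidirectional=True)
            self.tagger = nn.Linear(hid_dim, 1)  # phrase-extraction "agent"

        def forward(self, tokens):
            enc, _ = self.encoder(self.emb(tokens))        # (B, T, H)
            phrase_prob = torch.sigmoid(self.tagger(enc))  # message to the generator
            return enc * phrase_prob, phrase_prob          # gated states, tag scores

The tagger would be trained with a binary cross-entropy loss against gold phrase spans, while a standard attention-based decoder generates questions from the gated states.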
Existing research on visual captioning usually employs a CNN-RNN architecture that combines a CNN for image encoding with an RNN for caption generation, where the vocabulary constructed from the entire training dataset serves as the decoding space. Such approaches typically suffer from generating n-grams that occur frequently in the training set but are irrelevant to the given image. To tackle this problem, we propose to construct an image-grounded vocabulary that leverages image semantics for more effective caption generation. More concretely, a two-step approach is proposed to construct the vocabulary by incorporating both visual information and relationships among words. Two strategies are then explored to utilize the constructed vocabulary for caption generation: one constrains the generator to select words only from the image-grounded vocabulary, and the other integrates the vocabulary information into the RNN cell during the caption generation process. Experimental results on two public datasets show the effectiveness of our framework compared to state-of-the-art models. Our code is available on GitHub.
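The first strategy admits a particularly simple reading: mask the decoder logits so that words outside the image-grounded vocabulary receive zero probability. The sketch below, with assumed names and shapes, shows that masking step only; it is not the paper's implementation.

    import torch

    def constrain_to_grounded_vocab(logits, grounded_ids):
        # logits: (B, V) decoder scores over the full vocabulary
        # grounded_ids: list of B LongTensors, each holding the word indices
        # of one image's grounded vocabulary
        mask = torch.full_like(logits, float('-inf'))
        for i, ids in enumerate(grounded_ids):
            mask[i, ids] = 0.0  # keep grounded words' logits unchanged
        return logits + mask    # softmax now puts zero mass elsewhere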