Developing Video-Grounded Dialogue Systems (VGDS), where a dialogue is conducted based on visual and audio aspects of a given video, is significantly more challenging than developing traditional image- or text-grounded dialogue systems because (1) the feature space of videos spans multiple picture frames, making it difficult to obtain semantic information; and (2) a dialogue agent must perceive and process information from different modalities (audio, video, caption, etc.) to obtain a comprehensive understanding. Most existing work is based on RNNs and sequence-to-sequence architectures, which are not very effective at capturing complex long-term dependencies (such as those in videos). To overcome this, we propose Multimodal Transformer Networks (MTN) to encode videos and incorporate information from different modalities. We also propose query-aware attention through an auto-encoder to extract query-aware features from non-text modalities. We develop a training procedure that simulates token-level decoding to improve the quality of generated responses during inference. We achieve state-of-the-art performance on Dialogue System Technology Challenge 7 (DSTC7). Our model also generalizes to another multimodal visual-grounded dialogue task and obtains promising performance. We implemented our models using PyTorch, and the code is released at https://github.com/henryhungle/MTN.
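As a rough illustration of the query-aware attention idea in the abstract above, the sketch below lets each query token attend over per-frame video features using plain scaled dot-product attention. It is not the MTN implementation: the function names, the toy 2-dimensional vectors, and the omission of learned projections, multiple heads, and the auto-encoder are all simplifying assumptions made for illustration.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def query_aware_attention(query_vecs, frame_vecs):
    """For each query token vector, return an attention-weighted sum of frame vectors.

    Hypothetical helper: no learned weights, single head, stdlib only.
    """
    dim = len(frame_vecs[0])
    scale = math.sqrt(dim)
    attended = []
    for q in query_vecs:
        # Similarity between this query token and every frame feature.
        scores = [sum(qi * fi for qi, fi in zip(q, f)) / scale for f in frame_vecs]
        weights = softmax(scores)
        # Weighted sum of frame features: the "query-aware" video feature.
        attended.append(
            [sum(w * f[d] for w, f in zip(weights, frame_vecs)) for d in range(dim)]
        )
    return attended


# Toy data (assumed): two query tokens, three frame features.
query = [[1.0, 0.0], [0.0, 1.0]]
frames = [[4.0, 0.0], [0.0, 4.0], [0.0, 0.0]]
attended = query_aware_attention(query, frames)
```

In this toy setup each query token pulls out mostly the frame feature it is most similar to, which is the intuition behind extracting query-aware features from a non-text modality.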
Naive T helper cells differentiate into functionally distinct effector subsets that drive specialized immune responses. Recent studies indicate that some of the effector subsets have plasticity. Here, we used an EAE model and found that Th17 cells deficient in the transcription factor BCL11B upregulated the Th2-associated proteins GATA3 and IL-4 without decreasing RAR-related orphan receptor γ (RORγt), IL-17, and GM-CSF levels. Surprisingly, abnormal IL-4 production affected Th17 cell trafficking, diverting migration from the draining lymph nodes/CNS route to the mesenteric lymph nodes/gut route, which ameliorated EAE without overt colitis. T helper cell rerouting in EAE was dependent on IL-4, which enhanced retinoic acid (RA) production by dendritic cells, which further induced expression of gut-homing receptors CCR9 and α4β7 on Bcl11b-deficient CD4+ T cells. Furthermore, IL-4 treatment or Th2 immunization of wild-type mice with EAE caused no alteration in Th17 cytokines or RORγt, but diverted T helper cell trafficking to the gut, which improved EAE outcome without overt colitis. Our data demonstrate that Th17 cells are permissive to Th2 gene expression without affecting Th17 gene expression. This Th17 plasticity has an impact on trafficking, which is a critical component of the immune response and may represent a possible avenue for treating multiple sclerosis.
We have demonstrated a multistep 2-dimensional paper network immunoassay based on controlled rehydration of patterned, dried reagents. Previous work has shown that signal enhancement improves the limit of detection in 2-dimensional paper network assays, but until now, reagents have only been included wet or dried in separate conjugate pads placed at the upstream end of the assay device. Wet reagents are not ideal for point-of-care use because they must be refrigerated, typically limit automation, and require more user steps. Conjugate pads allow drying but do not offer any control of the reagent distribution upon rehydration and can be a source of error when pads do not contact the assay membrane uniformly. Furthermore, each reagent is dried on a separate pad, increasing the fabrication complexity when implementing multistep assays that require several different reagents. In contrast, our novel method allows for consistent, controlled rehydration from patterned reagent storage depots directly within the paper membrane. In this assay demonstration, four separate reagents were patterned in different regions of the assay device: a gold-antibody conjugate used for antigen detection and three different signal enhancement components that must not be mixed until immediately before use. To show the viability of patterning and drying reagents directly onto a paper device for dry reagent storage and subsequent controlled release, we tested this device with the malaria antigen Plasmodium falciparum histidine-rich protein 2 (PfHRP2) as an example target analyte. In this demonstration, the signal enhancement step increases the visible signal by roughly 3-fold and decreases the analytical limit of detection by 2.75-fold.
Novel methods are demonstrated that enable controlled spatial and temporal rehydration of dried reagents in a porous matrix. These methods can be used in paper-based microfluidic assays to define reagent concentrations over time at zones downstream for improved performance, and can reduce costs by simplifying the manufacturing process with the use of a single porous substrate. First, the creation of uniform reagent pulses from patterned arrays of dried reagent is demonstrated. Second, reagents are stored dry in separate regions of the porous matrix so that they can be combined upon rehydration for immediate use in the device. Third, reagents are reconstituted sequentially from dry storage depots with tunable delivery times. Fourth, the total time for dissolution is varied to achieve a range of reagent delivery times to a downstream region. Finally, the utility of these control methods is demonstrated in the context of real-time reagent rehydration and mixing on a porous device.
Adaptive hypertext transfer protocol (HTTP) streaming has become a new trend to support adaptivity in video delivery. An HTTP streaming client needs to accurately estimate resource availability and resource demand. In this paper, we focus on the most important resource, bandwidth. A new and general formulation for throughput estimation is presented that takes into account previous values of instant throughput and round-trip time. In addition, we introduce for the first time the use of bitrate estimation in HTTP streaming. The experiments show that our approach can effectively cope with drastic changes in connection throughput and video bitrate.
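The abstract above does not state its throughput formulation, so the sketch below only illustrates the general idea of estimating throughput from previous instant-throughput values, using a generic exponentially weighted moving average. This is a swapped-in textbook technique, not the paper's formula: the function name, the fixed `weight` parameter, and the sample values are assumptions, and the round-trip-time term mentioned in the abstract is not modeled.

```python
def smooth_throughput(samples_kbps, weight=0.8):
    """Return the running throughput estimate after each instant sample.

    Generic EWMA (assumed, not the paper's formulation):
        estimate = weight * previous_estimate + (1 - weight) * sample
    A larger weight trusts history more; a smaller one reacts faster.
    """
    if not 0.0 <= weight < 1.0:
        raise ValueError("weight must be in [0, 1)")
    estimates = []
    estimate = None
    for sample in samples_kbps:
        if estimate is None:
            # First sample: nothing to smooth against yet.
            estimate = float(sample)
        else:
            estimate = weight * estimate + (1.0 - weight) * sample
        estimates.append(estimate)
    return estimates


# Toy trace (assumed): throughput drops sharply from 1000 to 200 kbps.
estimates = smooth_throughput([1000, 1000, 200, 200], weight=0.5)
# estimates == [1000.0, 1000.0, 600.0, 400.0]
```

A fixed weight trades stability against responsiveness, which is why a client facing drastic throughput changes may want the weight to adapt rather than stay constant.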
One of the core tasks in multi-view learning is to capture relations among views. For sequential data, the relations not only span across views, but also extend throughout the view length to form long-term intra-view and inter-view interactions. In this paper, we present a new memory augmented neural network model that aims to model these complex interactions between two asynchronous sequential views. Our model uses two encoders for reading from and writing to two external memories for encoding input views. The intra-view interactions and the long-term dependencies are captured by the use of memories during this encoding process. There are two modes of memory accessing in our system: late-fusion and early-fusion, corresponding to late and early inter-view interactions. In the late-fusion mode, the two memories are separated, containing only view-specific contents. In the early-fusion mode, the two memories share the same addressing space, allowing cross-memory accessing. In both cases, the knowledge from the memories will be combined by a decoder to make predictions over the output space. The resulting dual memory neural computer is demonstrated on a comprehensive set of experiments, including a synthetic task of summing two sequences and the tasks of drug prescription and disease progression in healthcare. The results demonstrate competitive performance over both traditional algorithms and deep learning methods designed for multi-view problems.
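The difference between the two memory access modes described above can be sketched with a simple content-based read. This is a heavily simplified illustration, not the dual memory neural computer itself: there are no controllers, writes, or learned parameters, and the names `read`, `late_fusion_read`, and `early_fusion_read` as well as the toy memories are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def read(memory, key):
    """Content-based read: softmax over dot-product similarity, then a weighted sum of slots."""
    weights = softmax([sum(k * s for k, s in zip(key, slot)) for slot in memory])
    dim = len(memory[0])
    return [sum(w * slot[d] for w, slot in zip(weights, memory)) for d in range(dim)]


def late_fusion_read(mem_a, mem_b, key_a, key_b):
    """Late fusion (assumed form): each memory is read separately, results are concatenated."""
    return read(mem_a, key_a) + read(mem_b, key_b)


def early_fusion_read(mem_a, mem_b, key):
    """Early fusion (assumed form): one key addresses both memories as a shared space."""
    return read(mem_a + mem_b, key)


# Toy memories (assumed): one slot per view, two dimensions per slot.
mem_a = [[1.0, 0.0]]
mem_b = [[0.0, 1.0]]
late = late_fusion_read(mem_a, mem_b, [1.0, 0.0], [0.0, 1.0])
early = early_fusion_read(mem_a, mem_b, [5.0, 0.0])
```

In the late-fusion read the two views only meet after reading, when a decoder would combine them, whereas in the early-fusion read a single key can draw content from either view's slots.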