Query-based moment retrieval aims to localize the most relevant moment in an untrimmed video according to a given natural language query. Existing works often focus on only one aspect of this emerging task, such as query representation learning, video context modeling, or multi-modal fusion, and thus fail to develop a comprehensive system for further performance improvement. In this paper, we introduce a novel Cross-Modal Interaction Network (CMIN) that considers multiple crucial factors for this challenging task, including (1) the syntactic structure of natural language queries; (2) long-range semantic dependencies in video context; and (3) sufficient cross-modal interaction. Specifically, we devise a syntactic GCN to leverage the syntactic structure of queries for fine-grained representation learning, propose a multi-head self-attention to capture long-range semantic dependencies from video context, and employ a multi-stage cross-modal interaction to explore the potential relations between video and query contents. Extensive experiments demonstrate the effectiveness of our proposed method. Our core code has been released at https://github.com/ikuinen/CMIN. CCS Concepts: Information systems → Novelty in information retrieval. Keywords: query-based moment retrieval; syntactic GCN; multi-head self-attention; multi-stage cross-modal interaction.
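The multi-head self-attention named in the abstract above can be illustrated with a minimal sketch. This is not the CMIN implementation: it assumes plain scaled dot-product attention with no learned projection matrices, forms heads by simply slicing each frame feature vector into equal chunks, and uses hypothetical function names (`self_attention`, `multi_head_self_attention`) chosen only for this example.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(frames):
    """Scaled dot-product self-attention: queries, keys, and values are all the input frames."""
    dim = len(frames[0])
    scale = math.sqrt(dim)
    output = []
    for query in frames:
        # Similarity of this frame to every frame in the sequence, near or far.
        scores = [sum(q * k for q, k in zip(query, key)) / scale for key in frames]
        weights = softmax(scores)
        # Each output frame is a weighted average of all frames.
        attended = [
            sum(w * value[d] for w, value in zip(weights, frames))
            for d in range(dim)
        ]
        output.append(attended)
    return output


def multi_head_self_attention(frames, num_heads):
    """Slice features into num_heads chunks, attend within each chunk, then concatenate.

    Assumption for illustration: no learned projections, unlike a trained model.
    """
    dim = len(frames[0])
    if dim % num_heads != 0:
        raise ValueError("feature size must be divisible by num_heads")
    head_dim = dim // num_heads
    head_outputs = []
    for h in range(num_heads):
        lo, hi = h * head_dim, (h + 1) * head_dim
        head_outputs.append(self_attention([f[lo:hi] for f in frames]))
    # Concatenate the per-head outputs back into one vector per frame.
    return [
        [x for head in head_outputs for x in head[t]]
        for t in range(len(frames))
    ]
```

Because every frame attends to every other frame directly, the distance between two frames in the sequence does not limit how strongly they can influence each other, which is the sense in which such a layer can capture long-range dependencies.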
In the quest for the identification of catalytic transformations to be used in chemical biology and medicinal chemistry, we identified iron(III) meso-tetraarylporphines as efficient catalysts for the reduction of aromatic azides to their amines. The reaction uses thiols as reducing agents and tolerates water, air, and other biological components. A caged fluorophore was employed to demonstrate that the reduction can be performed even in living mammalian cells. However, in vivo experiments in nematodes (Caenorhabditis elegans) and zebrafish (Danio rerio) revealed a limitation to this method: the metabolic reduction of aromatic azides.
Video moment retrieval aims to find the moment that is most relevant to a given natural language query. Existing methods are mostly trained in a fully supervised setting, which requires full temporal boundary annotations for each query. However, manually labeling these annotations is time-consuming and expensive. In this paper, we propose a novel weakly supervised moment retrieval framework that requires only coarse video-level annotations for training. Specifically, we devise a proposal generation module that aggregates context information to generate and score all candidate proposals in a single pass. We then devise an algorithm that considers both exploitation and exploration to select the top-K proposals. Next, we build a semantic completion module to measure the semantic similarity between the selected proposals and the query, compute a reward, and provide feedback to the proposal generation module for scoring refinement. Experiments on the ActivityCaptions and Charades-STA datasets demonstrate the effectiveness of our proposed method.
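The abstract above does not spell out how exploitation and exploration are balanced when selecting the top-K proposals. One common way to combine the two is an epsilon-greedy rule, sketched below purely as an assumption: the function name `select_top_k`, the `epsilon` parameter, and the selection rule itself are illustrative and are not taken from the paper.

```python
import random


def select_top_k(scores, k, epsilon=0.1, rng=None):
    """Pick up to k distinct proposal indices with an epsilon-greedy rule.

    With probability epsilon a random remaining proposal is taken (exploration);
    otherwise the highest-scoring remaining proposal is taken (exploitation).
    This rule is an assumed stand-in, not the algorithm from the paper.
    """
    rng = rng or random.Random()
    remaining = list(range(len(scores)))
    chosen = []
    while remaining and len(chosen) < k:
        if rng.random() < epsilon:
            pick = rng.choice(remaining)
        else:
            pick = max(remaining, key=lambda i: scores[i])
        chosen.append(pick)
        remaining.remove(pick)
    return chosen
```

With `epsilon=0.0` the function reduces to an ordinary top-K by score, and raising `epsilon` lets lower-scored proposals occasionally reach the downstream module so that their scores can also receive feedback.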
A practical strategy for the generation of virtually enantiomerically pure bis-cyclometalated iridium(III) complexes is reported. Accordingly, the reactions of [Ir(μ-Cl)(ppy)2]2 (ppy = cyclometalating 2-phenylpyridine) with (S)-4-tert-butyl-2-(2′-hydroxyphenyl)-2-oxazoline [(S)-1a], [Ir(μ-Cl)(pq)2]2 (pq = cyclometalating 2-phenylquinoline) with (S)-2-(2′-hydroxyphenyl)-4-isopropyl-2-oxazoline [(S)-1b], and [Ir(μ-Cl)(pbt)2]2 (pbt = cyclometalating 2-phenylbenzothiazole) with (S)-2-(2′-hydroxyphenyl)-4-isopropyl-2-thiazoline [(S)-1d] afforded diastereomeric mixtures of salicyloxazolinato or sal-
Two adjacent groups of midbrain dopaminergic neurons, A9 (substantia nigra pars compacta) and A10 (ventral tegmental area), have distinct projections and exhibit differential vulnerability in Parkinson's disease. Little is known about transcription factors that influence midbrain dopaminergic subgroup phenotypes or their potential role in disease. Here, we demonstrate elevated expression of the transcription factor orthodenticle homeobox 2 in A10 dopaminergic neurons of embryonic and adult mouse, primate and human midbrain. Overexpression of orthodenticle homeobox 2 using lentivirus increased levels of known A10 elevated genes, including neuropilin 1, neuropilin 2, slit2 and adenylyl cyclase-activating peptide in both MN9D cells and ventral mesencephalic cultures, whereas knockdown of endogenous orthodenticle homeobox 2 levels via short hairpin RNA reduced expression of these genes in ventral mesencephalic cultures. Lack of orthodenticle homeobox 2 in the ventral mesencephalon of orthodenticle homeobox 2 conditional knockout mice caused a reduction of midbrain dopaminergic neurons and selective loss of A10 dopaminergic projections. Orthodenticle homeobox 2 overexpression protected dopaminergic neurons in ventral mesencephalic cultures from the Parkinson's disease-relevant toxin 1-methyl-4-phenylpyridinium, whereas downregulation of orthodenticle homeobox 2 using short hairpin RNA increased their susceptibility. These results show that orthodenticle homeobox 2 is important for establishing subgroup phenotypes of post-mitotic midbrain dopaminergic neurons and may alter neuronal vulnerability.
With the rapid development of deep learning technology and improvements in computing capability, deep learning has been widely used in the field of hyperspectral image (HSI) classification. In general, deep learning models contain many trainable parameters and require a massive number of labeled samples to achieve optimal performance. However, in HSI classification, a large number of labeled samples is generally difficult to acquire because manual labeling is difficult and time-consuming. Therefore, many research works focus on building deep learning models for HSI classification with few labeled samples. In this article, we concentrate on this topic and provide a systematic review of the relevant literature. Specifically, the contributions of this paper are twofold. First, the research progress of related methods is categorized according to the learning paradigm, including transfer learning, active learning and few-shot learning. Second, a number of experiments with various state-of-the-art approaches have been carried out, and the results are summarized to reveal potential research directions. More importantly, although there is a vast gap between deep learning models (which usually need sufficient labeled samples) and the HSI scenario with few labeled samples, the issues of small-sample sets can be well characterized by the fusion of deep learning methods and related techniques, such as transfer learning and a lightweight model. For reproducibility, the source code of the methods assessed in the paper can be found at https://github.com/ShuGuoJ/HSI-Classification.git.