This paper provides a comprehensive survey of technical achievements in the research area of image retrieval, especially content-based image retrieval, an area that has been highly active in recent years. The survey covers more than 100 papers spanning image feature representation and extraction, multidimensional indexing, and system design, three of the fundamental bases of content-based image retrieval. Furthermore, based on the current state of the art and the demands of real-world applications, open research issues are identified and promising future research directions are suggested.
Water is crucial to plant growth and development. Environmental water deficiency triggers an osmotic stress signalling cascade, which induces short-term cellular responses to reduce water loss and long-term responses to remodel the transcriptional network and physiological and developmental processes. Several signalling components identified by extensive genetic screens for altered sensitivities to osmotic stress seem to function downstream of the perception of osmotic stress. It is known that hyperosmolality and various other stimuli trigger increases in cytosolic free calcium concentration ([Ca(2+)]i). Because osmosensing Ca(2+) channels serve as osmosensors in bacteria and animals, hyperosmolality-induced [Ca(2+)]i increases have been widely speculated to be involved in osmosensing in plants. However, the molecular nature of the corresponding Ca(2+) channels remains unclear. Here we describe a hyperosmolality-gated calcium-permeable channel and its function in osmosensing in plants. Using calcium-imaging-based unbiased forward genetic screens, we isolated Arabidopsis mutants that exhibit low hyperosmolality-induced [Ca(2+)]i increases. These mutants were rescreened for their cellular, physiological and developmental responses to osmotic stress, and those with clear combined phenotypes were selected for further physical mapping. One of the mutants, reduced hyperosmolality-induced [Ca(2+)]i increase 1 (osca1), displays impaired osmotic Ca(2+) signalling in guard cells and root cells, and attenuated regulation of water transpiration and root growth in response to osmotic stress. OSCA1 is a previously unknown plasma membrane protein that forms hyperosmolality-gated calcium-permeable channels, suggesting that OSCA1 may be an osmosensor.
OSCA1 represents a channel responsible for [Ca(2+)]i increases induced by a stimulus in plants, opening up new avenues for studying the Ca(2+) machinery for other stimuli and providing potential molecular genetic targets for engineering drought-resistant crops.
Automatically describing video content with natural language is a fundamental challenge of multimedia. Recurrent Neural Networks (RNNs), which model sequence dynamics, have attracted increasing attention for visual interpretation. However, most existing approaches generate each word locally from the previous words and the visual content, so the relationship between the semantics of the entire sentence and the visual content is not holistically exploited. As a result, the generated sentences may be contextually correct, yet their semantics (e.g., subjects, verbs or objects) may be wrong. This paper presents a novel unified framework, named Long Short-Term Memory with visual-semantic Embedding (LSTM-E), which simultaneously explores LSTM learning and visual-semantic embedding. The former aims to locally maximize the probability of generating the next word given the previous words and the visual content, while the latter creates a visual-semantic embedding space that enforces the relationship between the semantics of the entire sentence and the visual content. LSTM-E consists of three components: a 2-D and/or 3-D deep convolutional neural network for learning a powerful video representation, a deep RNN for generating sentences, and a joint embedding model for relating visual content to sentence semantics. Experiments on the YouTube2Text dataset show that LSTM-E achieves the best reported performance to date in generating natural sentences: 45.3% BLEU@4 and 31.0% METEOR. We also demonstrate that LSTM-E outperforms several state-of-the-art techniques in predicting Subject-Verb-Object (SVO) triplets.
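The two objectives described in this abstract, a local next-word (coherence) term and a holistic sentence-video relevance term, can be combined into one training loss. Below is a minimal illustrative sketch in plain Python, not the authors' implementation: all function names, the squared-distance relevance term, and the trade-off weight `lam` are assumptions for illustration.

```python
import math

def softmax(zs):
    # Numerically stable softmax over a list of logits.
    m = max(zs)
    es = [math.exp(z - m) for z in zs]
    s = sum(es)
    return [e / s for e in es]

def generation_loss(word_logits, target_ids):
    # Negative log-likelihood of the target word at each decoding step
    # (the local "coherence" term the abstract attributes to the LSTM).
    return -sum(math.log(softmax(logits)[t])
                for logits, t in zip(word_logits, target_ids))

def embedding_loss(video_emb, sentence_emb):
    # Squared Euclidean distance between video and sentence vectors in a
    # shared embedding space (a stand-in for the "relevance" term).
    return sum((v - s) ** 2 for v, s in zip(video_emb, sentence_emb))

def lstm_e_loss(word_logits, target_ids, video_emb, sentence_emb, lam=0.5):
    # Joint objective: (1 - lam) * coherence + lam * relevance,
    # so word generation and sentence-video embedding are trained together.
    return ((1 - lam) * generation_loss(word_logits, target_ids)
            + lam * embedding_loss(video_emb, sentence_emb))

# Toy example: 2 decoding steps over a 3-word vocabulary.
logits = [[2.0, 0.1, -1.0], [0.0, 1.5, 0.3]]
targets = [0, 1]
loss = lstm_e_loss(logits, targets, [0.2, 0.4], [0.1, 0.5], lam=0.5)
```

The sketch only shows how the two terms would be weighted into one scalar; in the paper both terms are optimized jointly over CNN, LSTM and embedding parameters.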
Automatically annotating concepts in video is key to semantic-level video browsing, search and navigation. Research on this topic has evolved through two paradigms. The first used binary classification to detect each concept in a concept set individually. It achieved only limited success because it did not model the inherent correlations between concepts, e.g., urban and building. The second paradigm added a fusion step on top of the individual concept detectors to combine multiple concepts. Its performance varies, however, because errors incurred in the detection step propagate to the fusion step and degrade overall performance. To address these issues, we propose a third paradigm that simultaneously classifies concepts and models the correlations between them in a single step, using a novel Correlative Multi-Label (CML) framework. We compare the proposed approach with state-of-the-art approaches from the first and second paradigms on the widely used TRECVID data set and report superior performance.
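The single-step idea can be illustrated with a toy joint scorer: each candidate label vector is scored by per-concept evidence plus pairwise correlation terms, and the best vector is chosen jointly rather than concept-by-concept. This is a minimal sketch under assumed toy weights, not the CML model itself; the scores and the correlation weight are made up for illustration.

```python
from itertools import product

def joint_score(labels, unary, pairwise):
    # Score a binary label vector: per-concept terms plus pairwise
    # correlation terms, so correlated concepts reinforce each other.
    score = sum(u if y == 1 else -u for y, u in zip(labels, unary))
    for (i, j), w in pairwise.items():
        if labels[i] == 1 and labels[j] == 1:
            score += w
    return score

def predict(unary, pairwise):
    # Exhaustive argmax over all 2^K label vectors: concepts are
    # classified and correlated in a single step, with no separate
    # fusion stage for detection errors to propagate into.
    k = len(unary)
    return max(product([0, 1], repeat=k),
               key=lambda y: joint_score(y, unary, pairwise))

# Toy example: concepts [urban, building, beach]. Weak evidence for
# "building" is rescued by its assumed positive correlation with "urban".
unary = [1.0, -0.2, -1.5]      # assumed per-concept detector scores
pairwise = {(0, 1): 1.0}       # assumed urban-building correlation weight
best = predict(unary, pairwise)  # -> (1, 1, 0)
```

With the correlation term removed, the same scorer drops "building", which mirrors the abstract's point that ignoring inter-concept correlation loses information.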