This paper attacks the challenging problem of zeroexample video retrieval. In such a retrieval paradigm, an end user searches for unlabeled videos by ad-hoc queries described in natural language text with no visual example provided. Given videos as sequences of frames and queries as sequences of words, an effective sequence-to-sequence cross-modal matching is required. The majority of existing methods are concept based, extracting relevant concepts from queries and videos and accordingly establishing associations between the two modalities. In contrast, this paper takes a concept-free approach, proposing a dual deep encoding network that encodes videos and queries into powerful dense representations of their own. Dual encoding is conceptually simple, practically effective and endto-end. As experiments on three benchmarks, i.e. MSR-VTT, TRECVID 2016 and 2017 Ad-hoc Video Search show, the proposed solution establishes a new state-of-the-art for zero-example video retrieval.
Radar, as one of the sensors for human activity recognition (HAR), has unique characteristics such as privacy protection and contactless sensing. Radar-based HAR has been applied in many fields such as human–computer interaction, smart surveillance and health assessment. Conventional machine learning approaches rely on heuristic hand-crafted feature extraction, and their generalization capability is limited. Additionally, extracting features manually is time–consuming and inefficient. Deep learning acts as a hierarchical approach to learn high-level features automatically and has achieved superior performance for HAR. This paper surveys deep learning based HAR in radar from three aspects: deep learning techniques, radar systems, and deep learning for radar-based HAR. Especially, we elaborate deep learning approaches designed for activity recognition in radar according to the dimension of radar returns (i.e., 1D, 2D and 3D echoes). Due to the difference of echo forms, corresponding deep learning approaches are different to fully exploit motion information. Experimental results have demonstrated the feasibility of applying deep learning for radar-based HAR in 1D, 2D and 3D echoes. Finally, we address some current research considerations and future opportunities.
Deep learning methods have recently achieved impressive performance in the area of visual recognition and speech recognition. In this paper, we propose a handwriting recognition method based on relaxation convolutional neural network (R-CNN) and alternately trained relaxation convolutional neural network (ATR-CNN). Previous methods regularize CNN at full-connected layer or spatial-pooling layer, however, we focus on convolutional layer. The relaxation convolution layer adopted in our R-CNN, unlike traditional convolutional layer, does not require neurons within a feature map to share the same convolutional kernel, endowing the neural network with more expressive power. As relaxation convolution sharply increase the total number of parameters, we adopt alternate training in ATR-CNN to regularize the neural network during training procedure. Our previous C-NN took the 1st place in ICDAR'13 Chinese Handwriting Character Recognition Competition, while our latest ATR-CNN outperforms our previous one and achieves the state-of-the-art accuracy with an error rate of 3.94%, further narrowing the gap between machine and human observers (3.87%).
Advances in wearable epidermal sensors have revolutionized the way that physiological signals are captured and measured for health monitoring. One major challenge is to convert physiological signals to easily readable signals in a convenient way. One possibility for wearable epidermal sensors is based on visible readouts. There are a range of materials whose optical properties can be tuned by parameters such as temperature, pH, light, and electric fields. Herein, this review covers and highlights a set of materials with tunable optical properties and their integration into wearable epidermal sensors for health monitoring. Specifically, the recent progress, fabrication, and applications of these materials for wearable epidermal sensors are summarized and discussed. Finally, the challenges and perspectives for the next generation wearable devices are proposed.
Few-shot image classification aims to classify unseen classes with limited labelled samples. Recent works benefit from the metalearning process with episodic tasks and can fast adapt to class from training to testing. Due to the limited number of samples for each task, the initial embedding network for meta-learning becomes an essential component and can largely affect the performance in practice. To this end, most of the existing methods highly rely on the efficient embedding network. Due to the limited labelled data, the scale of embedding network is constrained under a supervised learning(SL) manner which becomes a bottleneck of the few-shot learning methods. In this paper, we proposed to train a more generalized embedding network with self-supervised learning (SSL) which can provide robust representation for downstream tasks by learning from the data itself. We evaluate our work by extensive comparisons with previous baseline methods on two few-shot classification datasets (i.e., MiniImageNet and CUB) and achieve better performance over baselines. Tests on four datasets in cross-domain few-shot learning classification show that the proposed method achieves state-of-theart results and further prove the robustness of the proposed model.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.