With the widespread availability of video cameras, we are facing an ever-growing enormous collection of unedited and unstructured video data. Due to lack of an automatic way to generate summaries from this large collection of consumer videos, they can be tedious and time consuming to index or search. In this work, we propose online video highlighting, a principled way of generating short video summarizing the most important and interesting contents of an unedited and unstructured video, costly both time-wise and financially for manual processing. Specifically, our method learns a dictionary from given video using group sparse coding, and updates atoms in the dictionary on-the-fly. A summary video is then generated by combining segments that cannot be sparsely reconstructed using the learned dictionary. The online fashion of our proposed method enables it to process arbitrarily long videos and start generating summaries before seeing the end of the video. Moreover, the processing time required by our proposed method is close to the original video length, achieving quasi real-time summarization speed. Theoretical analysis, together with experimental results on more than 12 hours of surveillance and YouTube videos are provided, demonstrating the effectiveness of online video highlighting.
Exploiting the temporal dependency among video frames or subshots is very important for the task of video summarization. Practically, RNN is good at temporal dependency modeling, and has achieved overwhelming performance in many video-based tasks, such as video captioning and classi cation. However, RNN is not capable enough to handle the video summarization task, since traditional RNNs, including LSTM, can only deal with short videos, while the videos in the summarization task are usually in longer duration. To address this problem, we propose a hierarchical recurrent neural network for video summarization, called H-RNN in this paper. Speci cally, it has two layers, where the rst layer is utilized to encode short video subshots cut from the original video, and the nal hidden state of each subshot is input to the second layer for calculating its con dence to be a key subshot. Compared to traditional RNNs, H-RNN is more suitable to video summarization, since it can exploit long temporal dependency among frames, meanwhile, the computation operations are signi cantly lessened. The results on two popular datasets, including the Combined dataset and VTW dataset, have demonstrated that the proposed H-RNN outperforms the state-of-the-arts.
Maximum margin clustering (MMC) has recently attracted considerable interests in both the data mining and machine learning communities. It first projects data samples to a kernel-induced feature space and then performs clustering by finding the maximum margin hyperplane over all possible cluster labelings. As in other kernel methods, choosing a suitable kernel function is imperative to the success of maximum margin clustering. In this paper, we propose a multiple kernel clustering (MKC) algorithm that simultaneously finds the maximum margin hyperplane, the best cluster labeling, and the optimal kernel. Moreover, we provide detailed analysis on the time complexity of the MKC algorithm and also extend multiple kernel clustering to the multi-class scenario. Experimental results on both toy and real-world data sets demonstrate the effectiveness and efficiency of the MKC algorithm.
This paper presents a cutting plane algorithm for multiclass maximum margin clustering (MMC). The proposed algorithm constructs a nested sequence of successively tighter relaxations of the original MMC problem, and each optimization problem in this sequence could be efficiently solved using the constrained concave-convex procedure (CCCP). Experimental evaluations on several real world datasets show that our algorithm converges much faster than existing MMC methods with guaranteed accuracy, and can thus handle much larger datasets efficiently.
Weather Recognition plays an important role in our daily lives and many computer vision applications. However, recognizing the weather conditions from a single image remains challenging and has not been studied thoroughly. Generally, most previous works treat weather recognition as a single-label classification task, namely, determining whether an image belongs to a specific weather class or not. This treatment is not always appropriate, since more than one weather conditions may appear simultaneously in a single image. To address this problem, we make the first attempt to view weather recognition as a multi-label classification task, i.e., assigning an image more than one labels according to the displayed weather conditions. Specifically, a CNN-RNN based multi-label classification approach is proposed in this paper. The convolutional neural network (CNN) is extended with a channel-wise attention model to extract the most correlated visual features. The Recurrent Neural Network (RNN) further processes the features and excavates the dependencies among weather classes.Finally, the weather labels are predicted step by step. Besides, we construct two datasets for the weather recognition task and explore the relationships among different weather conditions. Experimental results demonstrate the superiority and effectiveness of the proposed approach. The new constructed datasets will be available at https: //github.com/wzgwzg/Multi-Label-Weather-Recognition.
Maximum margin clustering (MMC) is a recently proposed clustering method, which extends the theory of support vector machine to the unsupervised scenario and aims at finding the maximum margin hyperplane which separates the data from different classes. Traditionally, MMC is formulated as a non-convex integer programming problem and is thus difficult to solve. Several methods have been proposed in the literature to solve the MMC problem based on either semidefinite programming or alternative optimization. However, these methods are time demanding while handling large scale datasets and therefore unsuitable for real world applications. In this paper, we propose the cutting plane maximum margin clustering (CPMMC) algorithm, to solve the MMC problem. Specifically, we construct a nested sequence of successively tighter relaxations of the original MMC problem, and each optimization problem in this sequence could be efficiently solved using the constrained concave-convex procedure (CCCP). Moreover, we prove theoretically that the CPMMC algorithm takes time O(sn) to converge with guaranteed accuracy, where n is the total number of samples in the dataset and s is the average number of non-zero features, i.e. the sparsity. Experimental evaluations on several real world datasets show that CPMMC performs better than existing MMC methods, both in efficiency and accuracy.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.
hi@scite.ai
10624 S. Eastern Ave., Ste. A-614
Henderson, NV 89052, USA
Copyright © 2024 scite LLC. All rights reserved.
Made with 💙 for researchers
Part of the Research Solutions Family.