Visual signals in a video can be divided into content and motion. While content specifies which objects are in the video, motion describes their dynamics. Based on this prior, we propose the Motion and Content decomposed Generative Adversarial Network (MoCoGAN) framework for video generation. The proposed framework generates a video by mapping a sequence of random vectors to a sequence of video frames. Each random vector consists of a content part and a motion part. While the content part is kept fixed, the motion part is realized as a stochastic process. To learn motion and content decomposition in an unsupervised manner, we introduce a novel adversarial learning scheme utilizing both image and video discriminators. Extensive experimental results on several challenging datasets with qualitative and quantitative comparison to the state-of-theart approaches, verify effectiveness of the proposed framework. In addition, we show that MoCoGAN allows one to generate videos with same content but different motion as well as videos with different content and same motion.
Existing work on object detection often relies on a single form of annotation: the model is trained using either accurate yet costly bounding boxes or cheaper but less expressive image-level tags. However, real-world annotations are often diverse in form, which challenges these existing works. In this paper, we present UFO 2 , a unified object detection framework that can handle different forms of supervision simultaneously. Specifically, UFO 2 incorporates strong supervision (e.g., boxes), various forms of partial supervision (e.g., class tags, points, and scribbles), and unlabeled data. Through rigorous evaluations, we demonstrate that each form of label can be utilized to either train a model from scratch or to further improve a pre-trained model. We also use UFO 2 to investigate budget-aware omni-supervised learning, i.e., various annotation policies are studied under a fixed annotation budget: we show that competitive performance needs no strong labels for all data. Finally, we demonstrate the generalization of UFO 2 , detecting more than 1,000 different objects without bounding box annotations.
Raw sonar images may not be used for underwater detection or recognition directly because disturbances such as the grating-lobe and multi-path disturbance affect the gray-level distribution of sonar images and cause phantom echoes. To search for a more robust segmentation method with a reasonable computational cost, a prior-knowledge-based threshold segmentation method of underwater linear object detection is discussed. The possibility of guiding the segmentation threshold evolution of forward-looking sonar images using prior knowledge is verified by experiment. During the threshold evolution, the collinear relation of two lines that correspond to double peaks in the voting space of the edged image is used as the criterion of termination. The interaction is reflected in the sense that the Hough transform contributes to the basis of the collinear relation of lines, while the binary image generated from the current threshold provides the resource of the Hough transform. The experimental results show that the proposed method could maintain a good tradeoff between the segmentation quality and the computational time in comparison with conventional segmentation methods. The proposed method redounds to a further process for unsupervised underwater visual understanding.
Among various image fusion methods, principal component analysis (PCA) technique is capable of quickly merging the massive volumes of data. For IKONOS imagery, PCA can yield satisfactory "spatial" enhancement but may introduce spectral distortion, appearing as a change in colors between compositions of resembled and fused multi-spectral bands. To solve this problem, a fast improved PCA fusion method based on Intensity-Hue-Saturation Fusion Technique with Spectral Adjustment is presented. The experimental results demonstrate that the proposed approach can provide better performance than the original PCA method both in processing speed and image quality.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.