We aim to tackle a novel vision task called Weakly Supervised Visual Relation Detection (WSVRD), which detects "subject-predicate-object" relations in an image with object relation ground truths available only at the image level. This task is motivated by the fact that labeling the combinatorial relations between objects at the instance level is extremely expensive. Compared to the extensively studied Weakly Supervised Object Detection (WSOD) problem, WSVRD is more challenging because it must examine a large set of region pairs, which is computationally prohibitive and more likely to get stuck in poor local optima, such as solutions involving the wrong spatial context. To this end, we present a Parallel, Pairwise Region-based, Fully Convolutional Network (PPR-FCN) for WSVRD. It uses a parallel FCN architecture that simultaneously performs pair selection and classification of single regions and region pairs for object and relation detection, while sharing almost all computation over the entire image. In particular, we propose a novel position-role-sensitive score map with pairwise RoI pooling to efficiently capture the crucial context associated with a pair of objects. Extensive experiments on two visual relation benchmarks demonstrate the superiority of PPR-FCN over all baselines on the WSVRD challenge.

arXiv:1708.01956v1 [cs.CV] 7 Aug 2017
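The pairwise RoI pooling idea above can be illustrated with a minimal NumPy sketch. This is a simplified stand-in, not the paper's exact formulation: the function names (`psroi_pool`, `pairwise_score`), the 3×3 grid, and the mean-of-votes scoring are illustrative assumptions; the actual network learns separate subject-role and object-role banks of position-sensitive score maps per predicate class.

```python
import numpy as np

def psroi_pool(score_maps, roi, k=3):
    """Position-sensitive RoI pooling over a k*k grid of spatial bins.

    score_maps: (k*k, H, W) array, one map per grid cell (illustrative).
    roi: (x1, y1, x2, y2) box in feature-map coordinates.
    """
    x1, y1, x2, y2 = roi
    xs = np.linspace(x1, x2, k + 1).astype(int)
    ys = np.linspace(y1, y2, k + 1).astype(int)
    pooled = np.empty((k, k))
    for i in range(k):
        for j in range(k):
            # Bin (i, j) pools only from its dedicated score map, so
            # each map specializes to one relative position of the RoI.
            cell = score_maps[i * k + j,
                              ys[i]:max(ys[i + 1], ys[i] + 1),
                              xs[j]:max(xs[j + 1], xs[j] + 1)]
            pooled[i, j] = cell.mean()
    return pooled

def pairwise_score(subj_maps, obj_maps, subj_roi, obj_roi, k=3):
    """Score a (subject, object) region pair by combining role-specific
    position-sensitive votes from the two role-dedicated map banks."""
    s = psroi_pool(subj_maps, subj_roi, k).mean()
    o = psroi_pool(obj_maps, obj_roi, k).mean()
    return s + o
```

Because pooling just averages slices of shared convolutional score maps, scoring many region pairs reuses the same per-image feature computation, which is the efficiency argument made in the abstract.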
Evaluations of Phrase & Relation Detection

Comparing Methods. We evaluate the overall performance of PPR-FCN for WSVRD. We compared the following methods: 1) GroundR [35], a weakly supervised visual
Fourier ptychographic microscopy (FPM) is a promising imaging technique that achieves a wide field of view (FOV), high resolution, and quantitative phase information. An LED array irradiates the sample from different angles to obtain the corresponding low-resolution intensity images. However, reconstruction performance still suffers from noise and image-data redundancy, which must be taken into account. In this paper, we present a novel FPM reconstruction method based on a deep multi-feature transfer network, which achieves good noise robustness and high-resolution reconstruction from reduced image data. First, image features are extracted via transfer learning with ResNet50, Xception, and DenseNet121 networks; the complementarity of these deep features is then exploited by a cascaded feature-fusion strategy that merges them along the channel dimension to improve reconstruction quality. Next, pre-upsampling is applied in the reconstruction network to improve the texture details of the high-resolution output. We validate the performance of the reported method via both simulation and experiment. The model is robust to noisy and blurred images, and better reconstructions are obtained in less time and from lower-resolution inputs. We hope that this end-to-end neural-network mapping offers a neural-network perspective on FPM reconstruction.
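The channel-merging fusion step can be sketched in a few lines of NumPy. This is a minimal stand-in under stated assumptions: `cascaded_fuse` and `pre_upsample` are hypothetical names, the inputs are plain arrays rather than activations from pretrained ResNet50/Xception/DenseNet121 backbones, and the real network would follow fusion with learned reconstruction layers.

```python
import numpy as np

def cascaded_fuse(features):
    """Cascaded channel merging of multi-backbone features.

    features: list of (C_i, H, W) maps (e.g. from three different
    backbones), assumed already resized to a common spatial size.
    Each stage concatenates the next feature block along the channel
    axis, so complementary features are accumulated rather than mixed.
    """
    fused = features[0]
    for f in features[1:]:
        fused = np.concatenate([fused, f], axis=0)
    return fused

def pre_upsample(x, scale=2):
    """Nearest-neighbour pre-upsampling placeholder: enlarge the input
    before the reconstruction branch refines texture details."""
    return np.repeat(np.repeat(x, scale, axis=-2), scale, axis=-1)
```

The design choice illustrated here is that concatenation preserves each backbone's channels intact, leaving it to subsequent learned layers to weight the complementary features, whereas element-wise addition would force the features into a shared channel space up front.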