Learning to localize and name object instances is a fundamental problem in vision, but state-of-the-art approaches rely on expensive bounding box supervision. While weakly supervised object detection (WSOD) methods relax the need for boxes to image-level annotations, even cheaper supervision is naturally available in the form of unstructured textual descriptions that users may freely provide when uploading image content. However, straightforward approaches to using such data for WSOD wastefully discard captions that do not exactly match object names. Instead, we show how to squeeze the most information out of these captions by training a text-only classifier that generalizes beyond dataset boundaries. Our discovery provides an opportunity for learning detection models from noisy but more abundant and freely-available caption data. We also validate our model on three classic object detection benchmarks and achieve state-of-the-art WSOD performance. Our code is available at https://github.com/yekeren/Cap2Det.
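The naive pseudo-labeling baseline the abstract criticizes can be sketched as follows: a caption yields an image-level label only when a class name appears verbatim among its words, so captions that paraphrase ("puppies" instead of "dog") contribute no supervision at all. The function name and matching rule here are illustrative, not the paper's implementation.

```python
def caption_to_labels(caption, class_names):
    """Map a free-form caption to image-level labels by exact word match.

    This is the wasteful exact-match baseline: any caption whose words
    never coincide with a class name is silently discarded.
    """
    tokens = set(caption.lower().split())
    return sorted(c for c in class_names if c.lower() in tokens)
```

For example, "Two puppies sleeping" produces no labels under exact matching even though a dog is clearly present, which is the gap a learned text classifier is meant to close.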
We consider the problem of sampling piecewise sinusoidal signals. Classical sampling theory does not enable perfect reconstruction of such signals since they are not bandlimited. However, they can be characterized by a finite number of parameters, namely the frequency, amplitude, and phase of the sinusoids and the locations of the discontinuities. In this paper, we show that under certain hypotheses on the sampling kernel, it is possible to perfectly recover the parameters that define the piecewise sinusoidal signal from its sampled version. In particular, we show that, at least theoretically, it is possible to recover piecewise sine waves with arbitrarily high frequencies and arbitrarily close switching points. Extensions of the method are also presented, such as the recovery of combinations of piecewise sine waves and polynomials. Finally, we study the effect of noise and present a robust reconstruction algorithm that is stable down to SNR levels of 7 dB.
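The core idea that a sinusoid is determined by finitely many parameters can be illustrated in the simplest noiseless case: three consecutive samples of a single sinusoid satisfy the annihilating identity x[0] + x[2] = 2·cos(ω)·x[1], from which frequency, amplitude, and phase all follow in closed form. This toy sketch assumes one sinusoid, no discontinuities, and no noise; the paper's method handles the much harder piecewise, noisy setting.

```python
import math

def recover_sinusoid(x0, x1, x2):
    """Recover (A, omega, phi) of x[n] = A*cos(omega*n + phi), omega in (0, pi),
    from three consecutive noiseless samples (assumes x1 != 0)."""
    # Annihilating identity: x0 + x2 = 2*cos(omega)*x1
    omega = math.acos((x0 + x2) / (2.0 * x1))
    a_cos = x0                                             # A*cos(phi)
    a_sin = (x0 * math.cos(omega) - x1) / math.sin(omega)  # A*sin(phi)
    return math.hypot(a_cos, a_sin), omega, math.atan2(a_sin, a_cos)
```

Given samples of 2·cos(0.7n + 0.3), the function returns (2.0, 0.7, 0.3) up to floating-point error.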
Microscopy imaging often suffers from limited depth-of-focus. However, the specimen can be 'optically sectioned' by moving the object along the optical axis; different areas appear in focus in different images. Extended depth-of-focus is a fusion algorithm that combines those images into one single sharp composite. One promising method is based on the wavelet transform. In this paper, we show how the wavelet-based image fusion technique can be improved and easily extended to multi-channel data. First, we propose the use of complex-valued wavelet bases, which seem to outperform traditional real-valued wavelet transforms. Second, we introduce a way to apply this technique for multi-channel images that suppresses artifacts and does not introduce false colors, an important requirement for multi-channel fluorescence microscopy imaging. We evaluate our method on simulated image stacks and give results relevant to biological imaging.
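The coefficient-selection idea behind wavelet fusion can be sketched on 1-D signals: transform each image of the stack, keep per position the coefficient with the larger magnitude (a proxy for local sharpness), and invert. This minimal sketch uses a real Haar transform rather than the complex-valued bases the paper advocates, and the function names are illustrative.

```python
import math

_R2 = math.sqrt(2.0)

def haar(x):
    """One-level orthonormal Haar transform of an even-length signal;
    returns (approximation, detail) coefficient lists."""
    s = [(a + b) / _R2 for a, b in zip(x[0::2], x[1::2])]
    d = [(a - b) / _R2 for a, b in zip(x[0::2], x[1::2])]
    return s, d

def ihaar(s, d):
    """Inverse of haar()."""
    out = []
    for a, b in zip(s, d):
        out += [(a + b) / _R2, (a - b) / _R2]
    return out

def fuse(x, y):
    """Fuse two registered 1-D signals: keep, per wavelet coefficient,
    the one with larger magnitude, then reconstruct."""
    (sx, dx), (sy, dy) = haar(x), haar(y)
    pick = lambda a, b: a if abs(a) >= abs(b) else b
    s = [pick(a, b) for a, b in zip(sx, sy)]
    d = [pick(a, b) for a, b in zip(dx, dy)]
    return ihaar(s, d)
```

In practice this selection is done on a multi-level 2-D transform per channel, and the paper's contribution is doing it with complex-valued bases and a multi-channel rule that avoids false colors.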