This paper addresses the problem of image segmentation with a reference distribution. Recent studies have shown that segmentation with global consistency measures outperforms conventional techniques based on pixel-wise measures. However, such global approaches require a precise reference distribution to extract the correct region. To relax this strict assumption, we propose a new approach in which the given reference distribution plays a guiding role in inferring the latent distribution and its consistent region. The inference is based on the assumption that the latent distribution resembles the distribution of the consistent region but is distinct from the distribution of the complement region. We state the problem as the minimization of an energy function consisting of global similarities based on the Bhattacharyya distance, and we implement a novel iterated distribution-matching process that jointly optimizes the distribution and the segmentation. We evaluate the proposed algorithm on the GrabCut dataset and demonstrate the advantages of our approach on various segmentation problems, including interactive segmentation, background subtraction, and co-segmentation.
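As a rough, hypothetical illustration of the global similarity term (not the paper's exact formulation), the following Python sketch computes the Bhattacharyya distance between the normalized histogram of a candidate region and a reference distribution; the grayscale histogramming, bin count, and image value range are simplifying assumptions.

    import numpy as np

    def bhattacharyya_distance(p, q, eps=1e-12):
        """Bhattacharyya distance between two discrete distributions p and q."""
        bc = np.sum(np.sqrt(p * q))   # Bhattacharyya coefficient, in [0, 1]
        return -np.log(bc + eps)      # 0 when p == q, larger when dissimilar

    def region_histogram(image, mask, bins=32):
        """Normalized intensity histogram of the pixels selected by a binary mask."""
        hist, _ = np.histogram(image[mask], bins=bins, range=(0.0, 1.0))
        return hist / max(hist.sum(), 1)

    # A global energy term would then score a segmentation mask by
    # bhattacharyya_distance(region_histogram(image, mask), reference).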
The system described in this paper provides a real-time 3D visual experience by using an array of 64 video cameras and an integral photography display with 60 viewing directions. The live 3D scene in front of the camera array is reproduced by the full-color, full-parallax autostereoscopic display with interactive control of the viewing parameters. The main technical challenge is fast and flexible conversion of the data from the 64 multicamera images to the integral photography format. Based on image-based rendering techniques, our conversion method first renders 60 novel images corresponding to the viewing directions of the display and then arranges the rendered pixels to produce an integral photography image. For real-time processing on a single PC, all the conversion processes are implemented on a GPU with GPGPU techniques. The conversion method also allows a user to interactively control the viewing parameters of the displayed image so that the dynamic 3D scene is reproduced with the desired parameters. This control is performed as a software process, without reconfiguring the hardware system, by changing rendering parameters such as the convergence point and the interval between the viewpoints of the rendering cameras.
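To make the pixel-arrangement step concrete, here is a minimal numpy sketch (a toy under stated assumptions, not the system's GPU implementation) that interleaves rendered directional views into an integral photography image; it assumes the viewing directions lie on a square d x d grid, whereas the actual display has 60 directions.

    import numpy as np

    def to_integral_photography(views):
        """Interleave directional views into an integral photography image.

        views: array of shape (D, H, W, 3), one rendered image per viewing
        direction, with D = d * d directions assumed to form a d x d grid.
        Returns an image of shape (H * d, W * d, 3) in which each d x d
        elemental image collects one pixel from every view.
        """
        D, H, W, C = views.shape
        d = int(np.sqrt(D))
        out = np.zeros((H * d, W * d, C), dtype=views.dtype)
        for v in range(d * d):
            vy, vx = divmod(v, d)
            # Pixel (y, x) of view v lands at offset (vy, vx) inside the
            # elemental image located at (y, x).
            out[vy::d, vx::d] = views[v]
        return out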
We propose a method of using a focal stack, i.e., a set of differently focused images, as the input for a novel light field display called a "tensor display." Although this display consists of only a few light-attenuating layers placed in front of a backlight, it can be viewed from many directions (angles) simultaneously without sacrificing the resolution of any viewing direction. Conventionally, a transmittance pattern is calculated for each layer from a light field, namely, a dense set of multi-view images (typically dozens) that are to be observed from different directions. However, preparing such a large set of images is often cumbersome for real objects. We developed a method that does not require a complete light field as the input; instead, a focal stack composed of only a few differently focused images is directly transformed into the layer patterns. Our method greatly reduces the cost of acquiring data while maintaining the quality of the output light field. We validated the method with experiments using synthetic light field datasets and a focal stack acquired by an ordinary camera.
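As a toy forward model of how a focal stack constrains the layer patterns (a 1-D sketch under simplifying assumptions, not the paper's transform), the following Python code simulates one focal-stack slice emitted by two multiplicative layers; fitting the layer patterns would then amount to minimizing the difference between such simulated slices and the input focal stack.

    import numpy as np

    def simulate_focal_slice(front, rear, slopes, depth):
        """One refocused image from a two-layer multiplicative display (1-D toy).

        front, rear: 1-D transmittance patterns in [0, 1].
        slopes: integer ray slopes (pixel shift per unit layer separation).
        depth: focus depth relative to the rear layer, in separation units.
        """
        acc = np.zeros_like(rear, dtype=float)
        for s in slopes:
            # A ray hitting the rear layer at x crosses the front layer at
            # x + s; its intensity is the product of the two transmittances.
            ray = rear * np.roll(front, -s)
            # Refocusing shifts each angular sample by s * depth, then averages.
            acc += np.roll(ray, int(round(s * depth)))
        return acc / len(slopes)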
Thanks to the excellent learning capability of deep convolutional neural networks (CNNs), monocular depth estimation using CNNs has achieved great success in recent years. However, depth estimation from a monocular image alone is essentially an ill-posed problem, and thus this approach is likely to have inherent vulnerabilities. To reveal this limitation, we propose an adversarial patch attack on monocular depth estimation. More specifically, we generate artificial patterns (adversarial patches) that can fool the target methods into estimating an incorrect depth for the regions where the patterns are placed. Our method can be carried out in the real world by physically placing the printed patterns in real scenes. We also analyze the behavior of monocular depth estimation under attack by visualizing the activation levels of the intermediate layers and the regions potentially affected by the adversarial attack.
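The optimization behind such an attack can be sketched as a gradient loop over the patch pixels. The following PyTorch code is a hypothetical minimal version: it pastes the patch at a fixed location and pushes the predicted depth of that region toward a wrong (far) value, whereas a physical attack would additionally randomize placement, scale, and lighting; the (N, 1, H, W) output shape of depth_model is an assumption.

    import torch

    def optimize_patch(depth_model, images, patch_size=64, steps=500, lr=0.01):
        """Learn a patch that inflates the estimated depth of the patched region.

        depth_model: frozen, differentiable monocular depth network.
        images: tensor of shape (N, 3, H, W) with values in [0, 1].
        """
        patch = torch.rand(1, 3, patch_size, patch_size, requires_grad=True)
        opt = torch.optim.Adam([patch], lr=lr)
        for _ in range(steps):
            x = images.clone()
            # Paste the patch at the top-left corner (fixed placement for brevity).
            x[:, :, :patch_size, :patch_size] = patch.clamp(0, 1)
            depth = depth_model(x)
            # Maximize the depth predicted under the patch.
            loss = -depth[:, :, :patch_size, :patch_size].mean()
            opt.zero_grad()
            loss.backward()
            opt.step()
        return patch.detach().clamp(0, 1)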
A light-field display provides not only binocular depth sensation but also natural motion parallax with respect to head motion, which evokes a strong feeling of immersion. Such a display can be implemented with a set of stacked layers, each of which has pixels that can carry out light-ray operations (multiplication and addition). With this structure, the appearance of the display varies over the observed directions (i.e., a light field is produced) because the light rays pass through different combinations of pixels depending on both their originating points and their outgoing directions. To display a specific 3D scene, the layer patterns should be optimized so that the produced light field is as close as possible to that of the target 3D scene. To deepen the understanding of this type of light-field display, we focused on two important factors: the light-ray operations carried out by the layers and the optimization methods for the layer patterns. Specifically, we compared multiplicative and additive layers, each optimized either with analytical methods derived from mathematical optimization or with faster data-driven methods implemented as convolutional neural networks (CNNs). We compared combinations of these two factors in terms of the accuracy of light-field reproduction and computation time. Our results indicate that multiplicative layers achieve better accuracy than additive ones, and CNN-based methods run faster than analytical ones. We suggest that the best choice, in terms of the balance between accuracy and computation speed, is multiplicative layers optimized with a CNN-based method.
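The two light-ray operations compared here can be written down directly. The following numpy sketch (a 1-D toy with integer slopes and wrap-around at the borders, ignoring backlight scaling and clipping) evaluates one ray through a stack of layers as either a product of transmittances or a sum of emissions.

    import numpy as np

    def ray_through_layers(layers, x, slope, multiplicative=True):
        """Intensity of one ray crossing stacked 1-D layers.

        layers: 1-D pixel arrays ordered from rear to front; a ray leaving
        position x on the rear layer with the given slope samples layer k
        at x + k * slope (wrapped at the borders for simplicity).
        """
        samples = [layer[(x + k * slope) % layer.shape[0]]
                   for k, layer in enumerate(layers)]
        if multiplicative:
            return np.prod(samples)   # attenuating layers: product of transmittances
        return np.sum(samples)        # additive layers: sum of emitted intensities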