Joseph Marino scite author profile

Joseph Marino

4Publications

146Citation Statements Received

67Citation Statements Given

How they've been cited

103

146

How they cite others

Affiliations

California Institute of Technology, Stony Brook University, Computer Science Department

Publications

Order By: Most citations

Probabilistic Video Generation Using Holistic Attribute Control

Lehrmann

Marino

et al. 2018

View full text Add to dashboard Cite

Videos express highly structured spatio-temporal patterns of visual data. A video can be thought of as being governed by two factors: (i) temporally invariant (e.g., person identity), or slowly varying (e.g., activity), attributeinduced appearance, encoding the persistent content of each frame, and (ii) an inter-frame motion or scene dynamics (e.g., encoding evolution of the person executing the action). Based on this intuition, we propose a generative framework for video generation and future prediction. The proposed framework generates a video (short clip) by decoding samples sequentially drawn from a latent space distribution into full video frames. Variational Autoencoders (VAEs) are used as a means of encoding/decoding frames into/from the latent space and RNN as a way to model the dynamics in the latent space. We improve the video generation consistency through temporally-conditional sampling and quality by structuring the latent space with attribute controls; ensuring that attributes can be both inferred and conditioned on during learning/generation. As a result, given attributes and/or the first frame, our model is able to generate diverse but highly consistent sets of video sequences, accounting for the inherent uncertainty in the prediction task. Experimental results on Chair CAD [1], Weizmann Human Action [2], and MIT Flickr [3] datasets, along with detailed comparison to the state-of-the-art, verify effectiveness of the framework.

show abstract

Planar Visualization of Treelike Structures

Marino

Kaufman

2016

IEEE Trans. Visual. Comput. Graphics

View full text Add to dashboard Cite

We present a novel method to create planar visualizations of treelike structures (e.g., blood vessels and airway trees) where the shape of the object is well preserved, allowing for easy recognition by users familiar with the structures. Based on the extracted skeleton within the treelike object, a radial planar embedding is first obtained such that there are no self-intersections of the skeleton which would have resulted in occlusions in the final view. An optimization procedure which adjusts the angular positions of the skeleton nodes is then used to reconstruct the shape as closely as possible to the original, according to a specified view plane, which thus preserves the global geometric context of the object. Using this shape recovered embedded skeleton, the object surface is then flattened to the plane without occlusions using harmonic mapping. The boundary of the mesh is adjusted during the flattening step to account for regions where the mesh is stretched over concavities. This parameterized surface can then be used either as a map for guidance during endoluminal navigation or directly for interrogation and decision making. Depth cues are provided with a grayscale border to aid in shape understanding. Examples are presented using bronchial trees, cranial and lower limb blood vessels, and upper aorta datasets, and the results are evaluated quantitatively and with a user study.

show abstract

Improving sequential latent variable models with autoregressive flows

Marino

Chen²,

He³

et al. 2021

Mach Learn

View full text Add to dashboard Cite

We propose an approach for improving sequence modeling based on autoregressive normalizing flows. Each autoregressive transform, acting across time, serves as a moving frame of reference, removing temporal correlations and simplifying the modeling of higher-level dynamics. This technique provides a simple, generalpurpose method for improving sequence modeling, with connections to existing and classical techniques. We demonstrate the proposed approach both with standalone flow-based models and as a component within sequential latent variable models. Results are presented on three benchmark video datasets, where autoregressive flow-based dynamics improve log-likelihood performance over baseline models. Finally, we illustrate the decorrelation and improved generalization properties of using flow-based dynamics.

show abstract

Crowdsourcing for identification of polyp-free segments in virtual colonoscopy videos

Park

Mirhosseini

Nadeem

et al. 2017

View full text Add to dashboard Cite

Virtual colonoscopy (VC) allows a physician to virtually navigate within a reconstructed 3D colon model searching for colorectal polyps. Though VC is widely recognized as a highly sensitive and specific test for identifying polyps, one limitation is the reading time, which can take over 30 minutes per patient. Large amounts of the colon are often devoid of polyps, and a way of identifying these polyp-free segments could be of valuable use in reducing the required reading time for the interrogating radiologist. To this end, we have tested the ability of the collective crowd intelligence of non-expert workers to identify polyp candidates and polyp-free regions. We presented twenty short videos flying through a segment of a virtual colon to each worker, and the crowd was asked to determine whether or not a possible polyp was observed within that video segment. We evaluated our framework on Amazon Mechanical Turk and found that the crowd was able to achieve a sensitivity of 80.0% and specificity of 86.5% in identifying video segments which contained a clinically proven polyp. Since each polyp appeared in multiple consecutive segments, all polyps were in fact identified. Using the crowd results as a first pass, 80% of the video segments could in theory be skipped by the radiologist, equating to a significant time savings and enabling more VC examinations to be performed.

show abstract

scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.

Contact Info

hi@scite.ai

10624 S. Eastern Ave., Ste. A-614

Henderson, NV 89052, USA

Blog Terms and Conditions API Terms Privacy Policy Contact Cookie Preferences Do Not Sell or Share My Personal Information

Made with 💙 for researchers

Part of the Research Solutions Family.

Joseph Marino

Probabilistic Video Generation Using Holistic Attribute Control

Planar Visualization of Treelike Structures

Improving sequential latent variable models with autoregressive flows

Crowdsourcing for identification of polyp-free segments in virtual colonoscopy videos

Contact Info

Product

Resources

About