Image retargeting techniques adjust images to different sizes and have attracted much attention recently. Objective quality assessment (OQA) of image retargeting results is often desired so that the best results can be selected automatically. Existing OQA methods train a model on benchmarks (e.g., RetargetMe) in which subjective scores assigned by human users are provided. Observing that it is challenging even for human subjects to give consistent scores to retargeting results of different source images (diff-source-results), in this paper we propose a learning-based OQA method that trains a General Regression Neural Network (GRNN) model on relative scores (which preserve the ranking) of retargeting results of the same source image (same-source-results). In particular, we develop a novel training scheme with provable convergence that learns a common base scalar for same-source-results. With this source-specific offset, our computed scores not only preserve the ranking of subjective scores for same-source-results, but also provide a reference for comparing diff-source-results. We train and evaluate our GRNN model using human preference data collected in RetargetMe. We further introduce a subjective benchmark to evaluate the generalizability of different OQA methods. Experimental results demonstrate that our method outperforms ten representative OQA methods in ranking prediction and generalizes better to different datasets.
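The core of this scheme is a kernel-regression GRNN trained on offset-corrected (relative) scores, with one base scalar learned per source image. Below is a minimal sketch of that idea; the helper names (grnn_predict, fit_offsets_and_targets), the Gaussian bandwidth sigma, and the simple alternating update are illustrative assumptions, not the paper's actual training scheme.

```python
# Sketch of GRNN scoring with a learned per-source base scalar (offset).
# All names and hyperparameters here are illustrative assumptions.
import numpy as np

def grnn_predict(train_X, train_y, query_X, sigma=0.5):
    """General Regression Neural Network: Gaussian-kernel-weighted average of
    training targets (Nadaraya-Watson regression)."""
    d2 = ((query_X[:, None, :] - train_X[None, :, :]) ** 2).sum(axis=2)
    w = np.exp(-d2 / (2.0 * sigma ** 2))
    return (w @ train_y) / np.clip(w.sum(axis=1), 1e-12, None)

def fit_offsets_and_targets(X, subj_scores, source_ids, n_iters=20, sigma=0.5):
    """Alternate between (a) fitting the GRNN to offset-corrected scores and
    (b) updating one base scalar per source image from the residuals."""
    X = np.asarray(X, dtype=float)
    subj_scores = np.asarray(subj_scores, dtype=float)
    offsets = {s: 0.0 for s in set(source_ids)}
    y = subj_scores.copy()
    for _ in range(n_iters):
        # Targets are relative scores: subjective score minus its source offset.
        y = np.array([subj_scores[i] - offsets[source_ids[i]] for i in range(len(X))])
        pred = grnn_predict(X, y, X, sigma)
        # Update each source's base scalar as the mean residual of its results.
        for s in offsets:
            idx = [i for i, sid in enumerate(source_ids) if sid == s]
            offsets[s] += float(np.mean(subj_scores[idx] - (pred[idx] + offsets[s])))
    return y, offsets
```

A new retargeting result of source s would then be scored as grnn_predict(X, y, new_features) + offsets[s], so scores remain comparable across sources through the learned offsets.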
In this article, we introduce a novel method that generates a sequence of physical transformations between 3D models with different shapes and topologies. Feasible transformations are realized on a chain structure whose connected components are 3D printed. Collision-free motions are computed to transform between different configurations of the 3D printed chain structure. To realize the transformation between different 3D models, we first voxelize the input models into a similar number of voxels. The challenging part of our approach is to generate a simple path, serving as a chain configuration, that connects most of the voxels. A layer-based algorithm is developed with theoretical guarantees on the existence of such a path and on its length. We find that a collision-free motion sequence can always be generated when a straight line is used as the intermediate configuration of the transformation. The effectiveness of our method is demonstrated by both simulations and experiments on 3D printed chains.
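To make the layer-based idea concrete, the sketch below orders the occupied voxels of a model layer by layer with a serpentine scan, so that consecutive voxels tend to be grid neighbors and can be strung onto a chain. This is a simplified stand-in for the paper's layer-based algorithm and does not reproduce its theoretical guarantees; the boolean voxel grid 'occupied' and the function name are assumptions.

```python
# Illustrative layer-based serpentine ordering of occupied voxels.
import numpy as np

def serpentine_path(occupied):
    """Visit occupied voxels layer by layer, snaking through rows so that
    consecutive voxels tend to be grid neighbors (a chain-like configuration)."""
    path = []
    Z, Y, X = occupied.shape
    for z in range(Z):
        # Alternate the row sweep direction between layers.
        rows = range(Y) if z % 2 == 0 else range(Y - 1, -1, -1)
        for j, y in enumerate(rows):
            # Alternate the column sweep direction between rows.
            cols = range(X) if j % 2 == 0 else range(X - 1, -1, -1)
            for x in cols:
                if occupied[z, y, x]:
                    path.append((z, y, x))
    return path

# Example: a 2x3x3 solid block yields a single snaking chain of 18 voxels.
block = np.ones((2, 3, 3), dtype=bool)
print(len(serpentine_path(block)))  # 18
```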
Over-segmenting a video into supervoxels has strong potential to reduce the complexity of downstream computer vision applications. Content-sensitive supervoxels (CSSs) are typically smaller in content-dense regions (i.e., regions with high variation of appearance and/or motion) and larger in content-sparse regions. In this paper, we propose to compute feature-aware CSSs (FCSSs), which are regularly shaped 3D primitive volumes that align well with local object/region/motion boundaries in the video. To compute FCSSs, we map a video to a 3-dimensional manifold embedded in a combined color and spatiotemporal space, in which the volume elements of the video manifold give a good measure of video content density. Any uniform tessellation on the video manifold then induces CSSs in the video. Our idea is that, among all possible uniform tessellations on the video manifold, FCSS selects one whose cell boundaries align well with local video boundaries. To achieve this goal, we propose a novel restricted centroidal Voronoi tessellation method that simultaneously minimizes the tessellation energy (leading to uniform cells in the tessellation) and maximizes the average boundary distance (leading to good local feature alignment). Theoretically, our method has an optimal competitive ratio of O(1), and its time and space complexities are O(NK) and O(N + K), respectively, for computing K supervoxels in an N-voxel video. We also present a simple extension of FCSS to streaming FCSS for processing long videos that cannot be loaded into main memory at once. We evaluate FCSS, streaming FCSS, and ten representative supervoxel methods on four video datasets and two novel video applications. The results show that our method achieves state-of-the-art performance across the various evaluation criteria.
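The tessellation step can be pictured as clustering voxels in a joint spatiotemporal-plus-color feature space so that cells become roughly uniform there. The sketch below shows only this basic Lloyd-style clustering; the boundary-distance term and streaming extension of the paper's restricted centroidal Voronoi tessellation are omitted, and the spatial weight lam, the function name, and the plain k-means loop are assumptions.

```python
# Minimal sketch: embed voxels in a joint (t, y, x, color) space and run
# Lloyd-style iterations to obtain roughly uniform cells (supervoxels).
import numpy as np

def supervoxels_kmeans(video, K, lam=0.5, n_iters=10, seed=0):
    """video: (T, H, W, 3) float array in [0, 1]. Returns a (T, H, W) label map."""
    T, H, W, _ = video.shape
    t, y, x = np.meshgrid(np.arange(T), np.arange(H), np.arange(W), indexing="ij")
    coords = np.stack([t / T, y / H, x / W], axis=-1).reshape(-1, 3)
    feats = np.concatenate([lam * coords, video.reshape(-1, 3)], axis=1)

    rng = np.random.default_rng(seed)
    centers = feats[rng.choice(len(feats), K, replace=False)]
    for _ in range(n_iters):
        # Assign each voxel to its nearest center in the joint feature space.
        d2 = ((feats[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = d2.argmin(axis=1)
        # Move each center to the centroid of its cell (Lloyd update).
        for k in range(K):
            members = feats[labels == k]
            if len(members) > 0:
                centers[k] = members.mean(axis=0)
    return labels.reshape(T, H, W)
```

A practical implementation would restrict each voxel's search to nearby centers rather than forming the full N-by-K distance matrix, which is how SLIC-style methods reach the O(NK) time and O(N + K) space regime mentioned above.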