This paper proposes a technique for spatiotemporal segmentation to identify the objects present in the scene represented in a video sequence. This technique processes two consecutive frames at a time. A region-merging approach is used to identify the objects in the scene. Starting from an oversegmentation of the current frame, the objects are formed by iteratively merging regions together. Regions are merged based on their mutual spatiotemporal similarity. The spatiotemporal similarity measure takes both temporal and spatial information into account, the emphasis being on the former. We propose a Modified Kolmogorov-Smirnov test for estimating the temporal similarity. This test efficiently uses temporal information in both the residual distribution and the motion parametric representation. The region-merging process is based on a weighted, directed graph. Two complementary graph-based clustering rules are proposed, namely, the strong rule and the weak rule. These rules take advantage of the natural structures present in the graph. Also, the rules take into account the possible errors and uncertainties reported in the graph. The weak rule is applied after the strong rule. Each rule is applied iteratively, and the graph is updated after each iteration. Experimental results on different types of scenes demonstrate the ability of the proposed technique to automatically partition the scene into its constituent objects.
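As a point of reference for the temporal similarity measure, here is a minimal sketch of the classical two-sample Kolmogorov-Smirnov statistic applied to the motion-compensation residuals of two regions. It is not the modified test the abstract proposes, whose details are not given here. The function name `ks_statistic` and the sample residual lists are hypothetical.

```python
from bisect import bisect_right


def ks_statistic(sample_a, sample_b):
    """Classical two-sample KS statistic: largest gap between the two empirical CDFs.

    Assumption: this is the textbook statistic, not the paper's modified test.
    """
    a, b = sorted(sample_a), sorted(sample_b)
    # The largest gap can only occur at an observed value, so check only those.
    return max(
        abs(bisect_right(a, x) / len(a) - bisect_right(b, x) / len(b))
        for x in a + b
    )


# Hypothetical residuals for two neighbouring regions.
residuals_region_1 = [0.1, -0.2, 0.05, 0.0]
residuals_region_2 = [0.12, -0.18, 0.02, 0.01]
similarity_gap = ks_statistic(residuals_region_1, residuals_region_2)
```

A small value suggests that the two residual distributions are similar, which is the kind of evidence a merging rule could use. How the paper turns such a measure into graph weights is not specified in the abstract.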
This paper proposes a rate-distortion optimal a posteriori quantization scheme for matching pursuit (MP) coefficients. The a posteriori quantization applies to an MP expansion that has been generated offline and cannot benefit from any feedback loop to the encoder to compensate for the quantization noise. The redundancy of the MP dictionary provides an indicator of the relative importance of coefficients and atom indices and, subsequently, of the quantization error. It is used to define a universal upper bound on the decay of the coefficients, sorted in decreasing order of magnitude. A new quantization scheme is then derived, where this bound is used as an Oracle for the design of an optimal a posteriori quantizer. The latter turns the exponentially distributed coefficient entropy-constrained quantization problem into a simple uniform quantization problem. Using simulations with random dictionaries, we show that the proposed exponentially upper bounded quantization (EUQ) clearly outperforms classical schemes. Building on the ideal Oracle-based approach, a suboptimal adaptive scheme is then designed that approximates the EUQ but still outperforms competing quantization methods in terms of rate-distortion characteristics. Finally, the proposed quantization method is studied in the context of image coding. It performs similarly to state-of-the-art coding methods (and even better at low rates) while providing a progressive stream that is very easy to transcode and adapt to changing rate constraints.
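The idea of quantizing uniformly inside an exponentially shrinking range can be illustrated as follows. This is a sketch under stated assumptions, not the EUQ scheme itself: the bound `c0 * nu**i`, the fixed number of levels per coefficient, and the name `quantize_with_bound` are all hypothetical choices made for illustration.

```python
def quantize_with_bound(coeffs, c0, nu, levels):
    """Uniformly quantize each coefficient inside its own range [0, c0 * nu**i].

    Assumptions (not from the paper): coefficients are sorted by decreasing
    magnitude, |coeffs[i]| <= c0 * nu**i with 0 < nu < 1, and every
    coefficient gets the same number of levels.
    """
    reconstructed = []
    for i, c in enumerate(coeffs):
        bound = c0 * nu ** i              # exponentially decaying upper bound
        step = bound / levels             # uniform step inside that bound
        index = min(int(abs(c) / step), levels - 1)  # clamp the top edge
        value = (index + 0.5) * step      # mid-point reconstruction
        reconstructed.append(value if c >= 0 else -value)
    return reconstructed


approx = quantize_with_bound([1.0, -0.4], c0=1.0, nu=0.5, levels=4)
```

Because the range shrinks with the index, the same number of levels yields a finer step for later coefficients, so the absolute error of each one stays within half of its own step.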
The visual efficiency of an image compression technique depends directly on the amount of visually significant information it retains. By "visually significant" we mean information to which a human observer is most sensitive. The overall sensitivity depends on aspects such as contrast, color, and spatial frequency. One important aspect is the inverse relationship between contrast sensitivity and spatial frequency. This is described by the contrast sensitivity function (CSF). In compression algorithms, the CSF can be exploited to regulate the quantization step size to minimize the visibility of compression artifacts. Existing CSF implementations for wavelet-based image compression use the same quantization step size for a large range of spatial frequencies. This is a coarse approximation of the CSF. This paper presents two new techniques that implement the CSF at significantly higher precision, adapting even to local variations of the spatial frequencies within a decomposition subband. The approaches can be used for luminance as well as color images. For color perception, three different CSFs describe the sensitivity. The implementation technique is the same for each color band. Implemented in the JPEG2000 compression standard, the new techniques are compared to conventional CSF schemes. The proposed techniques turn out to be visually more efficient than previously published methods. However, the emphasis of this paper is on how the CSF can be implemented in a precise and locally adaptive way, and not on the superior performance of these techniques.
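To make the link between the CSF and the quantization step size concrete, the sketch below scales a base step by the inverse of a CSF value. The abstract does not say which CSF model the paper uses. The Mannos-Sakrison luminance model is assumed here purely for illustration, and `base_step` and both function names are hypothetical.

```python
import math


def csf_mannos_sakrison(freq_cpd):
    """Mannos-Sakrison luminance CSF; freq_cpd is in cycles per degree.

    Assumption: this model is a stand-in, not the CSF used in the paper.
    """
    return 2.6 * (0.0192 + 0.114 * freq_cpd) * math.exp(-((0.114 * freq_cpd) ** 1.1))


def csf_step_size(base_step, freq_cpd):
    """Coarser quantization where the eye is less sensitive (step ~ 1 / CSF)."""
    return base_step / csf_mannos_sakrison(freq_cpd)


# Hypothetical comparison of a mid and a high spatial frequency.
step_mid = csf_step_size(1.0, 8.0)
step_high = csf_step_size(1.0, 30.0)
```

Under this model the step at 30 cycles per degree is several times larger than at 8 cycles per degree. Evaluating the function at a locally estimated frequency, rather than once per subband, is one way to picture the finer adaptation the abstract describes.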