Tahir Amin scite author profile

In this paper, we study the peaky nature of wavelet coefficient distributions. The study shows that the wavelet coefficients cannot be effectively modeled by a single distribution. We then propose a new modeling scheme based on a Laplacian mixture model and apply it to the indexing and retrieval of image and video databases. The parameters of the model are first used to represent texture information in image retrieval. We then explore its application to video retrieval. Traditionally, visual information is used for video indexing and retrieval. However, in some cases audio information is more helpful for finding clues to the video events. Because the proposed feature extraction scheme is based on a fundamental property of the wavelet transform, it can also be adopted to analyze the audio content of the video data. The experimental evaluation indicates the high discriminatory power of the proposed feature set. The dimension of the extracted feature vector is low, which is important for the retrieval efficiency of the system in terms of response time. User feedback is used to enhance the retrieval performance by modifying the system parameters according to the users' behavior. A nonlinear approach for defining the similarity between two images is also explored in this work.

Index Terms: feature extraction, image indexing and retrieval, Laplacian mixture model, video indexing and retrieval.
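The abstract does not state how many mixture components are used or how the parameters are estimated. As a minimal sketch only, the following assumes a two-component, zero-mean Laplacian mixture fitted by expectation-maximization (EM), with one-level Haar detail coefficients standing in for the wavelet coefficients. The function names `haar_detail`, `fit_laplacian_mixture`, and `mixture_features`, the initialization, and the iteration count are all hypothetical choices made for illustration, not the authors' method.

```python
import math
import random


def haar_detail(signal):
    """One-level Haar detail coefficients of an even-length sequence."""
    root2 = math.sqrt(2.0)
    return [(signal[i] - signal[i + 1]) / root2
            for i in range(0, len(signal) - 1, 2)]


def fit_laplacian_mixture(coeffs, iters=50):
    """EM for an assumed two-component, zero-mean Laplacian mixture.

    Each component has density (1 / (2 b)) * exp(-|x| / b).
    Returns (weights, scales), ordered by increasing scale.
    """
    mags = [abs(c) for c in coeffs]
    n = len(mags)
    mean_mag = max(sum(mags) / n, 1e-12)
    weights = [0.5, 0.5]
    # Hypothetical initialization: one narrow and one wide component.
    scales = [0.5 * mean_mag, 2.0 * mean_mag]
    for _ in range(iters):
        resp_sum = [0.0, 0.0]   # sum of responsibilities per component
        resp_mag = [0.0, 0.0]   # responsibility-weighted sum of |x|
        for m in mags:
            # E-step: posterior probability of each component for this sample.
            dens = [w / (2.0 * b) * math.exp(-m / b)
                    for w, b in zip(weights, scales)]
            total = sum(dens)
            resp = [d / total for d in dens] if total > 0.0 else [0.5, 0.5]
            for k in range(2):
                resp_sum[k] += resp[k]
                resp_mag[k] += resp[k] * m
        # M-step: closed-form updates for mixing weights and scales.
        weights = [rs / n for rs in resp_sum]
        scales = [max(resp_mag[k] / resp_sum[k], 1e-12)
                  if resp_sum[k] > 0.0 else scales[k]
                  for k in range(2)]
    pairs = sorted(zip(scales, weights))
    return [w for _, w in pairs], [b for b, _ in pairs]


def mixture_features(coeffs):
    """Low-dimensional feature vector: [w_small, b_small, w_large, b_large]."""
    weights, scales = fit_laplacian_mixture(coeffs)
    return [weights[0], scales[0], weights[1], scales[1]]


if __name__ == "__main__":
    random.seed(0)
    # Synthetic stand-in for a real image row or audio frame.
    signal = [random.gauss(0.0, 1.0) for _ in range(512)]
    print(mixture_features(haar_detail(signal)))
```

In a retrieval setting, a vector like this would be computed per wavelet subband and the per-subband vectors compared across images or audio segments. The choice of subbands and similarity measure is not specified in the abstract.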

Interactive video retrieval using embedded audio content

Amin, Zeylinoght, Guan

A New Learning Algorithm for the Fusion of Adaptive Audio–Visual Features for the Retrieval and Classification of Movie Clips

Muneesawang, Guan, Amin. 2008. J Sign Process Syst Sign Image Video Technol

This paper presents a new learning algorithm for audiovisual fusion and demonstrates its application to video classification for film databases. The proposed system uses perceptual features for content characterization of movie clips. These features are extracted from different modalities and fused through a machine learning process. More specifically, to capture spatio-temporal information, adaptive video indexing is adopted to extract visual features, and a statistical model based on a Laplacian mixture is used to extract audio features. These features are fused at the late fusion stage and input to a support vector machine (SVM) to learn semantic concepts from a given video database. Based on our experimental results, the proposed system implementing the SVM-based fusion technique achieves high classification accuracy when applied to a large-volume database containing Hollywood movies.

Tahir Amin

Speech Recognition using Dynamic Time Warping

Application of Laplacian Mixture Model to Image and Video Retrieval
