Phonotactics, which concerns the permissible phone patterns and their frequencies of occurrence in a specific language, is widely acknowledged to be relevant to spoken language recognition (SLR). With the assistance of phone recognizers, each speech utterance can be decoded into an ordered sequence of phone vectors filled with likelihood scores contributed by all possible phone models. In this paper, we propose a novel approach that uncovers the phonotactic structure concealed in the phone-likelihood vectors through a form of multivariate time series analysis: dynamic linear models (DLMs). In these models, the generation of phone patterns in each utterance is treated as a dynamic system: the relationship between adjacent vectors is modeled as linear and time-invariant, and unobserved states are introduced to capture the temporal coherence intrinsic to the system. Each utterance expressed by a DLM is further transformed into a fixed-dimensional linear subspace, so that well-established distance measures between two subspaces can be applied to linear discriminant analysis (LDA) in a dissimilarity-based fashion. The results of SLR experiments on the OGI-TS corpus demonstrate that the proposed framework outperforms the well-known vector space modeling (VSM)-based methods and achieves performance comparable to our previous subspace-based method.
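To make the modeling step concrete, the following Python sketch illustrates one common way to fit a simple dynamic linear model to a sequence of phone-likelihood vectors and to compare two utterances through a subspace distance. The SVD-based parameter estimation, the state dimension, the observability-matrix construction, and all function names are illustrative assumptions rather than the exact formulation used in the paper.

```python
import numpy as np

def fit_dlm(Y, n_states=5):
    """Fit a simple dynamic linear model y_t = C x_t, x_{t+1} = A x_t
    to a sequence of phone-likelihood vectors Y (shape: n_phones x T).
    Illustrative SVD-based estimate, not the paper's exact procedure."""
    U, s, Vt = np.linalg.svd(Y, full_matrices=False)
    C = U[:, :n_states]                          # observation matrix
    X = np.diag(s[:n_states]) @ Vt[:n_states]    # estimated state sequence
    A = X[:, 1:] @ np.linalg.pinv(X[:, :-1])     # least-squares transition matrix
    return A, C

def utterance_subspace(A, C, order=3):
    """Stack C, CA, CA^2, ... into an extended observability matrix and
    return an orthonormal basis of its column space, i.e. a fixed-dimensional
    linear subspace representing the utterance."""
    blocks, M = [], np.eye(A.shape[0])
    for _ in range(order):
        blocks.append(C @ M)
        M = M @ A
    Q, _ = np.linalg.qr(np.vstack(blocks))
    return Q

def subspace_distance(Q1, Q2):
    """Distance derived from the principal angles between two subspaces."""
    cosines = np.clip(np.linalg.svd(Q1.T @ Q2, compute_uv=False), 0.0, 1.0)
    return np.sqrt(np.sum(np.arccos(cosines) ** 2))

# Toy usage: two utterances as random phone-likelihood sequences (40 phones).
rng = np.random.default_rng(0)
Y1, Y2 = rng.random((40, 120)), rng.random((40, 95))
Q1 = utterance_subspace(*fit_dlm(Y1))
Q2 = utterance_subspace(*fit_dlm(Y2))
print(subspace_distance(Q1, Q2))
```

Pairwise distances of this kind could then feed a dissimilarity-based classifier such as the LDA stage described above.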
This paper presents a novel content-based query-by-tag music search system for an untagged music database. We design a new tag query interface that allows users to input multiple tags with multiple levels of preference (denoted as an MTML query) by colorizing desired tags in a web-based tag cloud interface. When a user clicks and holds the left mouse button (or presses and holds his/her finger on a touch screen) on a desired tag, the color of the tag changes cyclically according to a color map (from dark blue to bright red) that represents the level of preference (from 0 to 1). In this way, the user can easily compose and review a query of multiple tags with multiple levels of preference through the colored tags. To realize MTML content-based music retrieval, we introduce a probabilistic fusion model (denoted as GMFM) that consists of two mixture models, namely a Gaussian mixture model and a multinomial mixture model; GMFM jointly models the auditory features and tag labels of a song. Two indexing methods and their corresponding matching methods, namely pseudo song-based matching and tag affinity-based matching, are incorporated into the pre-learned GMFM. We evaluate the proposed system on the MajorMiner and CAL-500 datasets. The experimental results demonstrate the effectiveness of GMFM and the potential of using MTML queries to search music in an untagged music database.
Keywords: tag cloud-based music query interface, MTML query, content-based music information retrieval, probabilistic fusion model.
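As a rough, hypothetical illustration of how a fusion of a Gaussian mixture (audio) part and a multinomial mixture (tag) part could score untagged songs against an MTML query, the sketch below computes latent-component posteriors on both sides and ranks songs by their similarity. The shared-component parameterization, the soft-count treatment of preference levels, and the cosine-similarity ranking are assumptions for exposition, not necessarily the paper's pseudo song-based or tag affinity-based procedures.

```python
import numpy as np
from scipy.stats import multivariate_normal

# Assumed parameterization: each latent component k has a weight pi[k],
# a Gaussian over audio features (mu[k], cov[k]), and a multinomial over
# tags (theta[k]), loosely mirroring a GMFM-style joint model.

def song_posteriors(frames, pi, mu, cov):
    """Posterior P(k | song) from the Gaussian (audio) part of the model,
    averaging frame-level component responsibilities."""
    ll = np.stack([multivariate_normal.logpdf(frames, mu[k], cov[k])
                   for k in range(len(pi))], axis=1)         # (T, K)
    ll += np.log(pi)
    resp = np.exp(ll - ll.max(axis=1, keepdims=True))
    resp /= resp.sum(axis=1, keepdims=True)
    return resp.mean(axis=0)                                  # (K,)

def query_posteriors(tag_prefs, pi, theta):
    """Posterior P(k | MTML query): tag_prefs holds the preference levels
    in [0, 1] read off the colorized tags, used here as soft tag counts."""
    log_post = np.log(pi) + tag_prefs @ np.log(theta).T       # (K,)
    post = np.exp(log_post - log_post.max())
    return post / post.sum()

def rank_songs(tag_prefs, songs, pi, mu, cov, theta):
    """Rank untagged songs by cosine similarity between query-side and
    song-side component posteriors (an assumed pseudo-song-like matching)."""
    q = query_posteriors(tag_prefs, pi, theta)
    scores = []
    for frames in songs:                      # each song: (T, D) feature frames
        s = song_posteriors(frames, pi, mu, cov)
        scores.append(q @ s / (np.linalg.norm(q) * np.linalg.norm(s) + 1e-12))
    return np.argsort(scores)[::-1]           # indices of best-matching songs first
```

The key point the sketch conveys is that the tag side and the audio side meet in the shared latent components, so an MTML query can retrieve songs that carry no tags at all.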